Better RegEx to extract GoogleVideo ID from URL


Hi!
I use the following regex with JS to extract the id 6321890784249785097 from this url:
http://video.google.com/googleplayer.swf?docId=6321890784249785097
url.replace(/^[^\$]+.(.{19}).*/,"$1");
But this only cuts the last 19 chars from the tail. How can I make it more bullet-proof? Maybe with an explanation, so that I learn something?

This should work a bit better:
/^.*docId=(\d+)$/
This matches all characters up to the 'docId=', then gives you all digits after that up to the end of the url.

video[.]google[.]com/googleplayer[.]swf[?]docId=(\d+)
The ID will be captured in reference #1. If you just want to match exactly 19 digits, you can change it to this:
video[.]google[.]com/googleplayer[.]swf[?]docId=(\d{19})

url.replace(/.*docId=(\d{19}).*/i,"$1");
This cuts out the 19 digits that follow docId=.
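One thing to keep in mind with the replace-based answers: when the pattern does not match, replace returns the input unchanged, so the whole URL comes back instead of an id. A minimal sketch that makes the "no id" case explicit, assuming a hypothetical helper name extractDocId that is not part of any answer above:

```javascript
// Hypothetical helper: returns the docId digits, or null when the URL has none.
function extractDocId(url) {
  // [?&] ties the match to a real query parameter named docId.
  var m = /[?&]docId=(\d+)/.exec(url);
  return m ? m[1] : null;
}

console.log(extractDocId("http://video.google.com/googleplayer.swf?docId=6321890784249785097"));
console.log(extractDocId("http://video.google.com/googleplayer.swf"));
```

The id is kept as a string on purpose: a 19-digit number is too large to be represented exactly as a JavaScript Number.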

Here is the function I use in our app to read url parameters. So far it hasn't let me down ;)
urlParam: function(name, w) {
  w = w || window;
  // [?&] matches the character in front of the parameter name
  var rx = new RegExp('[?&]' + name + '=([^&#]+)'),
      val = w.location.href.match(rx);
  return !val ? '' : val[1];
}
For the explanation of the regexp:
[?&] takes either the start of the query string '?' or the separator between parameters '&'. (No | is needed inside a character class; there it would match a literal | character.)
'name' will be the name of the parameter, 'docId' in your case.
([^&#]+) takes any characters that are not & or #. The hash is often used in one-page apps. The parentheses keep a reference to the content.
val will be an array or null, and val[1] is the value you are looking for.
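As an alternative to a hand-written regexp, the standard URL API can parse the query string directly. This is a different technique from the answer above, shown only as a sketch; getQueryParam is a hypothetical name, and it assumes an environment where the URL global exists (modern browsers, Node.js).

```javascript
// Hypothetical helper built on the standard URL / URLSearchParams API.
function getQueryParam(href, name) {
  // searchParams.get returns null when the parameter is absent.
  return new URL(href).searchParams.get(name);
}

console.log(getQueryParam(
  "http://video.google.com/googleplayer.swf?docId=6321890784249785097&hl=en#top",
  "docId"
));
```

Note that new URL() throws on relative URLs unless a base is supplied, and that values come back percent-decoded, which the regexp version does not do.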

Related

javascript regex insert new element into expression

I am passing a URL to a block of code in which I need to insert a new element into the regex. I'm pretty sure the regex is valid and the code seems right, but no matter what I do, I can't get the regex to match!
//** Incoming url's
//** url e.g. api/223344
//** api/11aa/page/2017
//** Need to match to the following
//** dir/api/12ab/page/1999
//** Hence the need to add dir at the front
var url = req.url;
//** pass in: /^\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/
var re = myregex.toString();
//** Insert dir into regex: /^dir\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/
var regVar = re.substr(0, 2) + 'dir' + re.substr(2);
var matchedData = url.match(regVar);
matchedData === null ? console.log('NO') : console.log('Yay');
I hope I am just missing the obvious, but can anyone see why this never matches and always logs NO?
Thanks

Let's break down your regex.
^\/api\/ matches the beginning of the string, followed by exactly the string "/api/".
([a-zA-Z0-9-_~ %]+) is a capturing group: it captures any of the characters listed inside the brackets, with the + meaning one or more of them. For example, this section will match abAB25-_ %
(?:\/page\/([a-zA-Z0-9-_~ %]+)) groups multiple tokens together as well, but does not create a capturing group like the one above (the ?: makes it non-capturing). It first matches the exact string "/page/", followed by a capturing group just like the one described above (matching a-z, A-Z, 0-9, etc.).
?$ is at the end: the ? makes the preceding group optional (zero or one occurrence), and the $ matches the end of the string.
This regex will match this string, for example: /api/abAB25-_ %/page/abAB25-_ %
You may be able to take advantage of capturing groups, however, and use something like this instead to get similar results: ^\/api\/([a-zA-Z0-9-_~ %]+)\/page\/\1?$. Here, we are using \1 to reference the first capturing group and match exactly the same text it matched. EDIT: actually, this probably won't work, since the text after /api/ and the text after /page/ will most likely be different. Carrying on...
Afterwards, you are adding "dir" to the beginning of the pattern, so it should now match something like this: dir/api/abAB25-_ %/page/abAB25-_ %
However, you have also converted the regex to a string with toString(), so, as Crayon Violent pointed out in their comment, this breaks the expected functionality: the string still contains the leading and trailing / delimiters, and match() treats them as literal characters of the pattern. You can fix this by reading .source from your regex instead of calling toString(), inserting "dir" after the ^, and passing the result to new RegExp(), e.g. var regVar = new RegExp('^dir' + myregex.source.substr(1)); followed by var matchedData = url.match(regVar); https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/source
Now you can properly match a string like this: dir/api/11aa/page/2017 see this example: https://repl.it/Mj8h
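A minimal sketch of rebuilding the pattern from .source, assuming the regex from the question is available as myregex (the variables prefixed and matched are illustrative names only):

```javascript
// Pattern from the question; `myregex` stands in for whatever is passed in.
var myregex = /^\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/;

// .source is the pattern text without the surrounding slashes or flags.
var src = myregex.source;

// Insert "dir" right after the leading ^ and compile a real RegExp again,
// carrying over whatever flags the original had.
var prefixed = new RegExp(src.slice(0, 1) + 'dir' + src.slice(1), myregex.flags);

var matched = 'dir/api/11aa/page/2017'.match(prefixed);
console.log(matched ? 'Yay' : 'NO');
```

Here matched[1] and matched[2] hold the two captured segments. The sketch relies on the pattern starting with ^, as it does in the question.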

As mentioned by Crayon Violent in the comments, it seems you're passing a String rather than a regular expression to the .match() function. Maybe try the following:
url.match(new RegExp(regVar, "i"));
to convert the string to a regular expression. The "i" is for ignore case; I don't know if that's what you want. Note that the string must not include the surrounding / delimiters that toString() adds. Learn more here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp

Use only one of the characters in regular expression javascript

I guess this should be something very easy, but I've been stuck on it for at least 2 hours, so I think it's better to ask the question here.
So, I've got a regular expression /&t=(\d*)$/g and it works fine as long as the url has &t and not ?t. I've tried different combinations like /\?|&t=(\d*)$/g ; /\?t=(\d*)$|/&t=(\d*)$/g ; /(&|\?)t=(\d*)$/g and various others, but haven't got the expected result, which is to match either the /\?t=(\d*)$/g or the /&t=(\d*)$/g url part (whichever is entered in the input).
Thanks for the responses. I think I need to add some details here. I'm actually working on this piece of code
var formValue = $.trim($("#v").val());
var formValueTime = /&t=(\d*)$/g.exec(formValue);
if (formValueTime && formValueTime.length > 1) {
formValueTime = parseInt(formValueTime[1], 10);
formValue = formValue.replace(/&t=\d*$/g, "");
}
and I want to get the t value whether the link is passed with &t or ?t, as in youtu.be/hTWKbfoikeg?t=82 or the similar youtu.be/hTWKbfoikeg&t=82

To replace, you may use
var formValue = "some?some=more&t=1234"; // $.trim($("#v").val());
var formValueTime;
formValue = formValue.replace(/[&?]t=(\d*)$/g, function($0,$1) {
formValueTime = parseInt($1,10);
return '';
});
console.log(formValueTime, formValue);
To grab the value, you may use
/[?&]t=(\d*)$/g.exec(formValue);
Pattern details
[?&] - a character class matching ? or &
t= - t= substring
(\d*) - Group 1 matching zero or more digits
$ - end of string
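The two steps above (grab the value, then strip it) can be sketched as one small function. splitTime is a hypothetical name, and the sketch uses \d+ rather than \d* so that an empty t= is left alone instead of being parsed into NaN; that choice is an assumption, not part of the answer.

```javascript
// Hypothetical helper: splits a trailing t parameter off a link.
function splitTime(link) {
  var time = null;
  // The callback receives the whole match and then group 1 (the digits).
  var rest = link.replace(/[?&]t=(\d+)$/, function (whole, digits) {
    time = parseInt(digits, 10);
    return '';
  });
  return { link: rest, time: time };
}

console.log(splitTime('youtu.be/hTWKbfoikeg?t=82'));
console.log(splitTime('youtu.be/hTWKbfoikeg&t=82'));
console.log(splitTime('youtu.be/hTWKbfoikeg'));
```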

/\?t=(\d*)|\&t=(\d*)$/g
you inverted the escape character for the second RegEx.
http://regexr.com/3gcnu

I want to thank you all for trying to help. Special thanks to @Wiktor Stribiżew, who gave the closest answer.
Now the piece of code I needed looks exactly like this:
/[?&]t=(\d*)$/g.exec(formValue);
So it's the [?&] part that solved the problem.
I use the array later, so /\?t=(\d*)|\&t=(\d*)$/g doesn't help: I get an array like [&t=50,,50] when the link is of the & type and the correct result [?t=50,50] when it is of the ? type, just because of the order of the alternatives in the RegExp.
Now, if you're looking for a piece of RegExp that accepts either character in one place while the rest of the RegExp stays the same, you can use something like [?&], for the example where the wanted characters are ? and &.
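The difference between the two patterns shows up in where the digits land in the result array. A small sketch, with illustrative variable names and the g flag left off so that exec carries no state between calls:

```javascript
var link = 'youtu.be/hTWKbfoikeg&t=50';

// Two alternatives, two groups: for "&t=" the digits land in group 2
// and group 1 stays undefined.
var viaAlternation = /\?t=(\d*)|&t=(\d*)$/.exec(link);

// One character class, one group: the digits are always in group 1.
var viaCharClass = /[?&]t=(\d*)$/.exec(link);

console.log(viaAlternation[1], viaAlternation[2]);
console.log(viaCharClass[1]);
```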

Any way to extract string within 2 different special characters using javascript?

Hi I have a varying URL similar to:
http://farm4.staticflickr.com/3877/[image_id]_[secret].jpg
e.g. http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg
I need to extract image_id, i.e. the first set of numbers (14628998490) before the underscore in 14628998490_233a15c423_q.jpg, from the whole URL.
Is there a good way to extract image_id?
Right now I am going to use:
var image_id = image_url.match(/[\/]([0-9]+)_/)[1]

Like I said in the comment, you don't need to escape the / symbol inside a character class. You don't even need a character class here; just \/ is enough. The regex below captures one or more digits that are preceded by a / symbol and followed by a _ symbol.
\/(\d+)_
DEMO
> var image_id = image_url.match(/\/(\d+)_/)[1]
undefined
> image_id
'14628998490'
OR
You could also try this, if you don't want to use \d+ in your pattern.
\/([^/]*?)_
DEMO
> var image_id = image_url.match(/\/([^/]*?)_/)[1]
undefined
> image_id
'14628998490'
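Both demos index the match result with [1] directly, which throws if the URL does not have the expected shape. A guarded sketch, assuming a hypothetical helper name flickrImageId; anchoring the pattern to the last path segment with [^\/]*$ is an extra assumption, not part of the answer above:

```javascript
// Hypothetical helper: returns the image id, or null for an unexpected URL.
function flickrImageId(imageUrl) {
  // A slash, the digits, an underscore, then no further slash until the end,
  // so only the file name segment can match.
  var m = /\/(\d+)_[^\/]*$/.exec(imageUrl);
  return m ? m[1] : null;
}

console.log(flickrImageId("http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg"));
console.log(flickrImageId("http://farm4.staticflickr.com/3877/"));
```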

Not sure that it's a better way, but you can do it like this:
var str = 'http://farm4.staticflickr.com/3877/[image_id]_[secret].jpg';
var image_id = str.split('/').pop().split('.')[0].split('_')[0]; // [0] picks the part before the first underscore

If the special character is always the same (_), you could first obtain the last part (with substring+lastIndexOf) and then use split() :
var url = "http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg";
var splittedUrl = url.substr(url.lastIndexOf('/')+1).split("_");
var image_id = splittedUrl[0];
console.log(image_id);
I've read somewhere that string functions are faster than regexp, so it's an option you might consider.

String splitting is faster than regex. You can just take the last index of / and then the string between it and the first occurrence of _ after it. I think that would be a better idea.

How to find in javascript with regular expression string from url?

Good evening. How can I use a regular expression in javascript to find a string in a url? For example, I have the url http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/ and I need only the string between the last two slashes (/ /), as in http://something.cz/something/string/. In this example the word that I need is mikronebulizer. Thank you very much for your help.

You could use a regex match with a group.
Use this:
/([\w\-]+)\/$/.exec("http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/")[1];
Here's a jsfiddle showing it in action
This part: ([\w\-]+)
Means one or more characters from the set of alphanumerics, underscore and hyphen, used as the first match group.
Followed by a /
And then finally the: $
Which means the string should end here.
The .exec() returns an array where the first value is the full match (i.e. "mikronebulizer/") and then each match group after that.
So .exec()[1] returns your value: mikronebulizer

Simply:
url.match(/([^\/]*)\/$/);
Should do it.
If you want to match (optionally) without a trailing slash, use:
url.match(/([^\/]*)\/?$/);
See it in action here: http://regex101.com/r/cL3qG3
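Wrapped into a function with a null check, the optional-trailing-slash pattern might look like the sketch below. lastSegment is a hypothetical name, and the sketch uses + instead of * so that the captured segment is never empty; it also assumes the URL has no query string or hash, which would otherwise end up in the result.

```javascript
// Hypothetical helper: last non-empty path segment, with or without a trailing slash.
function lastSegment(url) {
  var m = /([^\/]+)\/?$/.exec(url);
  return m ? m[1] : null;
}

console.log(lastSegment("http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/"));
console.log(lastSegment("http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer"));
```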

If you have the url provided, then you can do it this way:
var url = 'http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/';
var urlsplit = url.split('/');
var urlEnd = urlsplit[urlsplit.length- (urlsplit[urlsplit.length-1] == '' ? 2 : 1)];
This will match either everything after the last slash, if there's any content there, and otherwise, it will match the part between the second-last and the last slash.

Something else to consider - yes, a pure RegEx approach might be easier (heck, and faster), but I wanted to include this simply to point out window.location.pathname.
function getLast(){
// Strip trailing slash if present
var path = window.location.pathname.replace(/\/$/, '');
return path.split('/').pop();
}

Alternatively you could get using split:
var pieces = "http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/".split("/");
var lastSegment = pieces[pieces.length - 2];
// lastSegment == mikronebulizer

var url = 'http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/';
if (url.slice(-1)=="/") {
url = url.substr(0,url.length-1);
}
var lastSegment = url.split('/').pop();
document.write(lastSegment+"<br>");

Find and get only number in string

Please help me solve this strange situation.
Here is the code.
The link looks like this: www.blablabla.ru#3
The regex looks like this:
var id = window.location.href.replace(/\D/, '' );
alert(id);
The regular expression is correct, it should show only the numbers ... but it's not showing only numbers :-(
Can you please advise me and provide some information on how to get only the numbers in the string?
Thanks

You're replacing only the first non-digit character with empty string. Try using:
var id = window.location.href.replace(/\D+/g, '' ); alert(id);
(Notice the "global" flag at the end of regex).

Consider using location.hash - this holds just the hashtag on the end of the url: "#42".
You can write:
var id = location.hash.substring(1);

Edit: See Kobi's answer. If you really are using the hash part of things, just use location.hash! (To self: Doh!)
But I'll leave the below in case you're doing something more complex than your example suggests.
Original answer:
As the others have said, you've left out the global flag in your replacement. But I'm worried about the expression, it's really fragile. Consider: www.37signals.com#42: Your resulting numeric string will be 3742, which probably isn't what you want. Other examples: www.blablabla.ru/user/4#3 (43), www2.blablabla.ru#3 (23), ...
How 'bout:
id = window.location.href.match(/\#(\d+)/)[1];
...which gets you the contiguous run of digits immediately following the hash mark. (Note that match() returns null if there isn't one, so check the result before indexing it with [1].)
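A sketch of the same idea with the no-match case handled, using a hypothetical helper name hashId and the example URLs from this answer:

```javascript
// Hypothetical helper: digits right after the hash mark, or null if there are none.
function hashId(href) {
  var m = /#(\d+)/.exec(href);
  return m ? m[1] : null;
}

console.log(hashId('www.37signals.com#42'));
console.log(hashId('www.blablabla.ru/user/4#3'));
console.log(hashId('www.blablabla.ru'));
```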

Use the global flag, /\D/g, to replace all the instances:
var id = window.location.href.replace(/\D/g, '' );
alert(id);
And /\D+/g gets better performance than /\D/g, according to Justin Johnson, which I think is because \D+ can match and replace a whole run of non-digits in one shot.
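To see the effect of the flag, here is a short comparison on the URL from the question (a plain string is used instead of window.location.href so that the sketch also runs outside a browser; the variable names are illustrative):

```javascript
var href = 'www.blablabla.ru#3';

// Without the g flag only the first non-digit (the leading "w") is removed.
var firstOnly = href.replace(/\D/, '');

// With the g flag every non-digit is removed, one character or one run at a time.
var perChar = href.replace(/\D/g, '');
var perRun = href.replace(/\D+/g, '');

console.log(firstOnly);
console.log(perChar, perRun);
```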
