Getting cannot read property '0' of null in a regex - javascript

I want to get the filename from an url. I'm doing this:
console.log(url)
const filename = url.match(/([^\/]+)(?=\.\w+$)/)[0]
return filename
But I get TypeError: Cannot read property '0' of null. Which is strange, because console.log(url) outputs the file: http://ac-0uhksb6K.clouddn.com/9710016c8dfcf6ae1e9d.jpg?imageView2/2/w/4096/h/2048/q/100/format/jpg
What could be the problem?

Your regex expects the filename with its extension to be the last part of a URL, while the URL in the question has also query parameters. In order to take these into account add the (?:$|\?) alternation to the lookahead:
[^\/]+(?=\.\w+(?:$|\?))
Note: the capture group is redundant so removed it as well.
Demo: https://regex101.com/r/nX2rP5/3

If there are no matches, then there is no array on which to select the first item with [0].
You'll want to check the array has at least one item first...

The error occurs because there is no match, see your regex demo. The issue is that the lookahead (?=\.\w+$) requires the .+1 or more word chars before the end of string $. You need to allow checking the ? query string start marker, too.
NOTE that you actually do not have to use lookarounds at all. Use a capturing group - ([^\/]+)\.\w+(?:\?|$) and access [1] item.
See the regex demo
Also, it is always a good idea to check if a match occurred at all before accessing capture groups.
var re = /([^\/]+)\.\w+(?:\?|$)/;
var str = 'file: http://ac-0uhksb6K.clouddn.com/9710016c8dfcf6ae1e9d.jpg?imageView2/2/w/4096/h/2048/q/100/format/jpg';
var match = str.match(re);
if (match) {
console.log(match[1]);
}

Related

javascript regex insert new element into expression

I am passing a URL to a block of code in which I need to insert a new element into the regex. Pretty sure the regex is valid and the code seems right but no matter what I can't seem to execute the match for regex!
//** Incoming url's
//** url e.g. api/223344
//** api/11aa/page/2017
//** Need to match to the following
//** dir/api/12ab/page/1999
//** Hence the need to add dir at the front
var url = req.url;
//** pass in: /^\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/
var re = myregex.toString();
//** Insert dir into regex: /^dir\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/
var regVar = re.substr(0, 2) + 'dir' + re.substr(2);
var matchedData = url.match(regVar);
matchedData === null ? console.log('NO') : console.log('Yay');
I hope I am just missing the obvious but can anyone see why I can't match and always returns NO?
Thanks
Let's break down your regex
^\/api\/ this matches the beginning of a string, and it looks to match exactly the string "/api"
([a-zA-Z0-9-_~ %]+) this is a capturing group: this one specifically will capture anything inside those brackets, with the + indicating to capture 1 or more, so for example, this section will match abAB25-_ %
(?:\/page\/([a-zA-Z0-9-_~ %]+)) this groups multiple tokens together as well, but does not create a capturing group like above (the ?: makes it non-captuing). You are first matching a string exactly like "/page/" followed by a group exactly like mentioned in the paragraph above (that matches a-z, A-Z, 0-9, etc.
?$ is at the end, and the ? means capture 0 or more of the precending group, and the $ matches the end of the string
This regex will match this string, for example: /api/abAB25-_ %/page/abAB25-_ %
You may be able to take advantage of capturing groups, however, and use something like this instead to get similar results: ^\/api\/([a-zA-Z0-9-_~ %]+)\/page\/\1?$. Here, we are using \1 to reference that first capturing group and match exactly the same tokens it is matching. EDIT: actually, this probably won't work, since the text after /api/ and the text after /page/ will most likely be different, carrying on...
Afterwards, you are are adding "dir" to the beginning of your search, so you can now match someting like this: dir/api/abAB25-_ %/page/abAB25-_ %
You have also now converted the regex to a string, so like Crayon Violent pointed out in their comment, this will break your expected funtionality. You can fix this by using .source on your regex: var matchedData = url.match(regVar.source); https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/source
Now you can properly match a string like this: dir/api/11aa/page/2017 see this example: https://repl.it/Mj8h
As mentioned by Crayon Violent in the comments, it seems you're passing a String rather than a regular expression in the .match() function. maybe try the following:
url.match(new RegExp(regVar, "i"));
to convert the string to a regular expression. The "i" is for ignore case; don't know that's what you want. Learn more here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp

Removing a letters located between to specific string

I want to make sure that the URL I get from window.location does not already contain a specific fragment identifier already. If it does, I must remove it. So I must search the URL, and find the string that starts with mp- and continues until the end URL or the next # (Just in case the URL contains more than one fragment identifier).
Examples of inputs and outputs:
www.site.com/#mp-1 --> www.site.com/
www.site.com#mp-1 --> www.site.com
www.site.com/#mp-1#pic --> www.site.com/#pic
My code:
(that obviously does not work correctly)
var url = window.location;
if(url.toLowerCase().indexOf("#mp-") >= 0){
var imgString = url.substring(url.indexOf('#mp-') + 4,url.indexOf('#'));
console.log(imgString);
}
Any idea how to do it?
Something like this? This uses a regular expression to filter the unwanted string.
var inputs = [
"www.site.com/#mp-1",
"www.site.com#mp-1",
"www.site.com/#mp-1#pic"
];
inputs = inputs.map(function(input) {
return input.replace(/#mp-1?/, '');
});
console.log(inputs);
Output:
["www.site.com/", "www.site.com", "www.site.com/#pic"]
jsfiddle: https://jsfiddle.net/tghuye75/
The regex I used /#mp-1?/ removes any strings like #mp- or #mp-1. For a string of unknown length until the next hashtag, you can use /#mp-[^#]* which removes #mp-, #mp-1, and #mp-somelongstring.
Use regular expressions:
var url = window.location;
var imgString = url.replace(/(#mp-[^#\s]+)/, "");
It removes from URL hash anything from mp- to the char before #.
Regex101 demo
You can use .replace to replace a regular expression matching ("#mp-" followed by 0 or more non-# characters) with the empty string. If it's possible there are multiple segments you want to remove, just add a g flag to the regex.
url = url.replace(/#mp-[^#]*/, '');
The window.location has the hash property so... window.location.hash
The most primitive way is to declare
var char_start, char_end
and find two "#" or one and the 2nd will be end of input.
with that... you can do what you want, the change of window.location.hash will normally affect the browser adress.
Good luck!

Javascript Regex match everything after last occurrence of string

I am trying to match everything after (but not including!) the last occurrence of a string in JavaScript.
The search, for example, is:
[quote="user1"]this is the first quote[/quote]\n[quote="user2"]this is the 2nd quote and some url https://www.google.com/[/quote]\nThis is all the text I\'m wirting about myself.\n\nLook at me ma. Javascript.
Edit: I'm looking to match everything after the last quote block. So I was trying to match everything after the last occurrence of "quote]" ? Idk if this is the best solution but its what i've been trying.
I'll be honest, i suck at this Regex stuff.. here is what i've been trying with the results..
regex = /(quote\].+)(.*)/ig; // Returns null
regex = /.+((quote\]).+)$/ig // Returns null
regex = /( .* (quote\]) .*)$/ig // Returns null
I have made a JSfiddle for anyone to have a play with here:
https://jsfiddle.net/au4bpk0e/
One option would be to match everything up until the last [/quote], and then get anything following it. (example)
/.*\[\/quote\](.*)$/i
This works since .* is inherently greedy, and it will match every up until the last \[\/quote\].
Based on the string you provided, this would be the first capturing group match:
\nThis is all the text I\'m wirting about myself.\n\nLook at me ma. Javascript.
But since your string contains new lines, and . doesn't match newlines, you could use [\s\S] in place of . in order to match anything.
Updated Example
/[\s\S]*\[\/quote\]([\s\S]*)$/i
You could also avoid regex and use the .lastIndexOf() method along with .slice():
Updated Example
var match = '[\/quote]';
var textAfterLastQuote = str.slice(str.lastIndexOf(match) + match.length);
document.getElementById('res').innerHTML = "Results: " + textAfterLastQuote;
Alternatively, you could also use .split() and then get the last value in the array:
Updated Example
var textAfterLastQuote = str.split('[\/quote]').pop();
document.getElementById('res').innerHTML = "Results: " + textAfterLastQuote;

using a lookahead to get the last occurrence of a pattern in javascript

I was able to build a regex to extract a part of a pattern:
var regex = /\w+\[(\w+)_attributes\]\[\d+\]\[own_property\]/g;
var match = regex.exec( "client_profile[foreclosure_defenses_attributes][0][own_property]" );
match[1] // "foreclosure_defenses"
However, I also have a situation where there will be a repetitive pattern like so:
"client_profile[lead_profile_attributes][foreclosure_defenses_attributes][0][own_property]"
In that case, I want to ignore [lead_profile_attributes] and just extract the portion of the last occurence as I did in the first example. In other words, I still want to match "foreclosure_defenses" in this case.
Since all patterns will be like [(\w+)_attributes], I tried to do a lookahead, but it is not working:
var regex = /\w+\[(\w+)_attributes\](?!\[(\w+)_attributes\])\[\d+\]\[own_property\]/g;
var match = regex.exec("client_profile[lead_profile_attributes][foreclosure_defenses_attributes][0][own_property]");
match // null
match returns null meaning that my regex isn't working as expected. I added the following:
\[(\w+)_attributes\](?!\[(\w+)_attributes\])
Because I want to match only the last occurrence of the following pattern:
[lead_profile_attributes][foreclosure_defenses_attributes]
I just want to grab the foreclosure_defenses, not the lead_profile.
What might I be doing wrong?
I think I got it working without positive lookahead:
regex = /(\[(\w+)_attributes\])+/
/(\[(\w+)_attributes\])+/
match = regex.exec(str);
["[a_attributes][b_attributes][c_attributes]", "[c_attributes]", "c"]
I was able to also achieve it through noncapturing groups. Output from chrome console:
var regex = /(?:\w+(\[\w+\]\[\d+\])+)(\[\w+\])/;
undefined
regex
/(?:\w+(\[\w+\]\[\d+\])+)(\[\w+\])/
str = "profile[foreclosure_defenses_attributes][0][properties_attributes][0][other_stuff]";
"profile[foreclosure_defenses_attributes][0][properties_attributes][0][other_stuff]"
match = regex.exec(str);
["profile[foreclosure_defenses_attributes][0][properties_attributes][0][other_stuff]", "[properties_attributes][0]", "[other_stuff]"]

How to find in javascript with regular expression string from url?

Good evening, How can I find in javascript with regular expression string from url address for example i have url: http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/ and I need only string between last slashes (/ /) http://something.cz/something/string/ in this example word that i need is mikronebulizer. Thank you very much for you help.
You could use a regex match with a group.
Use this:
/([\w\-]+)\/$/.exec("http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/")[1];
Here's a jsfiddle showing it in action
This part: ([\w\-]+)
Means at least 1 or more of the set of alphanumeric, underscore and hyphen and use it as the first match group.
Followed by a /
And then finally the: $
Which means the line should end with this
The .exec() returns an array where the first value is the full match (IE: "mikronebulizer/") and then each match group after that.
So .exec()[1] returns your value: mikronebulizer
Simply:
url.match(/([^\/]*)\/$/);
Should do it.
If you want to match (optionally) without a trailing slash, use:
url.match(/([^\/]*)\/?$/);
See it in action here: http://regex101.com/r/cL3qG3
If you have the url provided, then you can do it this way:
var url = 'http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/';
var urlsplit = url.split('/');
var urlEnd = urlsplit[urlsplit.length- (urlsplit[urlsplit.length-1] == '' ? 2 : 1)];
This will match either everything after the last slash, if there's any content there, and otherwise, it will match the part between the second-last and the last slash.
Something else to consider - yes a pure RegEx approach might be easier (heck, and faster), but I wanted to include this simply to point out window.location.pathName.
function getLast(){
// Strip trailing slash if present
var path = window.location.pathname.replace(/\/$?/, '');
return path.split('/').pop();
}
Alternatively you could get using split:
var pieces = "http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/".split("/");
var lastSegment = pieces[pieces.length - 2];
// lastSegment == mikronebulizer
var url = 'http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/';
if (url.slice(-1)=="/") {
url = url.substr(0,url.length-1);
}
var lastSegment = url.split('/').pop();
document.write(lastSegment+"<br>");

Categories

Resources