Get everything except match in javascript regular expression - javascript

I have the following regex to get the first part after a url:
^http[s]?:\/\/.*?\/([a-zA-Z-_.%]+).*$
It matches test in the below urls:
foo.com
http://foo.com
http://foo.com/test
http://foo.com/test/
http://foo.com/test?bar
What I'm now trying to do is recreate the same url, but replace test with a different value. Either by taking the parts before and after the match or reversing the result.
I'm sure there's a regexy way of doing this, but I'm unable to find out how to do so.

You can use a capturing group for part before /test and use it as back-reference in replacement:
var str = 'http://foo.com/test?bar';
var re = /^(https?:\/\/[^\/]+\/)[^?\/]+/gmi;
var subst = '$1foobar';
var result = str.replace(re, subst); // 'http://foo.com/foobar?bar'
[^?\/]+ matches the text up to the next / or ? after the domain name in the URL. Like your original regex, it assumes that URLs start with http:// or https://.
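As a minimal sketch of the same idea, the replacement can be wrapped in a small helper and run against the sample URLs from the question. The name replaceFirstSegment is hypothetical, and the g and m flags are dropped on the assumption that each call receives a single URL.

```javascript
// Hypothetical helper: swap the first path segment of an http(s) URL.
function replaceFirstSegment(url, replacement) {
  // Group 1 keeps scheme, host, and the slash; [^?\/]+ is the segment to replace.
  var re = /^(https?:\/\/[^\/]+\/)[^?\/]+/i;
  // A replacer function keeps any "$" in the replacement from being read as a group reference.
  return url.replace(re, function (match, prefix) {
    return prefix + replacement;
  });
}

console.log(replaceFirstSegment('http://foo.com/test', 'foobar'));     // http://foo.com/foobar
console.log(replaceFirstSegment('http://foo.com/test/', 'foobar'));    // http://foo.com/foobar/
console.log(replaceFirstSegment('http://foo.com/test?bar', 'foobar')); // http://foo.com/foobar?bar
console.log(replaceFirstSegment('http://foo.com', 'foobar'));          // http://foo.com (no segment, unchanged)
```

URLs without a path segment do not match the pattern, so they come back unchanged.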

Related

Getting element from filename using continuous split or regex

I currently have the following string :
AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv
But I would like to split it to only get the following result (removing all tree directories + removing timestamp before the file):
1564416946615-file-test.dsv
I currently have the following code, but it doesn't work when the filename itself contains a '-', as in the example.
getFilename(str){
return(str.split('\\').pop().split('/').pop().split('-')[1]);
}
I don't want to use a loop, for performance reasons (I may have lots of files to work with). Is there another solution (maybe a regex)?
We can try doing a regex replacement with the following pattern:
.*\/\d+-\b
Replacing the match with empty string should leave you with the result you want.
var filename = "AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv";
var output = filename.replace(/.*\/\d+-\b/, "");
console.log(output);
The pattern works by using .*/ to first consume everything up to, and including, the final path separator. Then, \d+- consumes the timestamp as well as the dash that follows, leaving only the portion you want.
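A sketch of the same replacement wrapped in a function. The name stripPathAndTimestamp is hypothetical. The sketch also assumes that paths may use backslashes (as the split('\\') in the question suggests) and that a path with no directory part should still be handled, so the directory prefix is made optional.

```javascript
function stripPathAndTimestamp(path) {
  // (?:.*[\\\/])? greedily consumes everything up to the last separator, if any;
  // \d+- then consumes the leading timestamp and the dash after it.
  return path.replace(/^(?:.*[\\\/])?\d+-/, '');
}

console.log(stripPathAndTimestamp('AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv'));
// 1564416946615-file-test.dsv
console.log(stripPathAndTimestamp('AAAAA\\BBBBB\\1565079415419-file-test.dsv'));
// file-test.dsv
console.log(stripPathAndTimestamp('1565079415419-file-test.dsv'));
// file-test.dsv
```

When the file name has no numeric prefix followed by a dash, nothing matches and the input is returned unchanged, directories included.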
You may use this regex and get captured group #1:
/[^\/-]+-(.+)$/
RegEx Details:
[^\/-]+: Match 1+ characters that are neither / nor -
-: Match literal -
(.+): Match 1+ of any characters
$: End
Code:
var filename = "AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv";
var m = filename.match(/[^\/-]+-(.+)$/);
console.log(m[1]);
//=> 1564416946615-file-test.dsv
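For comparison, here is a regex-free sketch built on lastIndexOf and indexOf, which also avoids an explicit loop. It is not taken from either answer; the function name is hypothetical, and it assumes the timestamp is everything before the first - in the file name.

```javascript
function filenameWithoutTimestamp(path) {
  // Position just past the last separator ('/' or '\'); 0 when there is none.
  var start = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\')) + 1;
  var base = path.slice(start);
  // Drop everything up to and including the first dash, if there is one.
  var dash = base.indexOf('-');
  return dash === -1 ? base : base.slice(dash + 1);
}

console.log(filenameWithoutTimestamp('AAAAA/BBBBB/1565079415419-1564416946615-file-test.dsv'));
// 1564416946615-file-test.dsv
```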

What RegEx would clean up this set of inputs?

I'm trying to figure out a RegEx that would match the following:
.../string-with-no-spaces -> string-with-no-spaces
or
string-with-no-spaces:... -> string-with-no-spaces
or
.../string-with-no-spaces:... -> string-with-no-spaces
where ... can be anything in these example strings:
example.com:8080/string-with-no-spaces:latest
string-with-no-spaces:latest
example.com:8080/string-with-no-spaces
string-with-no-spaces
and a bonus would be
http://example.com:8080/string-with-no-spaces:latest
and all would match string-with-no-spaces.
Is it possible for a single RegEx to cover all those cases?
So far I've gotten as far as /\/.+(?=:)/, but that not only includes the slash, it also only works for case 3. Any ideas?
Edit: Also I should mention that I'm using Node.js, so ideally the solution should pass all of these: https://jsfiddle.net/ys0znLef/
How about:
(?:.*/)?([^/:\s]+)(?::.*|$)
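To check this pattern in Node.js, a short sketch can run it over the inputs from the question. The slashes are escaped because the pattern is written as a regex literal; the variable and function names are illustrative only.

```javascript
var namePattern = /(?:.*\/)?([^\/:\s]+)(?::.*|$)/;

function extractName(input) {
  var m = input.match(namePattern);
  return m ? m[1] : null; // group 1 holds the name without path or tag
}

var inputs = [
  'example.com:8080/string-with-no-spaces:latest',
  'string-with-no-spaces:latest',
  'example.com:8080/string-with-no-spaces',
  'string-with-no-spaces',
  'http://example.com:8080/string-with-no-spaces:latest'
];

inputs.forEach(function (input) {
  console.log(input + ' -> ' + extractName(input));
});
```

Because .* is greedy, the optional prefix group runs up to the last slash, which is why the port in example.com:8080 is not mistaken for the tag separator.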
Consider the following solution using specific regex pattern and String.match function:
// (?:[/]|^)     - non-capturing group: the wanted string is preceded by '/' or starts the text
// (?:[:][^/]|$) - non-capturing group: the wanted string is followed by ':' plus a non-'/' character, or ends the text
var re = /(?:[/]|^)([^/:.]+?)(?:[:][^/]|$)/,
    searchString = function (str) {
        var result = str.match(re);
        return result ? result[1] : null; // null when nothing matches
    };
console.log(searchString("example.com:8080/string-with-no-spaces"));
console.log(searchString("string-with-no-spaces:latest"));
console.log(searchString("string-with-no-spaces"));
console.log(searchString("http://example.com:8080/string-with-no-spaces:latest"));
The output for all the cases above will be string-with-no-spaces
Here's the expression I've got... just trying to tweak to use the slash but not include it.
Updated result works in JS
\S([a-zA-Z0-9.:/\-]+)\S
//works on regexr, regex storm, & regex101 - tested with a local html file to confirm JS matches strings
var re = /\S([a-zA-Z0-9.:/\-]+)\S/;

Simple Nodejs Regex: Extract text from between two strings

I'm trying to extract the Vine ID from the following URL:
https://vine.co/v/Mipm1LMKVqJ/embed
I'm using this regex:
/v/(.*)/
and testing it here: http://regexpal.com/
...but it's matching the V and closing "/". How can I just get "Mipm1LMKVqJ", and what would be the cleanest way to do this in Node?
You need to reference the first match group in order to print the match result only.
var re = new RegExp('/v/(.*)/');
var r = 'https://vine.co/v/Mipm1LMKVqJ/embed'.match(re);
if (r)
console.log(r[1]); //=> "Mipm1LMKVqJ"
Note: If the URL format changes often, I recommend using the lazy quantifier *? to prevent a greedy match.
Alternatively, given the shape of this URL, consider splitting it instead.
var r = 'https://vine.co/v/Mipm1LMKVqJ/embed'.split('/')[4]
console.log(r); //=> "Mipm1LMKVqJ"
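To illustrate the note about greediness, the sketch below compares the greedy and lazy patterns on a URL with one extra path segment. That longer URL is hypothetical, made up only to show the difference.

```javascript
// Hypothetical URL with an extra trailing segment.
var url = 'https://vine.co/v/Mipm1LMKVqJ/embed/simple';

// Greedy: (.*) runs as far as it can, so it stops at the LAST slash.
var greedy = url.match(/\/v\/(.*)\//);
// Lazy: (.*?) stops at the FIRST slash after /v/.
var lazy = url.match(/\/v\/(.*?)\//);

console.log(greedy[1]); // "Mipm1LMKVqJ/embed"
console.log(lazy[1]);   // "Mipm1LMKVqJ"
```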

Regex to get a specific query string variable in a URL

I have a URL like
server/area/controller/action/4/?param=2
in which the server can be
http://localhost/abc
https://test.abc.com
https://abc.om
I want to get the first character after "action/", which is 4 in the above URL, with a regex. Is this possible with a regex in JavaScript, or is there another way?
Use regex \d+(?=\/\?)
var url = "server/area/controller/action/4/?param=2";
var match = url.match(/\d+(?=\/\?)/); // digits directly followed by "/?"
var param = match ? match[0] : null;  // "4"
Use this regex in JavaScript:
action/(.)
The first capturing group will contain the first character after action/.
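A short sketch of reading that group in JavaScript, using the sample URL from the question. The \d+ variant is an addition that assumes the value after action/ is numeric and may be longer than one character.

```javascript
var url = 'server/area/controller/action/4/?param=2';

// First character after "action/".
var first = url.match(/action\/(.)/);
var firstChar = first ? first[1] : null;

// Assumption: the value is a run of digits, possibly more than one.
var digits = url.match(/action\/(\d+)/);
var actionId = digits ? digits[1] : null;

console.log(firstChar); // "4"
console.log(actionId);  // "4"
```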
This approach splits the URL on the / characters and extracts the second-to-last element:
var url = "server/area/controller/action/4/?param=2".split('/').slice(-2, -1)[0];

Regex: Getting content from URL

I want to get "the-game" using regex from URLs like
http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/
http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/
http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/
What parts of the URL could vary and what parts are constant? The following regex will always match whatever is in the slashes following "/en/" - the-game in your example.
(?<=/en/).*?(?=/)
This one will match the contents of the 2nd set of slashes of any URL containing "webdev", assuming the first set of slashes contains a 2 or 3 character language code.
(?<=.*?webdev.*?/.{2,3}/).*?(?=/)
Hopefully you can tweak these examples to accomplish what you're looking for.
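If these patterns are used in JavaScript, note that lookbehind was only added to the language in ES2018, so older engines reject it. Below is a sketch of the first pattern rewritten with a capturing group, which avoids lookbehind entirely (the function name is hypothetical).

```javascript
function segmentAfterEn(url) {
  // Match the literal /en/ and capture everything up to the next slash.
  var m = /\/en\/([^\/]*)\//.exec(url);
  return m ? m[1] : null;
}

console.log(segmentAfterEn('http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/'));
// the-game
```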
var subject = "http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/";
var myregexp = /^(?:[^\/]*\/){4}([^\/]+)/;
var match = myregexp.exec(subject);
var result;
if (match != null) {
    result = match[1]; // "the-game"
} else {
    result = "";
}
This matches whatever lies between the fourth and fifth slashes and stores it in the variable result.
You should probably use a URL-parsing library rather than resorting to a regex.
In Python:
# Python 3; in Python 2 this module was named urlparse
from urllib.parse import urlparse
url = urlparse('http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/')
print(url.path)
Which would yield:
/en/the-game/another-one/another-one/another-one/
From there, you can do simple things like stripping /en/ from the beginning of the path. Otherwise, you're bound to do something wrong with a regular expression. Don't reinvent the wheel!
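The same approach can be sketched in JavaScript with the built-in URL class, assuming an environment that provides it as a global (modern browsers and current Node.js releases do). The variable names are illustrative.

```javascript
var url = new URL('http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/');

// url.pathname is "/en/the-game/another-one/another-one/another-one/".
// Splitting on "/" leaves empty strings at both ends; filter(Boolean) drops them.
var segments = url.pathname.split('/').filter(Boolean);

console.log(segments[0]); // "en"
console.log(segments[1]); // "the-game"
```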
