get first url segment with a javascript regex - javascript

I have this regexp to extract instagram.com usernames
const matches = value.match(
/^(?:#|(?:https?:\/\/)?(?:www\.)?instagr(?:\.am|am\.com)\/)?(\w+)\/?$/
);
console.log(matches[1])
It works fine with www.instagram.com/username but it doesn't work with www.instagram.com/username?ref=google
How can I extract only the username from the URL?
Thanks for your help.

Alternatively, do not use a regex, e.g.
const url = "www.instagram.com/username?ref=google";
const oUrl = new URL("http://" + url);
console.log(oUrl.pathname.substring(1));
or
let url = "instagram.com/username?ref=google";
if (!url.startsWith("http://") && !url.startsWith("https://")) {
url = "http://" + url;
}
const oUrl = new URL(url);
console.log(oUrl.pathname.substring(1));
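Here is a minimal sketch of the same idea wrapped in a function, assuming the input may or may not carry a scheme. The name firstPathSegment is made up for illustration, and the scheme check uses a small regex instead of startsWith.

```javascript
// Hypothetical helper: returns the first path segment of a URL-like string.
function firstPathSegment(input) {
  // Prepend a scheme only when none is present, so new URL() can parse it.
  const withScheme = /^https?:\/\//i.test(input) ? input : "http://" + input;
  const pathname = new URL(withScheme).pathname;
  // pathname starts with "/", so drop empty pieces and take the first one.
  const segments = pathname.split("/").filter(Boolean);
  return segments.length > 0 ? segments[0] : "";
}

console.log(firstPathSegment("www.instagram.com/username?ref=google")); // "username"
console.log(firstPathSegment("https://instagram.com/username/"));       // "username"
```

Unlike pathname.substring(1), this also copes with a trailing slash or deeper paths, at the cost of a few more lines.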

The $ at the end matches only at the end of the string, but the username isn't necessarily the last thing in the string. To permit ? after the username as well, use (?:$|\?) instead of $:
^(?:#|(?:https?:\/\/)?(?:www\.)?instagr(?:\.am|am\.com)\/)?(\w+)\/?(?:$|\?)
https://regex101.com/r/pbpi74/1
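To see the adjusted pattern against both inputs from the question, here is a small sketch. The wrapper name extractUsername is only for illustration; it returns null when nothing matches rather than throwing on matches[1].

```javascript
// Same pattern as above, with (?:$|\?) in place of the final $.
const USERNAME_RE =
  /^(?:#|(?:https?:\/\/)?(?:www\.)?instagr(?:\.am|am\.com)\/)?(\w+)\/?(?:$|\?)/;

// Hypothetical wrapper: returns the captured username or null.
function extractUsername(value) {
  const matches = value.match(USERNAME_RE);
  return matches ? matches[1] : null;
}

console.log(extractUsername("www.instagram.com/username"));            // "username"
console.log(extractUsername("www.instagram.com/username?ref=google")); // "username"
```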

You can also try a non-regex way using .substring. You may find it cleaner than regex.
It works with both URLs.
let username = url.substring(
url.lastIndexOf("/") + 1,
url.lastIndexOf("?") > 0? url.lastIndexOf("?") : url.length
);
Check fiddle:
https://jsfiddle.net/whx5otvp/

Related

Validating url with http:www as optional using regex

I am trying to validate a URL where the "http://www." prefix is optional, so both yahoo.com and http://www.yahoo.com need to be valid URLs, but the following regex does not accept url3 as valid.
How can I fix this?
function checkUrlTest(url){
var urlregex = new RegExp("^(https?:\/\/www\.)?(^(https?:\/\/www\.)[0-9A-Za-z]+\.+[a-z]{2,5})");
return urlregex.test(url);
}
url3 = "yahoo.com";
url4 = "www.yahoo.com";
alert(checkUrlTest(url3));
(http://)?(www\.)?[A-Za-z0-9]+\.[a-z]{2,3}
In this regex, http://www.yahoo.com, http://yahoo.com and www.yahoo.com are all valid URLs
Just check it out. All problems will resolve.
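For comparison, here is a sketch of the same pattern with ^ and $ anchors added, so that test() has to match the whole string instead of any substring. Accepting https, allowing a hyphen in the host label and allowing 2 to 5 letters in the TLD are my assumptions, not part of the pattern above, and the function name is made up.

```javascript
// Anchored variant: optional scheme, optional "www.", one host label, one TLD.
const SIMPLE_URL_RE = /^(https?:\/\/)?(www\.)?[A-Za-z0-9-]+\.[a-z]{2,5}$/;

// Hypothetical helper name. Note that paths and extra subdomains are rejected.
function looksLikeSimpleUrl(url) {
  return SIMPLE_URL_RE.test(url);
}

console.log(looksLikeSimpleUrl("yahoo.com"));            // true
console.log(looksLikeSimpleUrl("www.yahoo.com"));        // true
console.log(looksLikeSimpleUrl("http://www.yahoo.com")); // true
console.log(looksLikeSimpleUrl("yahoo"));                // false
```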
var rgx = /^\s*(http\:\/\/)?([a-z\d\-]{1,63}\.)*[a-z\d\-]{1,255}\.[a-z]{2,6}\s*$/;
Working Demo http://jsfiddle.net/fy66p/
The solution is a negative lookahead (see http://www.regular-expressions.info/lookaround.html#lookahead) to handle the www case; with that you should get what you are looking for. Let me know how it goes!
Hope it fits your needs :)
code
function checkUrlTest(url){
// Try this
var urlregex = new RegExp("^(?!www | www\.)[A-Za-z0-9_-]+\.+[A-Za-z0-9.\/%&=\?_:;-]+$")
return urlregex.test(url);
}
url3 = "yahoo.com";
url4 = "www.yahoo.com";
alert('===> ' + checkUrlTest(url4) + '===> ' + checkUrlTest(url3));
function validateUrl(value)
{
var regexp = /(ftp|http|https):\/\/(\w+:{0,1}\w*#)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%#!\-\/]))?/
return regexp.test(value);
}
if not try this:
(?i)\b((?:(?:[a-z][\w-]+:)?(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))

javascript/jquery add trailing slash to url (if not present)

I'm making a small web app in which a user enters a server URL from which it pulls a load of data with an AJAX request.
Since the user has to enter the URL manually, people generally forget the trailing slash, even though it's required (as some data is appended to the url entered). I need a way to check if the slash is present, and if not, add it.
This seems like a problem that jQuery would have a one-liner for, does anyone know how to do this or should I write a JS function for it?
var lastChar = url.substr(-1); // Selects the last character
if (lastChar != '/') { // If the last character is not a slash
url = url + '/'; // Append a slash to it.
}
The temporary variable can be omitted and the expression embedded directly in the condition:
if (url.substr(-1) != '/') url += '/';
Since the goal is changing the url with a one-liner, the following solution can also be used:
url = url.replace(/\/?$/, '/');
If the trailing slash exists, it is replaced with /.
If the trailing slash does not exist, a / is appended to the end (to be exact: The trailing anchor is replaced with /).
url += url.endsWith("/") ? "" : "/"
I added to the regex solution to accommodate query strings:
http://jsfiddle.net/hRheW/8/
url.replace(/\/?(\?|#|$)/, '/$1')
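A quick sketch of how that pattern behaves on a few inputs. The function name ensureTrailingSlash and the example.com URLs are just for illustration. Because the regex has no g flag, only the first ?, # or end of string is touched.

```javascript
// Insert a slash before the first "?" or "#", or at the end of the string,
// unless a slash is already there.
function ensureTrailingSlash(url) {
  return url.replace(/\/?(\?|#|$)/, "/$1");
}

console.log(ensureTrailingSlash("http://example.com/path"));      // "http://example.com/path/"
console.log(ensureTrailingSlash("http://example.com/path/"));     // "http://example.com/path/"
console.log(ensureTrailingSlash("http://example.com/path?x=1"));  // "http://example.com/path/?x=1"
console.log(ensureTrailingSlash("http://example.com/path#frag")); // "http://example.com/path/#frag"
```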
This works as well:
url = url.replace(/\/$|$/, '/');
Example:
let urlWithoutSlash = 'https://www.example.com/path';
urlWithoutSlash = urlWithoutSlash.replace(/\/$|$/, '/');
console.log(urlWithoutSlash);
let urlWithSlash = 'https://www.example.com/path/';
urlWithSlash = urlWithSlash.replace(/\/$|$/, '/');
console.log(urlWithSlash);
Output:
https://www.example.com/path/
https://www.example.com/path/
It replaces either the trailing slash or no trailing slash with a trailing slash. So if the slash is present, it replaces it with one (essentially leaving it there); if one is not present, it adds the trailing slash.
You can do something like:
var url = 'http://stackoverflow.com';
if (!url.match(/\/$/)) {
url += '/';
}
Here's the proof: http://jsfiddle.net/matthewbj/FyLnH/
The URL class is pretty awesome - it helps us change the path and takes care of query parameters and fragment identifiers
function addTrailingSlash(u) {
const url = new URL(u);
url.pathname += url.pathname.endsWith("/") ? "" : "/";
return url.toString();
}
addTrailingSlash('http://example.com/slug?page=2');
// result: "http://example.com/slug/?page=2"
You can read more about URL on MDN
Before finding this question and its answers I created my own approach. I post it here as I don't see anything similar.
function addSlashToUrl() {
// If there is no trailing slash after the path in the url, add it
if (window.location.pathname.endsWith('/') === false) {
var url = window.location.protocol + '//' +
window.location.host +
window.location.pathname + '/' +
window.location.search;
window.history.replaceState(null, document.title, url);
}
}
Not every URL can be completed with slash at the end. There are at least several conditions that do not allow one:
String after last existing slash is something like index.html.
There are parameters: /page?foo=1&bar=2.
There is link to fragment: /page#tomato.
I have written a function for adding slash if none of the above cases are present. There are also two additional functions for checking the possibility of adding slash and for breaking URL into parts. Last one is not mine, I've given a link to the original one.
const SLASH = '/';
function appendSlashToUrlIfIsPossible(url) {
var resultingUrl = url;
var slashAppendingPossible = slashAppendingIsPossible(url);
if (slashAppendingPossible) {
resultingUrl += SLASH;
}
return resultingUrl;
}
function slashAppendingIsPossible(url) {
// Slash is possible to add to the end of url in following cases:
// - There is no slash standing as last symbol of URL.
// - There is no file extension (or there is no dot inside part called file name).
// - There are no parameters (even empty ones — single ? at the end of URL).
// - There is no link to a fragment (even empty one — single # mark at the end of URL).
var slashAppendingPossible = false;
var parsedUrl = parseUrl(url);
// Checking for slash absence.
var path = parsedUrl.path;
var lastCharacterInPath = path.substr(-1);
var noSlashInPathEnd = lastCharacterInPath !== SLASH;
// Check for extension absence.
const FILE_EXTENSION_REGEXP = /\.[^.]*$/;
var noFileExtension = !FILE_EXTENSION_REGEXP.test(parsedUrl.file);
// Check for parameters absence.
var noParameters = parsedUrl.query.length === 0;
// Check for link to fragment absence.
var noLinkToFragment = parsedUrl.hash.length === 0;
// All checks above cannot guarantee that there is no '?' or '#' symbol at the end of URL.
// It is required to be checked manually.
var NO_SLASH_HASH_OR_QUESTION_MARK_AT_STRING_END_REGEXP = /[^\/#?]$/;
var noStopCharactersAtTheEndOfRelativePath = NO_SLASH_HASH_OR_QUESTION_MARK_AT_STRING_END_REGEXP.test(parsedUrl.relative);
slashAppendingPossible = noSlashInPathEnd && noFileExtension && noParameters && noLinkToFragment && noStopCharactersAtTheEndOfRelativePath;
return slashAppendingPossible;
}
// parseUrl function is based on following one:
// http://james.padolsey.com/javascript/parsing-urls-with-the-dom/.
function parseUrl(url) {
var a = document.createElement('a');
a.href = url;
const DEFAULT_STRING = '';
var getParametersAndValues = function (a) {
var parametersAndValues = {};
const QUESTION_MARK_IN_STRING_START_REGEXP = /^\?/;
const PARAMETERS_DELIMITER = '&';
const PARAMETER_VALUE_DELIMITER = '=';
var parametersAndValuesStrings = a.search.replace(QUESTION_MARK_IN_STRING_START_REGEXP, DEFAULT_STRING).split(PARAMETERS_DELIMITER);
var parametersAmount = parametersAndValuesStrings.length;
for (let index = 0; index < parametersAmount; index++) {
if (!parametersAndValuesStrings[index]) {
continue;
}
let parameterAndValue = parametersAndValuesStrings[index].split(PARAMETER_VALUE_DELIMITER);
let parameter = parameterAndValue[0];
let value = parameterAndValue[1];
parametersAndValues[parameter] = value;
}
return parametersAndValues;
};
const PROTOCOL_DELIMITER = ':';
const SYMBOLS_AFTER_LAST_SLASH_AT_STRING_END_REGEXP = /\/([^\/?#]+)$/i;
// Stub for the case when regexp match method returns null.
const REGEXP_MATCH_STUB = [null, DEFAULT_STRING];
const URL_FRAGMENT_MARK = '#';
const NOT_SLASH_AT_STRING_START_REGEXP = /^([^\/])/;
// Replace methods uses '$1' to place first capturing group.
// In NOT_SLASH_AT_STRING_START_REGEXP regular expression that is the first
// symbol in case something else, but not '/' has taken first position.
const ORIGINAL_STRING_PREPENDED_BY_SLASH = '/$1';
const URL_RELATIVE_PART_REGEXP = /tps?:\/\/[^\/]+(.+)/;
const SLASH_AT_STRING_START_REGEXP = /^\//;
const PATH_SEGMENTS_DELIMITER = '/';
return {
source: url,
protocol: a.protocol.replace(PROTOCOL_DELIMITER, DEFAULT_STRING),
host: a.hostname,
port: a.port,
query: a.search,
parameters: getParametersAndValues(a),
file: (a.pathname.match(SYMBOLS_AFTER_LAST_SLASH_AT_STRING_END_REGEXP) || REGEXP_MATCH_STUB)[1],
hash: a.hash.replace(URL_FRAGMENT_MARK, DEFAULT_STRING),
path: a.pathname.replace(NOT_SLASH_AT_STRING_START_REGEXP, ORIGINAL_STRING_PREPENDED_BY_SLASH),
relative: (a.href.match(URL_RELATIVE_PART_REGEXP) || REGEXP_MATCH_STUB)[1],
segments: a.pathname.replace(SLASH_AT_STRING_START_REGEXP, DEFAULT_STRING).split(PATH_SEGMENTS_DELIMITER)
};
}
There might also be other cases where adding a slash is not possible. If you know of some, please comment on my answer.
For those who have different inputs, like http://example.com or http://example.com/eee: a trailing slash should not be added in the second case.
There is the serialization option using .href, which adds a trailing slash only after the domain (host).
In NodeJs,
You would use the url module like this:
const url = require('url');
// url.parse is the legacy Node.js API; it still exposes the serialized form as .href
let parsed = url.parse('http://google.com');
console.log(parsed.href);
In pure JS, you would use
var a = document.createElement('a'); // an anchor element normalizes whatever is assigned to href
a.href = "http://stackoverflow.com";
console.log(a.href); // "http://stackoverflow.com/"
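The same serialization behaviour can be sketched with the standard URL class, which is a global in browsers and in current Node.js versions. The function name normalizeHref and the example.com inputs are only for illustration.

```javascript
// Parsing and re-serializing adds "/" after a bare host,
// but leaves a non-empty path untouched.
function normalizeHref(input) {
  return new URL(input).href;
}

console.log(normalizeHref("http://example.com"));     // "http://example.com/"
console.log(normalizeHref("http://example.com/eee")); // "http://example.com/eee"
```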

Remove last element from url

I need to remove the last part of the url from a span..
I have
<span st_url="http://localhost:8888/careers/php-web-developer-2"
st_title="PHP Web Developer 2" class="st_facebook_large" displaytext="facebook"
st_processed="yes"></span></span>
And I need to take the st_url and remove the php-web-developer-2 from it so it is just http://localhost:8888/careers/.
But I am not sure how to do that. php-web-developer-2 will not always be that but it won't have any / in it. It will always be a - separated string.
Any Help!!??
as simple as this:
var to = url.lastIndexOf('/');
to = to == -1 ? url.length : to + 1;
url = url.substring(0, to);
Here is a slightly simpler way:
url = url.slice(0, url.lastIndexOf('/') + 1); // + 1 keeps the trailing slash
$('span').attr('st_url', function(i, url) {
var str = url.substr(url.lastIndexOf('/') + 1) + '$';
return url.replace( new RegExp(str), '' );
});
Use this.
$('span').attr('st_url', function(i, url) {
var to = url.lastIndexOf('/') + 1;
// Return the new value; without a return, .attr() leaves st_url unchanged.
return url.substring(0, to);
});
You could use a regular expression to parse the 'last piece of the url':
var url="http://localhost:8888/careers/php-web-developer";
var baseurl=url.replace(new RegExp("(.*/)[^/]+$"),"$1");
The RegExp thing basically says: "match anything, then a slash and then all non-slashes till the end of the string".
The replace function takes that matching part, and replaces it with the "anything, then a slash" part of the string.
RegexBuddy has a great deal of information on all this.
You can see it work here: http://jsfiddle.net/xKxLR/
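As a rough sketch, the same replacement can be written with a regex literal and wrapped in a function. The name removeLastSegment is hypothetical.

```javascript
// Capture everything up to and including the last slash, then drop
// the run of non-slash characters that follows it.
function removeLastSegment(url) {
  return url.replace(/(.*\/)[^\/]+$/, "$1");
}

console.log(removeLastSegment("http://localhost:8888/careers/php-web-developer-2"));
// "http://localhost:8888/careers/"
console.log(removeLastSegment("http://localhost:8888/careers/"));
// unchanged, because nothing follows the last slash
```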
var url = "http://localhost:8888/careers/php-web-developer-2";
var regex = new RegExp('/[^/]*$');
console.log(url.replace(regex, '/'));
First you need to parse the tag. Next, extract the st_url value, which is your URL. Then loop backwards from the last character of the extracted URL, dropping characters until you reach a '/'. That gives you what you want. Keep this in mind and try to write the code.
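A minimal sketch of the loop described above, assuming the st_url value has already been read into a string (for example with getAttribute). The function name is made up for illustration.

```javascript
// Walk backwards from the end of the string and stop right after the last "/".
function stripLastSegment(url) {
  let end = url.length;
  while (end > 0 && url[end - 1] !== "/") {
    end--;
  }
  // If the string contains no slash at all, this returns an empty string.
  return url.slice(0, end);
}

console.log(stripLastSegment("http://localhost:8888/careers/php-web-developer-2"));
// "http://localhost:8888/careers/"
```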

Javascript remove characters until the 3rd slash /

What's the best way, based on the input below, to get everything in the url after the domain:
var url = "http://www.domain.com.uk/sadsad/asdsadsad/asdasdasda/?asda=ggy";
var url = "http://www.domain.com.uk/asdsadsad/asdasdasda/#45435";
var url = "http://www.domain.com.uk/asdasdasda/?324324";
var url = "http://www.domain.com.uk/asdasdasda/";
The output:
url = "/sadsad/asdsadsad/asdasdasda/?asda=ggy";
url = "/asdsadsad/asdasdasda/#45435";
url = "/asdasdasda/?324324";
UPDATE: the domain is not always the same. (sorry)
Thx
You should really parse the URI.
http://stevenlevithan.com/demo/parseuri/js/
Every absolute URL consists of a protocol, separated by two slashes, followed by a host, followed by a pathname. An implementation can look like:
// Search for the index of the first //, then search the next slash after it
var slashOffset = url.indexOf("/", url.indexOf("//") + 2);
url = url.substr(slashOffset);
If the domain is always the same, a simple replace will work fine:
var url = "http://www.domain.com.uk/sadsad/asdsadsad/asdasdasda/?asda=ggy";
var afterDomain = url.replace(/^http:\/\/www\.domain\.com\.uk/, ""); // a string pattern would treat "^" as a literal character
You could also use RegEx:
var url = "http://www.domain.com.uk/sadsad/asdsadsad/asdasdasda/?asda=ggy";
var afterDomain = url.replace(/^[^\/]*(?:\/[^\/]*){2}/, "");
Assuming this is in the browser, creating an anchor element will do a lot of magic on your behalf:
var a=document.createElement('a');
a.href="http://somedomain/iouhowe/ewouho/wiouhfe?jjj";
alert(a.pathname + a.search + a.hash); // /iouhowe/ewouho/wiouhfe?jjj
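Outside the browser, or if you would rather not create a DOM element, the URL class exposes the same three parts. A small sketch, assuming the input is always an absolute URL; the function name pathAfterHost is hypothetical.

```javascript
// Everything after the host: path, query string and fragment.
function pathAfterHost(input) {
  const u = new URL(input); // throws a TypeError on relative or malformed input
  return u.pathname + u.search + u.hash;
}

console.log(pathAfterHost("http://www.domain.com.uk/sadsad/asdsadsad/asdasdasda/?asda=ggy"));
// "/sadsad/asdsadsad/asdasdasda/?asda=ggy"
console.log(pathAfterHost("http://www.domain.com.uk/asdsadsad/asdasdasda/#45435"));
// "/asdsadsad/asdasdasda/#45435"
```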

FileName from url excluding querystring

I have a url :
http://www.xyz.com/a/test.jsp?a=b&c=d
How do I get test.jsp of it ?
This should do it:
var path = document.location.pathname,
file = path.substr(path.lastIndexOf('/') + 1); // + 1 skips the slash itself
Reference: document.location, substr, lastIndexOf
I won't just show you the answer, but I'll point you in the right direction. First, strip out everything from the "?" onward using a string utility on location.href (location.search will give you the query string). What you are left with is the URL without the query; then get everything after the last "/" (hint: lastIndexOf).
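For anyone who wants to check their attempt against something, the two steps above might look roughly like this. The function takes the URL as a plain string instead of reading location.href, and its name is made up for illustration.

```javascript
// Step 1: cut off the fragment and the query string.
// Step 2: keep what follows the last "/".
function fileNameFromUrl(url) {
  const withoutQuery = url.split("#")[0].split("?")[0];
  return withoutQuery.substring(withoutQuery.lastIndexOf("/") + 1);
}

console.log(fileNameFromUrl("http://www.xyz.com/a/test.jsp?a=b&c=d")); // "test.jsp"
console.log(fileNameFromUrl("http://www.xyz.com/a/test.jsp"));         // "test.jsp"
```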
Use a regular expression.
var urlVal = 'http://www.xyz.com/a/test.jsp?a=b&c=d';
var result = /a\/(.*)\?/.exec(urlVal)[1]
exec returns an array; use [1] to get the captured test.jsp
This method does not depend on pathname:
<script>
var url = 'http://www.xyz.com/a/test.jsp?a=b&c=d';
var file_with_parameters = url.substr(url.lastIndexOf('/') + 1);
var file = file_with_parameters.substr(0, file_with_parameters.lastIndexOf('?'));
// file now contains "test.jsp"
</script>
var your_link = "http://www.xyz.com/a/test.jsp?a=b&c=d";
// strip the query from the link
your_link = your_link.split("?");
your_link = your_link[0];
// get test.jsp or whatever file name is there
var the_part_you_want = your_link.substring(your_link.lastIndexOf("/")+1);
Try this:
/\/([^/]+)$/.exec(window.location.pathname)[1]
