Splitting Web Page URL but Not URL After the Hash in JavaScript - javascript

So I tried getting the text after my URL like this "example.com/#!/THISPART" with the code below:
var webSplit = window.location.hash.split('/')[1];
It works, but if I supply another URL after it, like "example.com/#!/https://example.com/", the output is "https", which I don't want. I want the whole URL. Can someone help me out? Thanks!

window.location.hash.split('!/')[1]; // Outputs "https://example.com/"

Meet REGEX.
var part = location.hash.replace(/#!\//, '');
That'll give you whatever's after #!/.

The actual problem is that your approach for the split is incorrect.
With your example, window.location.hash returns #!/https://example.com/. What you're doing is splitting that string by / and using the second element of the result.
The result of the split is 5 elements ([ "#!", "https:", "", "example.com", "" ]), and the second element is just https:. To fix it, you need to either find another delimiter or find another way to extract the URL.
With your current example you could use !/ for the split to get two elements back ([ "#", "https://example.com/" ]), where the second one is the whole URL, as long as the URL itself doesn't contain the string !/. Another approach would be to find the first occurrence of http from the left and take a substring from that position until the end of the string.
Other answers show some different solutions.
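The idea of cutting once at the marker, instead of splitting on every /, can be sketched without split at all. This is a minimal sketch, not the answerer's code: the helper name afterHashBang is hypothetical, and it assumes the marker is always the literal string #!/.

```javascript
// Hypothetical helper: returns everything after the first "#!/" in a hash
// string, or null when the marker is missing.
function afterHashBang(hash) {
  var marker = '#!/';
  var start = hash.indexOf(marker);
  if (start === -1) {
    return null;
  }
  // slice from the end of the marker, so later "/" or "!/" are left intact
  return hash.slice(start + marker.length);
}

console.log(afterHashBang('#!/THISPART'));             // "THISPART"
console.log(afterHashBang('#!/https://example.com/')); // "https://example.com/"
```

In a browser you would call it as afterHashBang(window.location.hash).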

var webSplit = window.location.href.split('/#!/')[1];
The output will be https://example.com/, as you want.

Related

Javascript | Link/Bookmarklet to remove variables in url

I found this and was able to do what I initially wanted.
Javascript | Link/Bookmarklet to replace current window location
javascript:(function(){var loc=location.href;loc=loc.replace('gp/product','dp'); location.replace(loc)})()
Which was to change an amazon url from the product link to the dude perfect link.
It turns this url: https://www.amazon.com/gp/product/B01NBKTPTS/
into this url: https://www.amazon.com/dp/B01NBKTPTS/
I would like to take this a step further. Is there a way to do the above switch and then also remove the string of variables after the ? essentially cleaning up
https://www.amazon.com/gp/product/B01NBKTPTS/?pf_rd_r=DQV2YXJP8FFKM1Q50KS9&pf_rd_p=eb347dce-a775-4231-8920-ae66bdd987f4&pf_rd_m=ATVPDKIKX0DER&pf_rd_t=Landing&pf_rd_i=16310101&pf_rd_s=merchandised-search-2&linkCode=ilv&tag=onamzbybcreat-20&ascsubtag=At_Home_Cooking_210426210002&pd_rd_i=B01NBKTPTS
to
https://www.amazon.com/dp/B01NBKTPTS/
Thanks!
You've almost done it yourself!
To do the second part you can call split on your URL string, using /? as the separator.
In our case that gives you an array with two elements: the first holds everything BEFORE the /? (index [0], which is the part we want), and the second holds everything AFTER it (index [1], not needed here).
FYI: if the URL contained more than one /?, split would produce an array with more elements.
Note that / does not need escaping inside a string separator: '\/?' and '/?' are the same string. Escaping it as \/ is only required inside a regex literal.
So here is the final working bookmarklet code, which takes the part of the URL before /? and replaces gp/product with dp:
javascript:(function(){
var loc=location.href;
loc=loc
.split('/?')[0]
.replace('gp/product','dp')
+'/';
location.replace(loc);
})();
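As an alternative to splitting on /?, the same cleanup can be sketched with the standard URL API, which also copes with URLs that have no slash before the ?. This is only a sketch: the function name cleanAmazonUrl is hypothetical, and it assumes the path contains /gp/product/ exactly as in the example.

```javascript
// Hypothetical helper: rewrites /gp/product/ to /dp/ and drops the query
// string and fragment.
function cleanAmazonUrl(href) {
  var url = new URL(href);
  url.pathname = url.pathname.replace('/gp/product/', '/dp/');
  url.search = ''; // removes everything after "?"
  url.hash = '';   // removes any "#..." fragment
  return url.href;
}

console.log(cleanAmazonUrl(
  'https://www.amazon.com/gp/product/B01NBKTPTS/?pf_rd_t=Landing&tag=onamzbybcreat-20'
)); // "https://www.amazon.com/dp/B01NBKTPTS/"
```

Inside a bookmarklet the call would be location.replace(cleanAmazonUrl(location.href)).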

reformat characters in json data

I am retrieving data from reddit json. and some data is like that:
The actual resolution of this image is 3067x2276, not 4381x3251. See [this](https://www.reddit.com/r/EarthPorn/wiki/index#wiki_resolution.3F_what_is_that_and_how_can_i_find_it.3F) page for information on how to find out what the resolution of an image is.
I want to insert the data into <p></p> on my page, but the link stays exactly as it is above (not clickable).
Notice that when I post it on stackoverflow, it is very nicely reformatted into a clickable link. How do I do that?
reformatted by stackoverflow:
The actual resolution of this image is 3067x2276, not 4381x3251. See this page for information on how to find out what the resolution of an image is.
How do i achieve that?
I feel like I cheated, but inspecting the OP in my browser, I get...
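<p>The actual resolution of this image is 3067x2276, not 4381x3251. See <a href="https://www.reddit.com/r/EarthPorn/wiki/index#wiki_resolution.3F_what_is_that_and_how_can_i_find_it.3F">this</a> page for information on how to find out what the resolution of an image is.</p>
In other words, if you find [words](URL), replace it with:
<a href="URL">words</a>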
This little regex tries to capture the contents of [] followed by (). Checking for http may be insufficient depending on the sort of links you expect...
let regex = /\[(.*?)\]\(([^\)]+)\)/g;
let matches = regex.exec(line);
// exec returns null when nothing matches; otherwise matches[1] holds the words
// and matches[2] a potential url
if (matches !== null && matches[2].startsWith("http")) {
// matches[2] is probably a url (http or https), so...
let replace = `<a href="${matches[2]}">${matches[1]}</a>`
// ...
}
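The [words](URL) replacement can also be done for every link in the text in one pass with String.prototype.replace. The following is a minimal sketch under some assumptions: the names escapeHtml and linkify are hypothetical, only http and https links are converted, and the text is HTML-escaped first because it comes from an outside source.

```javascript
// Hypothetical helper: escapes the characters that are special in HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: turns every [label](http...) into an anchor tag.
function linkify(text) {
  var pattern = /\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)/g;
  return escapeHtml(text).replace(pattern, function (whole, label, url) {
    return '<a href="' + url + '">' + label + '</a>';
  });
}

console.log(linkify('See [this](https://example.com/wiki#res) page.'));
// 'See <a href="https://example.com/wiki#res">this</a> page.'
```

The result can then be assigned to the innerHTML of the target <p> element.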
Start with regular expressions, which are basically wildcards on steroids.
/\[.*\]\(.*\)/ looks weird, but it will find [*](*), where * can be a string of any length. On its own, all this does is find the first place the pattern appears. I tried looking further, but I'm not the best with JS.
https://www.w3schools.com/js/js_regexp.asp

How to remove URL from a string completely in Javascript?

I have a string that may contain several url links (http or https). I need a script that would remove all those URLs from the string completely and return that same string without them.
I tried so far:
var url = "and I said http://fdsadfs.com/dasfsdadf/afsdasf.html";
var protomatch = /(https?|ftp):\/\//; // NB: not '.*'
var b = url.replace(protomatch, '');
console.log(b);
but this only removes the http part and keeps the link.
How to write the right regex that it would remove everything that follows http and also detect several links in the string?
Thank you so much!
You can use this regex:
var b = url.replace(/(?:https?|ftp):\/\/[\n\S]+/g, '');
//=> and I said
This regex matches and removes any URL that starts with http:// or https:// or ftp:// and matches up to next space character OR end of input. [\n\S]+ will match across multi lines as well.
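Wrapped in a function, the same replacement can be tried on strings with several links. This is a sketch only: the name stripUrls is hypothetical, \S+ is used instead of [\n\S]+ so a URL ends at any whitespace, and the leftover double spaces are collapsed, which the regex above does not do.

```javascript
// Hypothetical helper: removes every http/https/ftp URL from a string and
// tidies up the whitespace left behind.
function stripUrls(text) {
  return text
    .replace(/(?:https?|ftp):\/\/\S+/g, '') // drop each URL up to the next whitespace
    .replace(/[ \t]{2,}/g, ' ')             // collapse the gaps the URLs leave
    .trim();
}

console.log(stripUrls('and I said http://fdsadfs.com/dasfsdadf/afsdasf.html'));
// "and I said"
console.log(stripUrls('see https://a.example/x and ftp://b.example/y too'));
// "see and too"
```

Note that punctuation directly attached to a URL, such as a trailing full stop, is removed along with it.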
Did you search for a URL parser regex? This question has a few comprehensive answers: Getting parts of a URL (Regex)
That said, if you want something much simpler (and maybe not as perfect), you should remember to capture the entire url string and not just the protocol.
Something like
/(https?|ftp):\/\/[.a-zA-Z0-9\/\-]+/
should work better. Notice that the added half parses the rest of the URL after the protocol.

Javascript Regular Expression for non-image url

In JavaScript, I want to extract a non-image url from a string e.g.
http://example.com
http://example.com/a.png
http://www.example.ccom/acd.php
http://www.example.com/b.jpg etc.
I would like to extract 1st and 3rd (non-image) URLs and ignore 2nd and 4th (image) URLs.
I tried the following which did not work
(https?:)?\/\/?[^\'"<>]+?^(\.(jpe?g|gif|png))
This is a modification of the following image URL regular expression (RE), to which I added ^() (intended as "not") in the snippet above:
(https?:)?//?[^\'"<>]+?\.(jpg|jpeg|gif|png)
Note: The RE in the above examples is case-sensitive; any clue on making it case-insensitive would also help.
You can use a negative lookahead like in these examples. It will exclude anything containing the given string.
Assuming your URLs are newline-delimited like in your example, something like this should work:
(?!.*(jpg|jpeg|gif|png).*).*
EDIT: it looks like my example doesn't work; hopefully it at least points you in the right direction.
First remove the image URLs:
var tmp = text.replace(/https?:\/\/[\S]+\.(png|jpeg|jpg|gif)/gi, '');
and then match what is left:
var m = tmp.match(/https?:\/\/[\S]+/gi);
console.log(m);
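The same result can be reached by matching every URL first and then filtering out the image ones, which keeps the two concerns apart. This is a sketch, not the answerer's code: the name nonImageUrls is hypothetical, and it assumes URLs are separated by whitespace and that an image URL is one whose path ends in .png, .jpg, .jpeg or .gif (optionally followed by a query string or fragment).

```javascript
// Hypothetical helper: returns all http/https URLs in the text that do not
// point at an image file.
function nonImageUrls(text) {
  var urls = text.match(/https?:\/\/\S+/gi) || []; // match returns null when nothing is found
  return urls.filter(function (u) {
    return !/\.(?:png|jpe?g|gif)(?:[?#].*)?$/i.test(u);
  });
}

var sample = [
  'http://example.com',
  'http://example.com/a.png',
  'http://www.example.com/acd.php',
  'http://www.example.com/b.jpg'
].join('\n');

console.log(nonImageUrls(sample));
// [ "http://example.com", "http://www.example.com/acd.php" ]
```

The i flag on both expressions makes the check case-insensitive, so .PNG and .Jpg are treated as images too.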

please extract a bit of info from this string (without regex so that i can understand it)

On my web app, I take a look at the current URL, and if the current URL is a form like this:
http://www.domain.com:11000/invite/abcde16989/root/index.html
-> All I need is to extract the ID which consists of 5 letters and 5 numbers (abcde16989) in another variable for further use.
So I need this:
var current_url = "the whole path, not just the hostname";
if (current_url has ID)
var ID = abcde16989;
You could always use split using / as the delimiter if the ID is always going to be in the same position, eg
var parts = current_url.split('/');
var id = parts[4];
Though your requirement of matching "5 letters and 5 numbers" really does suit a regex match.
var match = current_url.match(/[a-zA-Z]{5}[0-9]{5}/); // an array of results, or null if not found
var id = match ? match[0] : null; // match[0] is the matched ID string
I'm assuming you don't need the full URL, but just the pathname to get your ID. Use the following:
var current_url = window.location.pathname; //gets the pathname
var split_url = current_url.split('/'); //splits the path at each /
var current_id = split_url[2]; //index 0 is "" (the path starts with /), index 1 is "invite", index 2 is your id, index 3 would be "root"
alert(current_id);
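Since the question asks for a version without regex, the split approach can be combined with a plain character check for the "5 letters and 5 numbers" rule. This is a minimal sketch: the name extractInviteId is hypothetical, and it assumes the path always has the form /invite/ID/... as in the example.

```javascript
// Hypothetical helper: returns the ID from a path such as
// "/invite/abcde16989/root/index.html", or null if the path has no valid ID.
function extractInviteId(pathname) {
  var parts = pathname.split('/'); // ["", "invite", "abcde16989", "root", "index.html"]
  if (parts.length < 3 || parts[1] !== 'invite') {
    return null;
  }
  var candidate = parts[2];
  if (candidate.length !== 10) {
    return null;
  }
  for (var i = 0; i < 10; i++) {
    var ch = candidate.charAt(i);
    var isLetter = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    var isDigit = ch >= '0' && ch <= '9';
    // the first five characters must be letters, the last five digits
    if (i < 5 ? !isLetter : !isDigit) {
      return null;
    }
  }
  return candidate;
}

console.log(extractInviteId('/invite/abcde16989/root/index.html')); // "abcde16989"
console.log(extractInviteId('/about/index.html'));                  // null
```

In a browser the argument would be window.location.pathname.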
Firstly, this doesn't need JQuery; this is simple Javascript. I'll amend your tags after I've replied to reflect this.
A regex would actually be quite an easy way to achieve this, and I don't think a simple one like this would be as difficult to understand as you think.
So I'll answer with the regex option anyway and then move on to other options:
var url = "http://www.domain.com:11000/invite/abcde16989/root/index.html";
//first method:
var id = url.match('^http://www.domain.com:11000/invite/(.+)/root/index.html$')[1];
//second method: (if you don't know exact format of the rest of the URL but you do know the format of the ID string)
var id = url.match('/([a-z]{5}[0-9]{5})/')[1];
The first method will get the string in the position you specified within the URL. It won't check the formatting; it just looks at the rest of the URL and grabs the bit of it you're asking for. This should be really easy to understand: It's basically just your URL, but with (.+) where your ID goes.
The second method looks specifically for a string in the format you asked for -- ie five letters and then five numbers. This is admittedly a bit harder to read, but should be fairly self explanatory if you look at it given those criteria.
In both cases, the regex itself will return an array of results, with array element zero being the whole string (ie in the first case, including the rest of the URL). This is where the (brackets) come in (ie the bit where we said (.+)). This tells the match function to put the contents of the brackets into another array element so we can use it. In both cases, this means that we can read the ID in array element [1].
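To make the role of the array elements concrete, here is a small sketch that prints element zero and element one separately and also handles the case where nothing matches. The pattern is a variation written for this illustration (a regex literal that also requires /invite/ in front of the ID), not one of the two methods above.

```javascript
var url = 'http://www.domain.com:11000/invite/abcde16989/root/index.html';

// match returns an array on success and null on failure
var result = url.match(/\/invite\/([a-z]{5}[0-9]{5})\//);
if (result !== null) {
  console.log(result[0]); // "/invite/abcde16989/" (the whole matched text)
  console.log(result[1]); // "abcde16989" (only the part inside the brackets)
}

// with no match there is no array to index, so check for null before using [1]
var noMatch = 'http://www.domain.com:11000/about.html'.match(/\/invite\/([a-z]{5}[0-9]{5})\//);
console.log(noMatch); // null
```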
Okay, so how about the non-regex options:
In fact, it's going to be quite hard to do it in a simple way without regex in Javascript, since even the simple string splitting function uses a regex match to do the split (granted it would be a very simple one, it is still a regex). A couple of other people have already given you answers using this, but it is still a regex, so technically they've also not answered your question accurately.
I'm going to guess that actually one of these answers will be good enough for you (either mine or more likely one of the answers using split()), despite there still being a regex element. However if you really don't want anything to do with regex, you're going to have to start doing some slightly more complex string manipulation, probably using substring() (though there are other ways to do it).
Something along the lines of this:
var prefixstring="http://www.domain.com:11000/invite/";
var prefixlen=prefixstring.length;
var idlen=10;
var id = url.substring(prefixlen,idlen+prefixlen);
This gets the length of the portion of the URL in front of the ID, and then uses substring() to snip out the required bit. But I'm sure you'll agree that the regex options are simpler? ;-)
Hope that helps. (and I hope it helps you feel less afraid of regex!)
