Node.js how to split querystring value by index - javascript

I am trying to figure out how to split a querystring's value in Node.js. This is for a web proxy I am creating. I need to cut the value at the third '/', for example turning https://example.org/bahahhaah into https://example.org. It would also be nice to know how to cut at the last '/', for example turning https://example.org/bahhaa/s2.html into https://example.org/bahhaa. I want to save these two outputs into a cookie. I am not sure what code to use. If I am not being clear enough, please tell me.

What you are looking for is the built-in Node.js url module. It parses a URL into its components, so you can read the pathname and then .split it to find the parts you need:
const url = require('url');
const urlParts = url.parse('https://example.org/bahahhaah').pathname.split('/');
console.log(urlParts); // yields ['', 'bahahhaah']
urlParts is now an array of the path segments, split on each /. The first element is an empty string because the pathname starts with a /.
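As a hedged sketch of the two outputs the question asks for, the global URL class (available without a require in current Node.js versions) can be used instead of url.parse. The helper name originAndParent and the cookie-encoding step are illustrative assumptions, not part of the original answer:

```javascript
// Hypothetical helper: returns the origin and the URL with its last path segment removed.
function originAndParent(href) {
  const u = new URL(href);
  const origin = u.origin;                        // scheme + host (+ port)
  const lastSlash = u.pathname.lastIndexOf('/');  // pathname always starts with "/"
  const parent = origin + u.pathname.slice(0, lastSlash);
  return { origin, parent };
}

const result = originAndParent('https://example.org/bahhaa/s2.html');
console.log(result.origin); // https://example.org
console.log(result.parent); // https://example.org/bahhaa

// Assumption: the value is percent-encoded before it goes into a cookie,
// since raw cookie values should not contain characters such as ";" or spaces.
const cookieValue = encodeURIComponent(result.parent);
console.log(cookieValue); // https%3A%2F%2Fexample.org%2Fbahhaa
```

Setting the cookie itself depends on the server framework in use, so it is left out here.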

Related

How to pass a "?" as a routing parameter in express while using node js

I am creating a blog website for my college using Express and Node.js. This might be a silly question, but I need an answer.
As you can see from the following piece of code:
const express = require("express");
.....
.....
app.get("/blog/:title_of_blog", function(request, response){
var title_requested = request.params.title_of_blog;
console.log(title_requested);
})
It works fine in all the cases I tried, except when a user enters a string such as "what is your name?" as the routing parameter. Then it logs only "what is your name".
The question mark gets dropped, but further processing depends on the title being exactly the same.
Is there any way I could fix this?
If you need any additional information, please let me know.
Express treats ? as the start of the query string, so it never becomes part of the route parameter. If you want to put a blog title in the URL, you would usually convert the title to a slug using a library.
In JavaScript, PHP, and ASP there are functions that can be used to URL encode a string.
PHP has the rawurlencode() function, and ASP has the Server.URLEncode() function.
In JavaScript you can use the encodeURIComponent() function.
source
When you encode your blog title this way, the URL becomes:
/blog/what%20is%20your%20name%3F
and you will get exactly the result you want. :)
Certain characters such as ? and # have a special meaning in URLs (I recently discovered the same kind of problem when storing cookies), but there is a simple way to avoid these problems and get your string through intact:
var encodedString = encodeURIComponent(rawString);
This turns characters such as ? and # into percent-escapes that parse safely. (encodeURI would not help here, because it leaves ? and # untouched.) Whenever you need the actual string again, one line gets the original back:
var originalString = decodeURIComponent(encodedString);
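To make the difference between the two encoding functions concrete, here is a minimal sketch using the title from the question. The variable names are illustrative only:

```javascript
const title = 'what is your name?';

// encodeURI keeps "?" because it is a legal URL delimiter.
console.log(encodeURI(title)); // what%20is%20your%20name?

// encodeURIComponent escapes it, so it survives as part of one path segment.
const encoded = encodeURIComponent(title);
console.log('/blog/' + encoded); // /blog/what%20is%20your%20name%3F

// Decoding restores the original string.
console.log(decodeURIComponent(encoded) === title); // true
```

Express normally decodes route parameters before they reach the handler, so request.params.title_of_blog should then include the question mark; it is worth confirming this in your own app.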

How can I use regexes in Javascript to chop part of a url without specifying fixed length?

I'm writing a program to match paths from my filesystem with urls pulled from an SQL database, using Javascript. The URLs pulled are structured like this:
http://examplesite.com/wp-content/uploads/YYYY/MM/17818380_1556368674373219_6750790004844265472_n-1.jpg
http://examplesite.com/wp-content/uploads/YYYY/MM/17818380_1556368674373219_6750790004844265472_n.jpg
https://examplesite.com/wp-content/uploads/YYYY/MM/10643960_909727132375975_2074842458_n-44x55.jpg
http://examplesite.com/wp-content/uploads/YYYY/MM/10643960_909727132375975_2078842458_n-320x150.jpg
etc. Some have http, some https.
I tried to match the files with the urls with
if(files[i] === urlsfromdb[j].substring(50,urlsfromdb[j].length-4))...
I want to get everything after the / after ...MM, but above sometimes includes the leading slash, which in turns ruins the program. How can I accomplish this with regexes? I wanna get all the jpgs, and I'm using NPM glob to do so.
Additionally, with the files that have -WWWxHHH.jpg, which could be 2 or 3 Ws or Hs, I want to delete those files as well; the URLS from the DB will never actually have them but the files will.
Use a regular expression to remove everything up to the last slash.
urlsfromdb[j].replace(/^.*\//, '')
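If you just want the JPEG files you can use a capture group, given that all the URLs have MM/ before the image name. Note that the regex must not use the g flag, otherwise match() returns only the full matches and drops the capture groups. Something like this should work:
let regex = /.*MM\/([A-Za-z_0-9-]+\.jpg)/;
let match = urlsfromdb[j].match(regex);
let image = match ? match[1] : null;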
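The second part of the question (spotting the -WWWxHHH.jpg variants) is not covered by the answers above. A rough sketch, assuming the width and height are always two or three digits as the question describes; the function names are hypothetical:

```javascript
// Everything after the last "/" is the file name, whatever the scheme or folder depth.
function fileNameFromUrl(url) {
  return url.slice(url.lastIndexOf('/') + 1);
}

// Matches names ending in -WWxHH.jpg up to -WWWxHHH.jpg (2 or 3 digits each).
const resizedSuffix = /-\d{2,3}x\d{2,3}\.jpg$/;

function isResizedVariant(name) {
  return resizedSuffix.test(name);
}

const sampleUrl = 'https://examplesite.com/wp-content/uploads/YYYY/MM/10643960_909727132375975_2074842458_n-44x55.jpg';
const sampleName = fileNameFromUrl(sampleUrl);
console.log(sampleName);                   // 10643960_909727132375975_2074842458_n-44x55.jpg
console.log(isResizedVariant(sampleName)); // true
```

A name such as 17818380_1556368674373219_6750790004844265472_n-1.jpg is not flagged, because -1 has no x-separated dimensions.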

Regex to get the domain from URL, and thing

There are lots of posts online like this, but none of them seem to do what I'm trying to do.
Let's say I have a domain in a string:
Extract hostname name from string
And I want to extract the domain name and nothing else (not the protocol, the subdomain or the file extension).
so for
https://stackoverflow.com/questions/8498592/extract-root-domain-name-from-string
I want to get:
stackoverflow.com
Is there any way to do this?
Try this on:
var url = 'http://stackoverflow.com/questions/8498592/extract-root-domain-name-from-string';
var domain = url.match(/^https?:\/\/([^\/?#]+)/)[1];
alert(domain);
This looks for a string that starts with http, optionally followed by s, then ://, and then captures everything up to the next /, ? or #. .match() returns an array here:
['http://stackoverflow.com', 'stackoverflow.com']
So, we use [1] to get the submatch.
You can use a simple regex like this (note that it only matches when a / follows the host):
\/\/(.*?)\/
Here you have a working example:
http://regex101.com/r/iP0uX7/1
Hope to help
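For comparison, here is a sketch that avoids hand-written URL regexes by using the built-in URL class (available in browsers and Node.js). The helper name is hypothetical, and stripping only a leading www. is a simplifying assumption; removing arbitrary subdomains correctly would need a public-suffix list, which is out of scope here:

```javascript
// Hypothetical helper: hostname without protocol, path, port, or a leading "www.".
function hostnameOf(href) {
  const host = new URL(href).hostname;
  return host.replace(/^www\./, '');
}

console.log(hostnameOf('https://stackoverflow.com/questions/8498592/extract-root-domain-name-from-string'));
// stackoverflow.com
```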

Javascript url validation allowing relative and absolute urls

I'm trying to validate a field to allow relative and absolute urls. I'm using the regex from this post but it is allowing spaces in the url.
var urlRegex = new RegExp(/(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)+([\w.,#?^=%&:\/~+#-]*[\w#?^=%&\/~+#-])?)/gi);
Example:
// this should work
this/will/work.aspx?say=hello
http://www.example.com/this/will/work.aspx?say=hello
// this shouldn't work but does
and/this will also work/even though it shouldn't
and/this-shouldn't/but it does/also
The code below is what I was originally using to validate just absolute urls and it was working perfectly. If I remember properly, I pulled it from the jquery source. If this could be modified to also accept relative urls that would be perfect, but this is out of my league.
var urlRegex = new RegExp(/^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/i);
I think you just need to anchor the pattern so that it has to match the whole string:
var urlRegex = /^(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)+([\w.,#?^=%&:\/~+#-]*[\w#?^=%&\/~+#-])?)$/gi;
The leading ^ and trailing $ mean that the pattern has to match the entire string instead of just some part of it.
edit: that said, the pattern has other problems. First, those HTML entities for & (&amp;) need to be just "&". The slashes don't need to be escaped in [] groups, and we don't need the "g" suffix. That leaves us with:
var urlRegex = /^(?:(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)*([\w.,#?^=%&:/~+#-]*[\w#?^=%&/~+#-])?))$/i;
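edit again: oops, the whole alternation also needs to be wrapped in a group (the (?: ... ) above) so that ^ and $ apply to both alternatives, not just the first and last.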
I wrote an article about URI validation complete with code snippets for all the various URI components as defined by RFC3986 here:
Regular Expression URI Validation
You may find what you are looking for there. Note however that almost any string represents a valid URI - even an empty string!
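To check the final anchored pattern against the examples from the question, a small script like this can help. The pattern is copied from the answer above; the samples array is just the question's four examples, and the expected results are what the question asks for:

```javascript
const urlRegex = /^(?:(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)*([\w.,#?^=%&:/~+#-]*[\w#?^=%&/~+#-])?))$/i;

const samples = [
  'this/will/work.aspx?say=hello',                        // should pass
  'http://www.example.com/this/will/work.aspx?say=hello', // should pass
  "and/this will also work/even though it shouldn't",     // should fail (spaces)
  "and/this-shouldn't/but it does/also",                  // should fail (spaces, apostrophe)
];

for (const sample of samples) {
  console.log(urlRegex.test(sample), sample);
}
```

Because the pattern has no g flag, repeated calls to test() do not carry state between samples.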

please extract a bit of info from this string (without regex so that i can understand it)

On my web app, I take a look at the current URL, and if the current URL is a form like this:
http://www.domain.com:11000/invite/abcde16989/root/index.html
-> All I need is to extract the ID which consists of 5 letters and 5 numbers (abcde16989) in another variable for further use.
So I need this:
var current_url = "the whole path, not just the hostname";
if (current_url has ID)
var ID = abcde16989;
You could always use split using / as the delimiter if the ID is always going to be in the same position, eg
var parts = current_url.split('/');
var id = parts[4];
Though your requirement of matching "5 letters and 5 numbers" really does suit a regex match.
var match = current_url.match(/[a-zA-Z]{5}[0-9]{5}/); // match array, or null if not found
var id = match ? match[0] : null;
I'm assuming you don't need the full URL, but just the pathname to get your ID. Use the following:
var current_url = window.location.pathname; //gets the pathname
var split_url = current_url.split('/'); //splits the path at each /
var current_id = split_url[2]; //index 0 is an empty string (the path starts with /), 1 is "invite", 2 is your id, 3 would be "root"
alert(current_id);
Firstly, this doesn't need JQuery; this is simple Javascript. I'll amend your tags after I've replied to reflect this.
A regex would actually be quite an easy way to achieve this, and I don't think a simple one like this would be as difficult to understand as you think.
So I'll answer with the regex option anyway and then move on to other options:
var url = "http://www.domain.com:11000/invite/abcde16989/root/index.html";
//first method:
var id = url.match('^http://www.domain.com:11000/invite/(.+)/root/index.html$')[1];
//second method: (if you don't know exact format of the rest of the URL but you do know the format of the ID string)
var id = url.match('/([a-z]{5}[0-9]{5})/')[1];
The first method will get the string in the position you specified within the URL. It won't check the formatting; it just looks at the rest of the URL and grabs the bit of it you're asking for. This should be really easy to understand: It's basically just your URL, but with (.+) where your ID goes.
The second method looks specifically for a string in the format you asked for -- ie five letters and then five numbers. This is admittedly a bit harder to read, but should be fairly self explanatory if you look at it given those criteria.
In both cases, the regex itself will return an array of results, with array element zero being the whole string (ie in the first case, including the rest of the URL). This is where the (brackets) come in (ie the bit where we said (.+)). This tells the match function to put the contents of the brackets into another array element so we can use it. In both cases, this means that we can read the ID in array element [1].
Okay, so how about the non-regex options:
The simplest one is split() with a plain string delimiter, as a couple of other people have already shown. A string separator is matched literally, so no regex is involved; split() only uses a regex if you pass it one.
I'm going to guess that one of these answers will be good enough for you (either mine or, more likely, one of the answers using split()). However, if you don't want to depend on the ID's position among the path segments, you're going to have to do some slightly more manual string manipulation, probably using substring() (though there are other ways to do it).
Something along the lines of this:
var prefixstring="http://www.domain.com:11000/invite/";
var prefixlen=prefixstring.length;
var idlen=10;
var id = url.substring(prefixlen,idlen+prefixlen);
This gets the length of the portion of the URL in front of the ID, and then uses substring() to snip out the required bit. But I'm sure you'll agree that the regex options are simpler? ;-)
Hope that helps. (and I hope it helps you feel less afraid of regex!)
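For readers who want both the format check ("5 letters and 5 numbers") and no regex at all, here is one possible sketch that combines split() with plain character comparisons. All function names are made up for illustration, and it assumes the letters are unaccented a-z in either case:

```javascript
function isLetter(ch) {
  const c = ch.toLowerCase();
  return c >= 'a' && c <= 'z';
}

function isDigit(ch) {
  return ch >= '0' && ch <= '9';
}

// True when the segment is exactly 5 letters followed by 5 digits.
function looksLikeId(segment) {
  if (segment.length !== 10) return false;
  for (let i = 0; i < 10; i++) {
    const ok = i < 5 ? isLetter(segment[i]) : isDigit(segment[i]);
    if (!ok) return false;
  }
  return true;
}

// Returns the first path segment that looks like an ID, or null if there is none.
function findId(currentUrl) {
  return currentUrl.split('/').find(looksLikeId) || null;
}

console.log(findId('http://www.domain.com:11000/invite/abcde16989/root/index.html')); // abcde16989
```

Unlike the fixed-index split() answers, this does not care where in the path the ID appears.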
