How can I make a URL shortener API in Express.js? - javascript

I'm building a link shortener. The shortener itself works very well, but now I'm trying to build an API for it. The problem is that passing the URL as an argument in the GET URL doesn't work. I've tried a lot of things without success. A request like http://localhost:3500/api/create/google.com works, but http://localhost:3500/api/create/https://google.com fails because of the https://. These are the last 3 inputs via my API that failed: http://https:google.com, google.com, http://
I'm using express and mongoose. Here's my code:
app.get('/api/create/:shortUrl(*)', async (req, res, next) => {
if (req.params.shortUrl.includes("https://") || req.params.shortUrl.includes("http://") || req.params.shortUrl.includes("/")) {
req.params.shortUrl = req.params.shortUrl.replace("https://", "").replace("http://", "").replace("/", "")
}
if (req.params.shortUrl == "") {
res.send("invalid URL")
}
await shorturl.create({full: `http://${req.params.shortUrl}`})
const shortUrls = await shorturl.find().sort({_id: -1}).limit(-1)
const latest = shortUrls[0]
res.send("https://p33t.link/" + latest.short)
});

You have to properly encode portions of the URL that contain restricted characters such as : and // that aren't part of the actual protocol to make it a legal URL. So, the idea is that you encode the "parameter" before appending it to the URL. Presumably, you would use encodeURIComponent(), depending upon exactly where you're placing it in the URL.
After parsing the core part of the URL, a web server will decode the remaining components of the URL and give you back the original characters. I would suggest that your particular use would probably work better as a query parameter rather than a part of the path which would give you this when properly encoded:
http://localhost:3500/api/create?u=https%3A%2F%2Fgoogle.com
And, you could then use:
app.get('/api/create', (req, res) => {
console.log(req.query.u);
...
});
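A minimal sketch of the encoding step described above, using only built-ins that exist in both Node and browsers. The helper names buildCreateUrl and readTarget are hypothetical; the endpoint and the u parameter are the ones from the answer. readTarget stands in for the server: it uses URL's searchParams to mimic the decoding Express does when it fills in req.query.u.

```javascript
// Build the API request URL, encoding the target so that ':' and '/'
// inside it are not mistaken for parts of the request URL itself.
function buildCreateUrl(apiBase, target) {
  return `${apiBase}/api/create?u=${encodeURIComponent(target)}`;
}

// Stand-in for what the server sees: the query parser decodes the
// percent-escapes and hands back the original string.
function readTarget(requestUrl) {
  return new URL(requestUrl).searchParams.get('u');
}

const requestUrl = buildCreateUrl('http://localhost:3500', 'https://google.com');
console.log(requestUrl);             // http://localhost:3500/api/create?u=https%3A%2F%2Fgoogle.com
console.log(readTarget(requestUrl)); // https://google.com
```

Because the target travels as a single encoded value, it can contain its own path, query string, or fragment without confusing the router.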

Related

When I try to redirect to the decoded URL, it redirects me to myurl.com/myurl instead of taking me to the page

I'm making a link shortener. I previously had a problem with the URLs, which I fixed by encoding the URL before storing it in the database and decoding it again before redirecting. The problem now is that instead of redirecting me to, say, https://google.com, it redirects me to mypage.com/google.com. I made a "debug" page that just decodes the URL, and the URL is fine there, with https:// and everything. The biggest problem is that it all works on localhost, but when I deploy it on my VPS it doesn't. Only the debug page that decodes the URL works. I'm using express.js and mongoose. Here's my code for redirecting users:
app.get('/:shortUrl', async (req, res) => {
const shortUrl = await shorturl.findOne({ short: req.params.shortUrl })
if (shortUrl == null) {
res.send('URL not found!')
} else {
shortUrl.clicks++
shortUrl.save()
res.redirect(decodeURIComponent(shortUrl.full))
}
})
You can use the built-in URL object to make sure the redirect URL is absolute and valid:
res.redirect(new URL(decodeURIComponent(shortUrl.full)).toString())
If it is unable to make a valid URL from the input, it will throw, so it's better to wrap it in a try/catch.
try {
  res.redirect(new URL(decodeURIComponent(shortUrl.full)).toString());
} catch (e) {
  res.send('Invalid URL');
}
If you type a URL without a scheme into a browser's address bar, the browser assumes the scheme is HTTP. In an HTTP redirect, however, a value such as google.com has no scheme and is treated as a relative path that happens to contain a dot. A redirect to a relative path stays on the same domain, which explains why you end up at mypage.com/google.com.
Try normalizing the URL, or validate that the full URL includes the scheme before storing or redirecting to it.
https://github.com/sindresorhus/normalize-url
https://developer.mozilla.org/en-US/docs/Learn/Common_questions/What_is_a_URL
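As a rough sketch of the "normalize or validate" advice, the function below prepends http:// when the stored value has no scheme and then lets the built-in URL parser decide whether the result is usable. The name toAbsoluteUrl and the choice to default to HTTP and to allow only http/https are assumptions made for illustration, not part of the original answers.

```javascript
// Return an absolute http(s) URL string, or null if the input cannot be
// turned into one. Inputs without a scheme are assumed to be HTTP.
function toAbsoluteUrl(raw) {
  const trimmed = String(raw).trim();
  if (trimmed === '') return null;

  // A scheme is a letter followed by letters/digits/+/-/. and then "://".
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const candidate = hasScheme ? trimmed : `http://${trimmed}`;

  try {
    const url = new URL(candidate); // throws on unparseable input
    // Only allow web links; rejects e.g. ftp:// targets.
    if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;
    return url.toString();
  } catch {
    return null;
  }
}

console.log(toAbsoluteUrl('google.com'));         // http://google.com/
console.log(toAbsoluteUrl('https://google.com')); // https://google.com/
console.log(toAbsoluteUrl('http://'));            // null
```

In the redirect handler, the result could be passed to res.redirect when it is a string, with an error response sent when it is null.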

How to deal with Path Traversal?

I'm trying to understand how to deal with Path Traversal in a secure way.
For example, an application receives a file name from a client via a REST API in JSON, looks for it in a directory that is not accessible from outside, and returns a response with the file:
app.get('/', (req, res) => {
  const fileName = req.body.fileName;
  // some code...
  fs.stat(`./nonAccessibleDir/${fileName}`, async function(err, stat) {
    // some code...
  });
  // some code...
});
The problem with the above approach is that a client can send something like "../" in the fileName and the code will "eat" it without complaint. How should I deal with this kind of scenario, and what do I need to change to fix the vulnerability?
Update:
Sorry, I forgot to mention that I know I should check the input I receive, but what if I need to allow "/" and "." in the input? Also, if I don't need these characters, is rejecting them all I need to do to remove the Path Traversal vulnerability?
An easy way would be to validate the fileName through a regex that detects any ../ segments and returns an error if any are present.
if (fileName.match(/\.\.\//g) !== null) {
// return an api error
}
You could have quite a tight validation rule that prevents any forward slashes in fileName at all, making it only possible to point to a file directly in your desired directory.
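If slashes and dots do need to be allowed, an alternative to pattern matching is to resolve the requested path and check that the result is still inside the base directory. The sketch below uses only Node's path module; the function name resolveInside and the example directory are hypothetical, and it does not address symbolic links inside the directory, which would need a separate check.

```javascript
const path = require('node:path');

// Resolve fileName against baseDir and return the absolute path only if it
// stays inside baseDir; return null for anything that escapes it.
function resolveInside(baseDir, fileName) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, fileName);
  const rel = path.relative(base, target);

  const escapes =
    rel === '' ||                      // points at the directory itself
    rel === '..' ||                    // parent directory
    rel.startsWith('..' + path.sep) || // anything above the base
    path.isAbsolute(rel);              // different drive on Windows
  return escapes ? null : target;
}

console.log(resolveInside('./nonAccessibleDir', 'reports/2020.txt') !== null); // true
console.log(resolveInside('./nonAccessibleDir', '../secret.txt'));             // null
console.log(resolveInside('./nonAccessibleDir', 'a/../../secret.txt'));        // null
```

The returned path, rather than the raw fileName, is what would then be handed to fs.stat.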

How to pass a URI containing a hash as a route parameter to express?

I have the following route in my express application :
app.get('/api/:URI', (req, res) => {
  doStuff();
});
The URI parameter passed is a URI encoded on the client side with encodeURIComponent().
It works fine except when the URI contains a hash.
Example: http://foo.bar/foobar/bla#blabla-313fe4ce-4f8d-48b7-b0f3-a59844402ee8
In this case the route is ignored.
On the browser side I receive a code 301, then the result of the next valid route.
If I remove the hash or, weirder, if I disable the cache on the browser side it works perfectly.
Is there any way express can ignore the hash ?
Edit : It's absolutely not a Can I use an at symbol (#) inside URLs? duplicate, the question is more about express routing and/or about browsers cache issues than about allowed characters in an URL.
Is there any way express can ignore the hash ?
I tried using the OR operator. For example,
app.get('/blog' || '/blog#top', (request, response) => {
...
});
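This responds whether or not #top is present, but not because of the OR operator. The expression '/blog' || '/blog#top' evaluates to '/blog' before Express ever sees it, so only one route is registered. It still matches because browsers keep the fragment (#top) on the client and do not send it to the server.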
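To see what happens to the hash in each case, the built-in URL parser can be used as a stand-in for how a client splits a request URL. This is only an illustration: the localhost:3000 origin is assumed, and the example URI is a shortened version of the one in the question.

```javascript
const raw = 'http://foo.bar/foobar/bla#blabla-313fe4ce';

// Unencoded: everything after '#' becomes the fragment of the request URL,
// which HTTP clients keep to themselves instead of sending to the server.
const unencoded = new URL('http://localhost:3000/api/' + raw);
console.log(unencoded.pathname); // /api/http://foo.bar/foobar/bla
console.log(unencoded.hash);     // #blabla-313fe4ce

// Encoded: '#' becomes %23 and stays inside the path.
const encoded = new URL('http://localhost:3000/api/' + encodeURIComponent(raw));
console.log(encoded.hash === ''); // true

// Decoding the last path segment gives back the original URI, hash included.
const param = decodeURIComponent(encoded.pathname.slice('/api/'.length));
console.log(param === raw); // true
```

So if the route stops matching only when a hash is present, it is worth checking that the value really reaches the request already encoded, with %23 in place of #.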

decodeUri not working with res.json express

I have an express app that saves a sanitized URL to a MongoDB database, and I want to return the decoded URL in a res.json using decodeURI(), but it doesn't work as expected and only gives the encoded version back. If I do a res.send(decodeURI(url)) it works. How can I get res.json to send the decoded URL?
// Create a url object with escaped and trimmed data.
var Url = new UrlModel(
{ url: req.body.url }
);
if (!errors.isEmpty()) {
// There are errors. Render the form again with error messages.
res.render('index', { errors: errors.array()});
return;
}
else {
// Data from form is valid.
// Check if Url with same name already exists.
UrlModel.findOne({ 'url': req.body.url })
.exec( function(err, found_url) {
if (err) { return next(err); }
if (found_url) {
// Url exists, redirect to its detail page.
res.json({"original_url": decodeURI(found_url.url) });
//res.send(decodeURI(found_url.url))
}
Update:
Probably wasn't clear in my question. My input comes from MongoDB, where the URL is stored sanitized in the form
https:&#x2F;&#x2F;www.facebook.com
so it's HTML entities that I want to convert, and I don't think decodeURI does that.
My output from this code
res.json({original_url:found_url.url, decoded: decodeURI(found_url.url) });
is {"original_url":"https:&#x2F;&#x2F;www.facebook.com","decoded":"https:&#x2F;&#x2F;www.facebook.com"}
so the &#x2F;&#x2F; in the URL is not being converted to //. Is there some core JavaScript function that does this, or do I have to use a function with a regex and replace?
Updated after question update.
In JavaScript you have some functions to accomplish a similar conversion: encodeURI and encodeURIComponent, and their counterparts decodeURI and decodeURIComponent. encodeURI is meant for a full URL: it leaves the characters that give a URL its structure (such as :, /, ?, # and &) untouched, so the protocol, hostname and path separators survive. encodeURIComponent escapes those characters as well, which is what you want for a single value embedded in a URL.
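A short illustration of how the two encoders differ on the same input. The example URL is made up for the purpose.

```javascript
const url = 'https://example.com/a b?x=1&y=2';

// encodeURI keeps the URL's structure and only escapes the space.
const whole = encodeURI(url);
console.log(whole); // https://example.com/a%20b?x=1&y=2

// encodeURIComponent also escapes ':', '/', '?', '=' and '&'.
const component = encodeURIComponent(url);
console.log(component); // https%3A%2F%2Fexample.com%2Fa%20b%3Fx%3D1%26y%3D2

// Each decoder reverses its own encoder.
console.log(decodeURI(whole) === url);              // true
console.log(decodeURIComponent(component) === url); // true
```

Neither pair touches HTML entities such as &#x2F;, which are a different kind of escaping.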
What you are showing in the edited question has nothing (as far as I can tell) to do with JavaScript; you need to get the backend to unsanitize that string before sending it back to you.
If updating the backend is not an option, you could try something like this:
unescape('https:&#x2F;&#x2F;www.facebook.com'.replace(/&#x/g, '%').replace(/;/g, ''))
This will decode those entities into their actual characters, but it should not be a permanent solution, as unescape is marked as deprecated.
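If the backend cannot be changed, one way to avoid the deprecated function is to decode the hexadecimal entities directly. This is only a sketch: decodeHexEntities is a hypothetical helper, and it handles numeric hex entities of the form &#x2F; only, not named entities such as &amp;amp; or decimal ones.

```javascript
// Replace hexadecimal HTML entities (e.g. &#x2F;) with the characters
// they stand for. Anything that is not a valid code point is left as-is.
function decodeHexEntities(text) {
  return text.replace(/&#x([0-9a-f]+);/gi, (match, hex) => {
    const codePoint = parseInt(hex, 16);
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

console.log(decodeHexEntities('https:&#x2F;&#x2F;www.facebook.com'));
// https://www.facebook.com
```

For anything beyond this narrow case, a dedicated HTML entity decoding library is likely a safer choice than a hand-written regex.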
Original response.
I am having no issues at all with encodeURI and decodeURI. Are you completely sure it is not being returned as expected? Is there a chance something else in the middle is encoding it again?
I tested this small snippet with Postman.
const express = require('express');
const app = express();
const encoded = encodeURI('http://example.com?query=ÅÍÎÏ˝ÓÔÒÚÆ☃');
const decoded = decodeURI(encoded);
app.get('/json', (req, res) => res.json({ encoded, decoded }));
app.listen(3000, () => console.log('Example app listening on port 3000!'));

How do I pass a whole URL with http e.g. "http://www.facebook.com" as a parameter to node/express "/:param"?

I can't seem to pass a whole url e.g. "http://example.heroku.com/http://www.facebook.com"
app.get('/:url', function(req, res){
  var url = req.params.url;
  // do something with url...
});
I always get an error that says "Cannot GET /http://www.facebook.com".
How do I get past this?
Some characters (like /) have special meaning in URLs and need to be encoded.
http://example.heroku.com/http%3A%2F%2Fwww.facebook.com
Most programming languages have a function (possibly via a third party library) which can encode that for you. In JavaScript, for instance, that is encodeURIComponent.
You can use a regular expression as the route path. For example:
// http://localhost:3000/mountpoint/http://www.facebook.com
app.get( /^\/mountpoint\/(.*)/, function(req, res) {
var url = req.params[0];
res.json(url);
});
Thanks for the comments and answers, but I've found out I can use the wildcard, and with it I was able to get the whole URL parameter without running into the 'Cannot GET /' error:
'/*'
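One caveat with the regex and wildcard approaches, as far as I understand Express routing: routes are matched against the path only, so a query string on the embedded URL would end up in req.query rather than in the captured parameter. A possible workaround is to cut the prefix off the raw request URL (req.originalUrl in Express). The helper name embeddedUrl and the /mountpoint/ prefix below are illustrative.

```javascript
// Return everything after the given prefix of the raw request URL,
// including any query string, or null if the prefix does not match.
function embeddedUrl(originalUrl, prefix) {
  if (!originalUrl.startsWith(prefix)) return null;
  const rest = originalUrl.slice(prefix.length);
  return rest === '' ? null : rest;
}

console.log(embeddedUrl('/mountpoint/http://www.facebook.com/search?q=cats', '/mountpoint/'));
// http://www.facebook.com/search?q=cats
console.log(embeddedUrl('/other/http://www.facebook.com', '/mountpoint/'));
// null
```

A fragment on the embedded URL is still lost this way, since the client does not send it; encoding the value with encodeURIComponent remains the more robust option.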
