Fetch non-public data from S3 - javascript

I have been working on a web application where I scrape data with Scrapy and upload it to S3. Now I want to fetch that data in my React project. This works well if I make the data public.
axios
  .get(`https://s3-eu-west-1.amazonaws.com/bucket/data.json`)
  .then(res => {
    console.log("Data: ", res.data);
    this.setState({ events: res.data });
  })
  // .catch() must receive a function, which is called only on failure.
  .catch(err => console.log("error", err));
Question
I don't want the data I'm scraping to be public; it should only be available to my web application. Is this even possible?

I'm assuming you're talking about a client-side web application that runs in the user's browser? As far as I know, you need at least some server-side component to control or allow access to private S3 resources. This could be a lambda function or an actual server, but AFAIK there is no safe way to do this from the client only.
There are two ways that I'm aware of to expose private S3 resources to a client-side app:
1) If there is a server under your control (e.g. a Node.js server that delivers your app, or perhaps provides API services), you can connect to S3 securely from the server side and deliver whatever you need to the client side. This could also be done from a lambda function. Whichever you choose, you still need a way to make sure the client/app requesting the content should have access to it, e.g. the user should have a valid session.
2) You could allow access to a private S3 object by generating a pre-signed URL that gives the client app a fixed amount of time to download the content. The URL would typically be issued by an endpoint on your server (or lambda) that your client-side app calls, and only after that endpoint has made sure the requesting user is authorized.
Here's a tutorial on Medium that explains both options: https://blog.fyle.in/sharing-files-using-s3-pre-signed-urls-e05d4603e067
Here's a StackOverflow answer with example code for Node: Nodejs AWS SDK S3 Generate Presigned URL
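The pre-signed URL option can be sketched as a small Node endpoint. This is a minimal sketch under stated assumptions, not a definitive implementation: `createPresignHandler`, `getUser`, and `presign` are hypothetical names, and the signer is injected so the example does not depend on a particular SDK version. With the AWS SDK for JavaScript v3, `presign` would typically wrap `getSignedUrl` from `@aws-sdk/s3-request-presigner` together with a `GetObjectCommand`.

```javascript
// Sketch: hand out a short-lived download link only to signed-in users.
// getUser and presign are injected, hypothetical dependencies:
//   getUser(req)          -> the signed-in user, or null
//   presign(key, seconds) -> a pre-signed URL for the private object
// With AWS SDK v3, presign could be written as:
//   (key, seconds) => getSignedUrl(
//     s3Client, new GetObjectCommand({ Bucket: "bucket", Key: key }),
//     { expiresIn: seconds })
function createPresignHandler({ getUser, presign, expiresInSeconds = 300 }) {
  return async function handler(req, res) {
    const user = await getUser(req);
    if (!user) {
      res.writeHead(401, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: "not signed in" }));
      return;
    }
    // The object key is fixed on the server, so clients cannot ask for arbitrary objects.
    const url = await presign("data.json", expiresInSeconds);
    res.writeHead(200, {
      "Content-Type": "application/json",
      "Cache-Control": "no-store", // the link is short-lived; do not cache it
    });
    res.end(JSON.stringify({ url, expiresInSeconds }));
  };
}

// Wiring with Node's built-in http module might look like:
//   require("node:http")
//     .createServer(createPresignHandler({ getUser, presign }))
//     .listen(3000);
```

The React code would then call this endpoint first and pass the returned `url` to `axios.get`, instead of requesting the bucket directly.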

Related

How do I call a Node/Express API key (.env) in Client-side Javascript?

Right now I am using an OpenWeatherMap API key in my client side javascript for a simple weather app (Node/Express). I know this is not ideal outside of development, so I did npm install dotenv.
On the server side, I can get and set the env variables just fine in Node. I can see them when I console.log out.
How do I call the API key in my javascript on the client-side? For example, currently my weather app has its simple logic in a file called weather.js and the HTML uses weather.js.
Ideally I would just like to call my api like http://api.openweathermap.org/data/2.5/forecast/daily?lat=${lat}&lon=${lon}&units=metric&appid=${process.env.WEATHER_API_KEY}
I know the .envs are on server side and you have to do stuff to make it work client side. New Node developer here who has read too much that I think I am confused between requireJS, Browserify, modules, .env, etc...
You don't want your API keys (or other secrets) to be public. Using them in the front-end would make them visible when inspecting the page and in the network requests log. You need to store and use your secrets server-side.
Create a route on your backend that calls the weather API (using the token stored in .env on your server) and sends back the data. You can use CORS to limit which browser origins may call the route, but CORS only constrains browsers, so it is not a substitute for authentication.
Then have your frontend hit that route.
You will have to request the API Key from the server.
This can be done easily by making a simple route in your backend that will return the key as a response.
If you don't want to expose your API key (I recommend that you don't), you can create a route in your backend that calls the weather API using your API key. The client sends an HTTPS request to your backend, which then makes another HTTPS request to the weather API and sends the response back to the client.
You don't want to expose your API keys to the outside world. What you can do is create a backend route (/api/keys), protect it with CORS, and call it from the front end.
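The proxy-route approach from the answers above could look roughly like the following. It is a sketch that uses only Node built-ins rather than Express (Node 18+ is assumed for the global `fetch`); the route path `/api/forecast` and the names `buildForecastUrl` and `handleForecast` are hypothetical. The upstream URL and the `WEATHER_API_KEY` variable are taken from the question.

```javascript
// The key is read from the server's environment and is never sent to the browser.
const API_KEY = process.env.WEATHER_API_KEY || "";

// Build the upstream URL on the server so the key never appears in client-side code.
function buildForecastUrl(lat, lon, apiKey) {
  const url = new URL("http://api.openweathermap.org/data/2.5/forecast/daily");
  url.searchParams.set("lat", String(lat));
  url.searchParams.set("lon", String(lon));
  url.searchParams.set("units", "metric");
  url.searchParams.set("appid", apiKey);
  return url.toString();
}

// Hypothetical route: GET /api/forecast?lat=..&lon=..
// fetchFn is injectable for testing; it defaults to the global fetch.
async function handleForecast(req, res, fetchFn = fetch) {
  const { searchParams } = new URL(req.url, "http://localhost");
  const lat = parseFloat(searchParams.get("lat"));
  const lon = parseFloat(searchParams.get("lon"));
  if (!Number.isFinite(lat) || !Number.isFinite(lon)) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "lat and lon are required numbers" }));
    return;
  }
  try {
    const upstream = await fetchFn(buildForecastUrl(lat, lon, API_KEY));
    const body = await upstream.text();
    // Relay only the status and body; the key stays on the server.
    res.writeHead(upstream.status, { "Content-Type": "application/json" });
    res.end(body);
  } catch (err) {
    res.writeHead(502, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "upstream request failed" }));
  }
}

// To serve it:
//   require("node:http")
//     .createServer((req, res) => handleForecast(req, res))
//     .listen(3000);
```

The client-side weather.js would then request `/api/forecast?lat=...&lon=...` from your own server and never see the key.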

Can an AWS S3 Static Site access REST API in VPC?

I've read through quite a few pages of documentation and other StackOverflow questions/answers but can't seem to come across anything that can help me with my scenario.
I'm hosting a public, static site in an S3 bucket. This site makes some calls to an API that I have hosted in an EC2-instance in a VPC. Because of this, my API can only be called by other instances and services in the same VPC.
But I'm not sure how to allow the S3 Bucket site access to my API.
I've tried creating VPC Endpoints and going down that route, but all that did was restrict access to my S3 site to only the instances within my VPC (which is not what I want).
I would appreciate any help with this, thank you so much.
Hopefully my question is clear.
No, S3 Static Websites are 100% client side code. So it's basically just html + css + javascript being delivered, as-is from S3. If you want to get dynamic content into your website, you need to look at calling an API accessible from your user's browser, i.e. from the internet.
AWS API Gateway with Private Integrations could be used to accept the incoming REST call and send it on to your EC2 Server in your VPC.
My preferred solution to adding dynamic data to S3 Static Websites is using API Gateway with AWS Lambda to create a serverless website. This minimises running costs, maintenance, and allows for quick deployments. See The Serverless Framework for getting up and running with this solution.
A static site doesn't run on a server. It runs entirely in the web browser of each site visitor. The computer it is running on would be the laptop of your end-user. None of your code runs in the S3 bucket. The S3 bucket simply stores the files and serves them to the end-user's browser which then runs the code. The route you are going down to attempt to give the S3 bucket access to the VPC resource is not going to work.
You will need to make your API publicly accessible in order for code running in your static site (running in web browsers, on end-user's laptops/tablets/phones/etc.) to be able to access it. You should look into something like API keys or JWT tokens to provide security for your public API.
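As one possible shape for the JWT suggestion, here is a minimal HS256 sketch that uses only Node's built-in crypto module. The function names `signJwt` and `verifyJwt` are hypothetical, and a maintained JWT library would usually be preferable in production. The idea is that the public API issues a token after login and rejects any request whose token fails verification.

```javascript
const crypto = require("node:crypto");

const b64url = (value) => Buffer.from(value).toString("base64url");

// Issue a token after the user has logged in.
// HS256 means HMAC-SHA256 computed over "header.payload".
function signJwt(payload, secret) {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = b64url(JSON.stringify(payload));
  const sig = crypto
    .createHmac("sha256", secret)
    .update(`${header}.${body}`)
    .digest("base64url");
  return `${header}.${body}.${sig}`;
}

// Return the payload if the token is authentic and unexpired, otherwise null.
function verifyJwt(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = String(token).split(".");
  if (parts.length !== 3) return null;
  const [header, body, sig] = parts;

  const expected = crypto
    .createHmac("sha256", secret)
    .update(`${header}.${body}`)
    .digest();
  const given = Buffer.from(sig, "base64url");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) {
    return null;
  }

  let parsedHeader, payload;
  try {
    parsedHeader = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
  } catch {
    return null;
  }
  if (!parsedHeader || parsedHeader.alg !== "HS256") return null;
  if (!payload || typeof payload !== "object") return null;
  // exp is the standard expiry claim, in seconds since the epoch.
  if (typeof payload.exp === "number" && payload.exp <= nowSeconds) return null;
  return payload;
}
```

The static site would send the token with each request (for example in an `Authorization` header), and the API would call `verifyJwt` before doing any work.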

Secure access directly from web app to amazon s3?

Per my review of how to set up secure access to Amazon S3 buckets, it looks like we first create an IAM user and then attach a security policy allowing S3 access to that user. After that we can generate API keys for the bucket, which can authenticate requests for bucket access. That's my understanding at this point; please correct me if I missed something.
I assume the API keys should be server side only (The Secret Access Key). In other words it's not safe to place these directly inside the webapp? Hence we would first have to send the data to our server, and then once there we can send it to the bucket using the API key?
Is there any way to secure access directly from a web app to an amazon s3 bucket?
Approach Summary
Per the discussion with @CaesarKabalan it sounds like the approach that would allow this is:
1) Create an IAM user that can create identities that can be authenticated via Amazon Cognito. Let's call the credentials assigned in this step Cognito Credentials.
2) The user signs in to the webapp with, for example, Google.
3) The webapp makes a request to the webapp's server (could be a lambda function) to sign the user up with Amazon Cognito.
4) The webapp now obtains credentials for the user directly from Amazon Cognito and uses these to send the data to the S3 bucket.
I think that's where we are conceptually. Now it's time to test!
From your question I'm not sure what portions of your application are in AWS nor your security policies but you basically have three options:
(Bad) Store your keys on the client. Depending on the scope of your deployment this might be OK. For example, if each client has its own dedicated user and bucket there probably isn't much risk, especially if this is for a private organization where you control all aspects of the access. This is the easiest but least secure option. You should not use this if your app is multi-tenant. Probably move along...
(Great) Use an API endpoint to move this data into your bucket. This would involve some sort of infrastructure to receive the file securely from the client then move it into S3 with the security keys stored locally. This would be similar to a traditional web app doing IO into a database. All data into S3 goes through this tier of your app. Downsides are you have to write that service, host it, and pay for bandwidth costs.
(Best) Use Amazon Cognito to assign each app/user their own access key. I haven't done this personally but my understanding is you can provision each entity their own short-lived access credentials that can be renewed and you can give them access to write data straight to S3. The hard part here will be structuring your S3 buckets and properly designing the IAM credentials for your app users to ONLY be able to do exactly what you want. The upside here is the users write directly to S3 bucket, you're using all native AWS services and writing very little custom code. This I would consider the best, most secure, and enterprise class solution. Here is an example: Amazon S3: Allows Amazon Cognito Users to Access Objects in Their Bucket
Happy to answer any more questions or clarify.
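For the Cognito option, the per-user restriction is usually expressed in the IAM policy attached to the role that authenticated identities assume. The sketch below builds such a policy document, modeled loosely on the AWS example referenced above. The function name `buildPerUserPolicy`, the bucket name, and the prefix are placeholders chosen for illustration; `${cognito-identity.amazonaws.com:sub}` is the IAM policy variable that resolves to the caller's Cognito identity ID.

```javascript
// IAM policy variable that AWS replaces with the caller's Cognito identity ID.
// A plain string literal is used so JavaScript does not try to interpolate it.
const IDENTITY_ID = "${cognito-identity.amazonaws.com:sub}";

// Build a policy document that confines each Cognito identity to its own prefix.
// bucket and appPrefix are placeholders chosen for illustration.
function buildPerUserPolicy(bucket, appPrefix) {
  const userPrefix = `${appPrefix}/${IDENTITY_ID}`;
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "ListOwnObjects",
        Effect: "Allow",
        Action: "s3:ListBucket",
        Resource: `arn:aws:s3:::${bucket}`,
        // Listing is allowed only under the caller's own prefix.
        Condition: { StringLike: { "s3:prefix": [`${userPrefix}/*`] } },
      },
      {
        Sid: "ReadWriteOwnObjects",
        Effect: "Allow",
        Action: ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
        Resource: `arn:aws:s3:::${bucket}/${userPrefix}/*`,
      },
    ],
  };
}

console.log(JSON.stringify(buildPerUserPolicy("my-bucket", "uploads"), null, 2));
```

With a policy of this shape, the temporary credentials a user obtains through Cognito can only touch objects under that user's own prefix, which is the "ONLY be able to do exactly what you want" part of the design.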

Web API Security Information Request

I would like to ask a few questions to better understand some procedures. I'm trying to write a Web API project that will be a backend for both web and mobile clients.
The problem I have in mind is about security. I don't want to use Identity or any other providers. I want to use my own database user and role structures.
Only authenticated client applications should be able to consume my API; anonymous applications should not.
So what should the approach be? I have written a custom AuthorizationAttribute that checks some custom headers, such as "AppID" and "AppSecurity", whose values I store in my own database. If the client sends the right AppID and key, the app is considered authenticated to consume the API, which does not sound very secure to me.
Another issue: let's say I have developed a JavaScript web application and I have to authenticate the application itself before making GET/POST/PUT/DELETE requests. That means I have to put some kind of authentication data (username, app key, password) in one of the JS files in order to send the "AppID" and "AppSecurity" keys in the header. Couldn't a client who knows how to use developer tools or Fiddler easily capture the header values being sent to the server? Even if I pass the authentication values in the body of my JSON request, they can still be found in the JS files that are sent to the client. I'm confused about that too.
So basically I want to build a server-side API that serves data to, and accepts data from, authenticated client applications only. What I need is a simple example of that without using any identity providers.

Uploading files on web from client without revealing API key

I'm trying to upload a file from a web application to an external service (Scribd, for example). To upload the file I need to send the API key as well. However, if I send the API key from the client, it will be revealed to any user who looks for it on the client side.
How could I upload from client using an API key that I don't want to reveal to users? It seems redundant to upload it to my server and then to the external source.
As redundant as it may be to pass through your server, it's the only way. You can't use the key client-side and hide it from the client, and if you don't use HTTPS it can easily be intercepted too. As a side note, I don't know about Scribd but usually stealing API keys is not very useful, so you may just live with the "risk".
Edit:
Apparently Scribd offers a way to sign requests so that your API key can't be deduced from them (you have to generate the signatures on your server and send them to the client, of course). See http://www.scribd.com/developers/api?method_name=Signing
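The general idea of server-generated signatures can be sketched as follows. This is a generic HMAC-SHA256 scheme for illustration only and is not Scribd's actual signing algorithm; the function names and parameter names are hypothetical. The server signs the exact request parameters with a secret, the client forwards the parameters plus the signature, and the secret itself never reaches the browser.

```javascript
const crypto = require("node:crypto");

// Canonical form: sorted, URI-encoded key=value pairs,
// so the signer and the verifier hash exactly the same bytes.
function canonicalize(params) {
  return Object.keys(params)
    .sort()
    .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(params[k])}`)
    .join("&");
}

// Runs on your server: the secret never leaves it, only the signature does.
function signRequest(params, secret) {
  return crypto
    .createHmac("sha256", secret)
    .update(canonicalize(params))
    .digest("hex");
}

// Runs on the receiving side, which also knows the secret.
function isValidSignature(params, signature, secret) {
  const expected = Buffer.from(signRequest(params, secret), "hex");
  const given = Buffer.from(String(signature), "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return given.length === expected.length && crypto.timingSafeEqual(given, expected);
}
```

Because the signature covers the parameters, a user who copies it from the browser can only replay that specific request, not construct new ones.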
