Run computation on user AWS account - JavaScript

I have a Node.js and Vue.js project where a user provides their AWS credentials and a pointer to an online resource that stores a large amount of data, and an algorithm is then run on that data in the AWS account the user provided.
I have two difficulties with this and would like to ask for some help.
Firstly, I want to deploy some simple JavaScript code in the cloud to test that everything works. What is the easiest way to do that? How can the npm packages aws-sdk and aws-lambda help me? Do I necessarily need to give my debit card details to use AWS just for quick testing purposes?
Secondly, does AWS offer an authorization library/tool, like Facebook does, so the user just inputs a username and password into a window and is automatically authorized (probably with OAuth, which is likely what they use)?
In addition, I would appreciate any general advice on how to approach this problem: how can I run code over a huge amount of data in users' cloud accounts? Maybe another cloud platform is more appropriate? Thank you!

This is a big question, so I'll provide some pointers for further reading:
to start with, decide if you want your webapp to be server-based (EC2, Node.js, and Express) or serverless (CloudFront, API Gateway, Lambda, and S3)
learn how to use Cognito as a way to get AWS credentials associated with a social provider login (such as Facebook or Google)
to operate in another user's AWS account, you should leverage cross-account IAM roles (they create a role and give you permission to assume it); a sketch of this is shown after these pointers
on the question of running code against large amounts of data, the repository for this data will typically be S3 or perhaps Redshift in some situations, and the compute environment could be any one of Lambda (short lifetime, serverless), EMR (clustered Hadoop, Spark etc.), EC2 (vanilla VM), Athena (SQL queries over content in S3), or ECS (containers). You haven't given enough information to help decide which might be more suitable.
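To make the cross-account role pointer concrete, here is a minimal sketch using the aws-sdk package mentioned in the question. The role ARN, external ID, and bucket name are hypothetical placeholders:

// Sketch only: the customer creates a role in their account whose trust
// policy allows your account to assume it; you then act with its permissions.
const AWS = require('aws-sdk');

const sts = new AWS.STS();

async function getCustomerCredentials(customerRoleArn, externalId) {
  // Exchange your own credentials for temporary ones in the customer's account
  const { Credentials } = await sts.assumeRole({
    RoleArn: customerRoleArn,       // e.g. 'arn:aws:iam::123456789012:role/YourAppRole'
    RoleSessionName: 'data-processing',
    ExternalId: externalId,         // guards against the confused-deputy problem
    DurationSeconds: 3600,
  }).promise();

  return new AWS.Credentials({
    accessKeyId: Credentials.AccessKeyId,
    secretAccessKey: Credentials.SecretAccessKey,
    sessionToken: Credentials.SessionToken,
  });
}

async function listCustomerData(customerRoleArn, bucket) {
  const credentials = await getCustomerCredentials(customerRoleArn, 'agreed-external-id');
  const s3 = new AWS.S3({ credentials });
  const { Contents } = await s3.listObjectsV2({ Bucket: bucket }).promise();
  return Contents.map((obj) => obj.Key);
}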
The simplest option to test things out is likely to be S3 (for storage) and EC2 (use t2.micro instances in Free Tier, deploy your web application just like you would on any other Linux environment).
Yes, to use AWS you need an account and to get an account you need to supply a credit card. There is a substantial Free Tier in your first year, however.

Related

My bot cannot access a blob storage account with a system-assigned managed identity

I'm exploring using Azure blob storage with my bot. I'd like to use it as a persistent store for state, as well as storing transcripts.
I configure the BlobStorage object like this:
const storageProvider = new BlobStorage({
    containerName: process.env.BlobContainerName,
    storageAccountOrConnectionString: process.env.BlobConnectionString
});
As sensitive information is stored in these files, especially transcripts, I'm working with my team on securing the storage account and the container within it.
We have created a system-assigned managed identity for the application service hosting the bot, and we have given this account the 'Storage Blob Data Contributor' role, which, as I understand it, provides read, write, and delete access to stored content.
Unfortunately when the bot tries to access the storage the access attempt fails. I see the following error in the 'OnTurnError trace':
StorageError: Forbidden
Interestingly, running the bot locally with the same blob storage connection string works, which suggests that the issue is related to the service identity and/or the permissions it has.
Does anyone know what could be causing the error? Are more permissions required on the storage account? Any thoughts on increasing the logging of the error to potentially see a more detailed error message are also most welcome.
At this moment in time I do not believe that the framework supports using a system assigned managed identity for access to the blob storage.
In looking into this I found a number of Node.js examples that use two specific packages for accessing blob storage with a system-assigned identity. Specifically:
@azure/storage-blob - https://www.npmjs.com/package/@azure/storage-blob
@azure/identity - https://www.npmjs.com/package/@azure/identity
The identity package is the one that provides the functionality to get a token associated with a credential, which is then used by code in the storage-blob package to interact with the storage account.
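For reference, a minimal sketch of the pattern those two packages enable (this is not something botbuilder-azure does today; the account and container names here are invented):

const { DefaultAzureCredential } = require('@azure/identity');
const { BlobServiceClient } = require('@azure/storage-blob');

// DefaultAzureCredential picks up the system-assigned managed identity when
// the code runs in an Azure App Service.
const credential = new DefaultAzureCredential();
const blobServiceClient = new BlobServiceClient(
  'https://mystorageaccount.blob.core.windows.net', // placeholder account URL
  credential
);

async function writeTranscript(blobName, content) {
  const containerClient = blobServiceClient.getContainerClient('transcripts');
  const blockBlobClient = containerClient.getBlockBlobClient(blobName);
  await blockBlobClient.upload(content, Buffer.byteLength(content));
}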
If I look at the dependency tree for the bot framework I don’t see either of these packages. Instead I see:
azure-storage - https://www.npmjs.com/package/azure-storage
botbuilder-azure - https://www.npmjs.com/package/botbuilder-azure
Taking a deep dive into these two packages, I don't see any code for connecting to an Azure storage account using a credential; the only code I can find uses access keys. Therefore my current conclusion is that the bot framework doesn't support accessing a storage account using a credential.
While we could explore adding code that uses these packages, such significant development work is outside the scope of our project at present.
If anyone with more knowledge than I can see that this is incorrect please let me know via a comment and I'll explore further.
For the moment we have settled on continuing to use access keys, as this is no less secure than the way the bot accesses other services, such as cognitive services like QnA Maker.

Condition-based access to Amazon Lambda results?

Wondering if it's possible to have a webapp upload a file (userid.input.json) to Amazon S3, which triggers a Lambda function that reads the file, does some processing, and saves the result as another file (userid.output.json).
However, userid.output.json should not be immediately accessible to the web application. The web application has to complete a Stripe payment, and only once the payment completes can it access the userid.output.json file on Amazon S3.
Before I ask how, I figured I'd first ask: can this scenario be facilitated/architected on AWS?
Approach
Note that this is an update to the question based on more research. It looks like Amazon Cognito will be the perfect tool for signing in users and tying their user credentials to an IAM role that can read and write to S3 buckets.
So once the user is signed in through Amazon Cognito and has the proper credentials, their files can be uploaded to an S3 bucket and processed by a Lambda. The result is then written to the same bucket.
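A minimal sketch of what that processing Lambda might look like; the sealed bucket name and the processing step are placeholders, not part of the original design:

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// Triggered by the userid.input.json upload; writes userid.output.json
exports.handler = async (event) => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));

    const input = JSON.parse(
      (await s3.getObject({ Bucket: bucket, Key: key }).promise()).Body.toString('utf-8')
    );
    const output = runAlgorithm(input); // stand-in for the real processing

    await s3.putObject({
      Bucket: 'sealed-results-bucket', // placeholder name
      Key: key.replace('.input.json', '.output.json'),
      Body: JSON.stringify(output),
    }).promise();
  }
};

function runAlgorithm(input) {
  return { processed: input }; // placeholder
}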
Now, earlier I suggested writing to a sealed bucket and having a Stripe webhook trigger moving the result from the sealed bucket to an accessible bucket, but it seems this is not necessary, per the indication in the answer provided by @Snickers3192.
Once the Stripe payment completes, the webapp can set a boolean that is used to control access to the output, and that completes the cycle?
Part of the rationale for having a hidden bucket was that someone might pull the credentials out of the browser and use them in a different script. I assume this is impossible (famous last words :) ), but just in case I wrote a follow-up question here.
In other words, the credentials that are pulled into the client after signing in with Amazon Cognito cannot be used to execute scripts outside of the application context?
Approach Part 2
Per my follow-up questions, it does not appear that relying on state within the webapp for making security decisions is good enough, as someone can probably figure out a way to get the authentication token and manipulate the application's API directly using a client other than the core app.
So now I'm thinking about it like this:
1) Write the result to the sealed bucket (Processing Lambda)
2) Have the Stripe webhook update a transaction record in the user's profile indicating paid = true (Stripe Lambda)
3) Create another Lambda that has access rights to the sealed bucket but will return results only if paid=true (Access Result Lambda)
So, since Stripe is tied to an IAM user that is allowed to update the application user profile and set paid=true, and the sealed bucket can only be accessed by a Lambda that first checks whether paid=true before returning the result, I believe that should guarantee security. A sketch of such an access Lambda is shown below.
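A minimal sketch of the Access Result Lambda, assuming the paid flag lives in a DynamoDB table and the function sits behind an API Gateway Cognito authorizer; the table name, bucket name, and key layout are all assumptions:

const AWS = require('aws-sdk');

const dynamo = new AWS.DynamoDB.DocumentClient();
const s3 = new AWS.S3();

exports.handler = async (event) => {
  // Assumes an API Gateway Cognito authorizer; 'sub' is the Cognito user id
  const userId = event.requestContext.authorizer.claims.sub;

  // Only release the result if the Stripe webhook has marked the user as paid
  const { Item } = await dynamo.get({
    TableName: 'UserProfiles', // placeholder table
    Key: { userId },
  }).promise();

  if (!Item || Item.paid !== true) {
    return { statusCode: 402, body: JSON.stringify({ error: 'Payment required' }) };
  }

  const result = await s3.getObject({
    Bucket: 'sealed-results-bucket', // placeholder name
    Key: `${userId}.output.json`,
  }).promise();

  return { statusCode: 200, body: result.Body.toString('utf-8') };
};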
If anyone has a simpler approach please let me know.
This really is more a question of where you want to put the security, and in AWS there are many options; it could live in:
Lambda/Webapp
S3 policies
IAM roles/groups
These decisions are usually dictated by where your identity store is kept, and also by whether you want to keep the notion of AWS users vs. users of your app. My preference is to keep these two pools separate: security logic like this is kept in the webapp/Lambda, and AWS security only deals with what rights developers have to the environment as well as what rights the applications themselves have.
This means the webapp can always access the input and output buckets, but it keeps a record in a database somewhere (or makes use of your payment system's API) of who has paid and who hasn't, and uses that information to grant or deny access. IMO this is a more modular design; it enables you to lock down your AWS account better, and it is clearer to developers where the security is located. In addition, if you do go with IAM/S3, it will be more difficult to run and debug locally.
EDIT: After all your comments and additional security concerns, you may also want to consider emailing a short-lived URL link to the processed file, so that a user needs both email access and their application credentials. This means that even if your access token is stolen at the browser level, a hacker still can't get the processed file without email access. If you want to be EXTREME SECURITY CORE, require not only authentication for the link but also MFA, so that they need to enter a constantly refreshing code, as you should have set up for your own AWS account login.
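One way to realize the short-lived link would be a pre-signed S3 URL; a minimal sketch, with the bucket name and expiry as placeholders:

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

function getResultLink(userId) {
  // Anyone holding this URL can fetch the object until it expires
  return s3.getSignedUrl('getObject', {
    Bucket: 'sealed-results-bucket', // placeholder name
    Key: `${userId}.output.json`,
    Expires: 15 * 60,                // link validity in seconds
  });
}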
I'm by no means a security expert, but follow best practices and do your due diligence, and you will meet security expectations.

Electron SQL security

I have a rather noob question that I can't seem to find the answer to. I've heard that all Electron apps can be turned back into source code and then manipulated. That leads me to my next question: if I'm connecting to a SQL database, what is keeping people from viewing the source code and then going in and doing whatever they want to the DB? I mean, once they see the source code, the username and password are right there. Sorry if this is a silly question, but I'm thinking of making something on Electron that needs decent security. I've also heard PHP cannot be used. Any suggestions would be appreciated. I'm just wondering because Discord, WhatsApp and such seem to do it somehow, but how?
Thanks!
Well, any information in any application can be reverse engineered, so I would suggest not hardcoding database passwords or any other critical credentials.
I assume Slack, Discord and others don't hardcode their DB passwords in the app. Their desktop apps don't "talk" directly to the database; they talk to some server-side application. You as a user have to provide credentials for your account, and communication is done through an API which imposes various restrictions based on your user privileges. This server-side application decides what you can and cannot do and translates your requests into DB operations.
So when using those apps you never go anywhere near their DB passwords.
If you want to build a client application that should be able to do some operations on a DB, I would suggest the same: split the application into two parts, a ClientApp and a ServerApp.
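A minimal sketch of that split, assuming Express and MySQL purely for illustration (the endpoint, table, and environment variable names are invented):

const express = require('express');
const mysql = require('mysql2/promise');

const app = express();

// DB credentials live only in the server's environment, never in the
// Electron bundle that ships to users.
const pool = mysql.createPool({
  host: process.env.DB_HOST,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME,
});

// A deliberately narrow endpoint: the client can only do what the API allows
app.get('/api/items', async (req, res) => {
  const [rows] = await pool.query('SELECT id, name FROM items');
  res.json(rows);
});

app.listen(3000);

// The Electron renderer then calls the API instead of the database:
//   const items = await fetch('https://your-server.example/api/items').then(r => r.json());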

Securing JS client-side SDKs

I'm working on a React-Redux web-app which integrates with AWS Cognito for user authentication/data storage and with the Shopify API so users can buy items through our site.
With both SDKs (Cognito, Shopify), I've run into an issue: Their core functionality attaches data behind the scenes to localStorage, requiring both SDKs to be run client-side.
But running this code entirely client-side means that the API tokens which both APIs require are completely insecure, such that someone could just grab them from my bundle and then authenticate/fill a cart/see inventory/whatever from anywhere (right?).
I wrote issues on both repos to point this out. Here's the more recent one, on Shopify. I've looked at similar questions on SO, but nothing I found addresses these custom SDKs/ingrained localStorage usage directly, and I'm starting to wonder if I'm missing/misunderstanding something about client-side security, so I figured I should just ask people who know more about this.
What I'm interested in is whether, abstractly, there's a good way to secure a client-side SDK like this. Some thoughts:
Originally, I tried to proxy all requests through the server, but then the localStorage functionality didn't work, and I had to fake it out post-request and add a whole bunch of code that the SDK is designed to take care of. This proved prohibitively difficult/messy, especially with Cognito.
I'm also considering creating a server-side endpoint that simply returns the credentials and blocks requests from outside the domain. In that case, the creds wouldn't be in the bundle, but wouldn't they be eventually scannable by someone on the site once that request for credentials has been made?
Is the idea that these secret keys don't actually need to be secure, because adding to a Shopify cart or registering a user with an application don't need to be secure actions? I'm just worried that I obviously don't know the full scope of actions that a user could take with these credentials, and it feels like an obvious best practice to keep them secret.
Thanks!
Can't you just put the keys and such in a .env file? This way nobody can see what keys you've got stored in there. You can then access your keys through process.env.YOUR_VAR
For Cognito you could store stuff like user pool id, app client id, identity pool id in a .env file.
The npm package for dotenv can be found here: https://www.npmjs.com/package/dotenv
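A minimal sketch of that approach, with placeholder variable names; note that dotenv keeps values out of source control, though anything bundled client-side still ends up visible in the shipped JavaScript:

// .env (kept out of version control)
//   COGNITO_USER_POOL_ID=us-east-1_XXXXXXXXX
//   COGNITO_APP_CLIENT_ID=xxxxxxxxxxxxxxxxxxxxxxxxxx

require('dotenv').config();

const poolId = process.env.COGNITO_USER_POOL_ID;
const clientId = process.env.COGNITO_APP_CLIENT_ID;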
Furthermore, what supersecret stuff are you currently storing that you're worried about? By "API tokens", do you mean the OpenId token which you get after authenticating to Cognito?
I can respond to the Cognito portion of this. Your AWS Secret Key and Access Key are not stored in the client. For your React.js app, you only need the Cognito User Pool Id and the App Client Id, and those are the only identifiers exposed to the user.
I cover this in detail in a comprehensive tutorial here - http://serverless-stack.com/chapters/login-with-aws-cognito.html
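To illustrate, a sketch of configuring the client with only those public identifiers, using the amazon-cognito-identity-js package (the package choice and environment variable names are assumptions):

import { CognitoUserPool } from 'amazon-cognito-identity-js';

// Both values are public identifiers, not secrets; placeholders shown here.
const userPool = new CognitoUserPool({
  UserPoolId: process.env.REACT_APP_COGNITO_USER_POOL_ID, // e.g. 'us-east-1_XXXXXXXXX'
  ClientId: process.env.REACT_APP_COGNITO_APP_CLIENT_ID,
});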

Webrole is not starting and always busy

I am working on an Azure web application. The code compiles and runs fine on my local machine, but when I upload the package to the Azure platform, the web role won't start and shows a Busy status with the message: "Waiting for role to start... System is initializing. [2012-04-30T09:19:08Z]"
Neither OnStart() nor Run() contains any code, so I am not blocking the return of OnStart.
However, I am using window.setInterval in JavaScript; the function retrieves values from the database every 10 seconds.
What can be done to resolve this?
In most situations where a role (Web or Worker) is stuck, I have found the following steps very useful:
Always add RDP access to your role, because in many cases when a role is stuck you still have the ability to RDP into your instance and investigate the problem yourself. (In some cases you cannot RDP to your instance because dependency services are not yet ready to let you in.) So if you have RDP enabled, or can RDP to your instance, please try to log in.
Once you have RDP'd into your instance, get the local machine's IP address and open the site directly in a browser. The internal IP address starts with 10.x.x.x, so you can browse based on your endpoint configuration, i.e. http://10.x.x.x:80 or https://10.x.x.x:443.
If you cannot get into the instance over RDP, your best bet is to get the diagnostics info to understand where the problem could be. Diagnostics info is collected on your instance and sent to Azure Storage, configured by you in your WebRole.cs (in a Web Role) or WorkerRole.cs (in a Worker Role) code. Once diagnostics is working in your role, you can inspect the diagnostics data in your configured Azure Blob/Table storage to understand the issue.
If you don't have RDP access and don't have Azure Diagnostics configured (or cannot get any diagnostics data to understand the problem), your best bet is to contact the Azure Support team (24x7 via web or phone) at the link below; they have the ability to access your instance (with your permission) and provide you with a root cause.
https://support.microsoft.com/oas/default.aspx?gprid=14928&st=1&wfxredirect=1&sd=gn
When contacting Azure Support, please provide your Subscription ID, Deployment ID, your Azure Live Account ID and a small description of your problem.
Two things to check:
Make sure your diagnostics connection string is pointing to a real account, and not dev storage
Check the session state provider in web.config. By default, it points to SQL Express LocalDB, which won't exist in Windows Azure.
The cause of this that I run into the most is a missing or invalid assembly. There are several great posts that can help diagnose this, so I won't dive into the matter too deeply myself:
http://social.msdn.microsoft.com/Forums/en-US/windowsazuretroubleshooting/thread/6c739db9-4f6a-4203-bbec-eba70733ec16
http://blogs.msdn.com/b/tom/archive/2011/02/25/help-with-windows-azure-role-stuck-in-initializing-busy-stopped-state.aspx
http://blogs.staykov.net/2009/12/windows-azure-role-stuck-in.html
