Retrieving a timestamp for an S3 upload - javascript

I want to handle synchronization between the browser cache (IndexedDB) and S3, so I rely on timestamps.
The tricky part is that my browser application needs to know the exact "last update" timestamp of the file in S3, to store it alongside the locally cached file (so I can detect differences on either side when the timestamps are not equal).
Currently, my best solution is:
// Upload of file
var upload = new AWS.S3.ManagedUpload({
  params: {
    // some params
  }
});
await upload.promise();

// Call of listObjectsV2
var s3Objects = await s3.listObjectsV2(params).promise();
// get "LastModified" value from listObjectsV2
I really dislike this solution, as it makes an extra call to listObjectsV2, which takes time and is charged by AWS.
Off the top of my head, I expected there to be something in the return values of the upload that I could use. But I can't find anything. What am I missing?

Looking at the documentation for the AWS SDK for JavaScript, I don't think you're missing anything at all: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3/ManagedUpload.html#promise-property
It simply does not return any date/time field after a successful upload.
(I've been searching for something like this myself, only for .NET. In the end I had to start sending metadata requests after uploading.)
Perhaps listening to S3 events could be an alternative: https://aws.amazon.com/blogs/aws/s3-event-notification/
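If you do end up making a metadata request from JavaScript, a HeadObject call is one option; it is still an extra (billed) request, but it returns LastModified directly without listing the bucket. A rough sketch with the AWS SDK for JavaScript v2, assuming the same Bucket/Key used in the upload params (names here are illustrative):

// After the upload completes, fetch the object's metadata to read LastModified.
var head = await s3.headObject({
  Bucket: 'my-bucket',     // illustrative, use the bucket from your upload params
  Key: 'path/to/my-file'   // illustrative, use the key from your upload params
}).promise();

var lastModified = head.LastModified; // a Date object
console.log('S3 last modified:', lastModified.toISOString());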

Related

Fastest redirects Javascript

I am creating a link-shortening app. When someone enters a long URL, it returns a short URL. When a user clicks the short link, it looks up the long URL in the DB and redirects to it.
In the meantime I want to record the click count and the clicking user's OS.
This is the code I am currently using:
app.get('/:shortUrl', async (req, res) => {
  const shortUrl = await ShortUrl.findOne({short: req.params.shortUrl})
  if (shortUrl == null) return res.sendStatus(404)
  res.redirect(shortUrl.full)
})
findOne looks up the long URL in the database using the short ID. I'm using MongoDB here.
My questions are :
Are there multiple redirect methods in JS?
Will this method work under high load?
Any other methods I can use to achieve the same result?
What other factors affect redirect time?
What is 'No Redirection Tracking'?
This is a really long question; thanks to those who invest their time in it.
Your code is OK; the only limitations are where you run it and MongoDB.
I have built analytics trackers handling billions of rows per day.
I suggest you run your Node code on AWS Elastic Beanstalk. It has low latency and scales with your needs.
You also need to put Redis between your request and MongoDB, so that you only hit MongoDB when the data is not yet in Redis. MongoDB has tighter read limits than a plain Redis instance.
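A minimal sketch of that read-through cache, assuming the node-redis v4 client alongside the existing ShortUrl model (the key prefix and TTL are illustrative):

const { createClient } = require('redis');
const redisClient = createClient(); // assumes a local Redis instance
redisClient.connect();              // node-redis v4 needs an explicit connect (await it before serving traffic in real code)

app.get('/:shortUrl', async (req, res) => {
  const key = `short:${req.params.shortUrl}`;

  // 1. Try the cache first.
  const cached = await redisClient.get(key);
  if (cached) return res.redirect(cached);

  // 2. Fall back to MongoDB only on a cache miss.
  const shortUrl = await ShortUrl.findOne({ short: req.params.shortUrl });
  if (shortUrl == null) return res.sendStatus(404);

  // 3. Populate the cache with a TTL so future hits skip MongoDB.
  await redisClient.set(key, shortUrl.full, { EX: 60 * 60 });
  res.redirect(shortUrl.full);
});

A cache hit skips MongoDB entirely; only misses pay the findOne cost, and the TTL bounds staleness if a short link is ever re-pointed.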
Are there multiple redirect methods in JS?
First off, there are no redirect methods in JavaScript itself. res.redirect() is a feature of the Express HTTP framework that runs in Node.js. It is the only redirect method built into Express, though all a redirect response consists of is a 3xx (often 302) HTTP response status and a Location header set to the redirect target. You can code that manually just as well as you can use res.redirect() in Express.
You can look at the res.redirect() code in Express here.
The main things it does are set the location header with this:
this.location(address)
And set the http status (which defaults to 302) with this:
this.statusCode = status;
The rest of the code deals with handling variable arguments, supporting an older design of the API, and sending a body in either plain text or HTML (neither of which is required).
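To illustrate that last point, here is a rough sketch of the same handler doing the redirect by hand instead of calling res.redirect() (reusing the ShortUrl lookup from the question):

app.get('/:shortUrl', async (req, res) => {
  const shortUrl = await ShortUrl.findOne({ short: req.params.shortUrl });
  if (shortUrl == null) return res.sendStatus(404);

  // Manual equivalent of res.redirect(): status code + Location header, no body.
  res.status(302).location(shortUrl.full).end();
});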
Will this method work under high load?
res.redirect() works just fine at a high load. The bottleneck in your code is probably this line of code:
const shortUrl = await ShortUrl.findOne({short: req.params.shortUrl})
How far that scales depends on a whole bunch of things about your database, configuration, hardware, setup, etc. You should probably just test how many requests/sec of this kind your current database can handle.
Any other methods I can use to achieve the same result?
Sure there are. But you will have to use some data store to look up the shortUrl and find the long URL, and you will have to create a 302 response somehow. As said earlier, the scale you can achieve will depend entirely on your database.
What other factors affect redirect time?
This is pretty much covered above (hint: it's all about the database).
What is 'No Redirection Tracking'?
You can read about it here on MDN.

How to send multiple files to server using WebSockets?

I'm trying to send multiple files from the client to the NodeJS server using WebSockets.
To send one file, I currently do the following:
// Client
let upload = document.getElementById('upload')
button.onclick = async function() {
  let file = upload.files[0];
  let byteFile = await getAsByteArray(file);
  socket.send(byteFile);
}

async function getAsByteArray(file) {
  return new Uint8Array(await readFile(file))
}

function readFile(file) {
  return new Promise((resolve, reject) => {
    let reader = new FileReader()
    reader.addEventListener("loadend", e => resolve(e.target.result))
    reader.addEventListener("error", reject)
    reader.readAsArrayBuffer(file)
  });
}

// Server
ws.on('message', function incoming(message) {
  // This returns a buffer, which is what I'm looking for when working with a single file.
  console.log(message);
  return;
});
This works great for one file. I'm able to use the buffer and process the file as I would like. To send two files, my thought was to convert each file to a Uint8Array (as I did for the single file) and push to an array like so:
// Client
let filesArray = [];
let files = upload.files; // Grab uploaded Manifests
for (let file of files) {
  let byteFile = await getAsByteArray(file);
  filesArray.push(byteFile);
}
socket.send(filesArray);
In the same way as with one file, the server receives a buffer for the array that was sent; however, I'm not sure how to work with it. I need each file to be its own buffer in order to work with them. Am I taking the wrong approach here? Or am I just missing some conversion that would let me work with each file?
This works great for one file.
Not really. Unless it is only meant for some very simplistic setup, probably on a network isolated from the internet.
You literally send a sequence of bytes to the server, which reads it, and then what is it going to do with it? Save it to disk? Without validating? But how can it validate a random sequence of bytes when it has no hint about what it is? Secondly, where will it save it? Under what name? You didn't send any metadata such as the filename. Is it supposed to generate a random name for it? How will the user know that this is his file? Heck, as it is, you don't even know who sent that file (no authentication). Finally, what about security? Can I open a WebSocket connection to your server and spam it with arbitrary sequences of data, effectively killing it? You probably need some authentication, but even with it, can any user spam such uploads? Maybe you additionally need tokens with timeouts (but then you have to think about how your server will issue such tokens).
I need each file to be its own buffer in order to work with them.
No, you don't. The bare minimum you need is (1) the ability to send files with metadata from the client and (2) the ability to read files with metadata on the server side. You most likely need some authentication mechanism as well. Typically you would use classic HTTP for all of that, which I strongly encourage you to do.
If you want to stick with WebSockets, then you have to implement those already well-established mechanisms yourself. Here's how I would do that:
(1) Define a custom protocol on top of WebSocket. Each frame should have a structure, for example: the first two bytes indicate the "size of the command", and the next X bytes (the previous 2 bytes interpreted as a 16-bit int) hold the command as a string. On the server side you read that command, map it to a handler, and run the appropriate action. The data that the command should process is the remaining bytes of the frame.
(2) Set up authentication. Not in the scope of this answer, just noting that it is crucial. I'm putting this after (1) because you can reuse the protocol for it.
(3) Whenever you want to upload a file, send a "SEND" command to the server. In the same frame, after the "SEND" command, put the metadata (file name, size, content type, etc.); you can encode it as length-prefixed JSON. After that, put the content of the file in the buffer.
This solution should obviously be refined with the (mentioned earlier) tokens. For proper responsiveness and concurrency, you should probably split large files across separate WebSocket frames (which complicates the design a lot).
Anyway, as you can see, the topic is far from trivial and requires a lot of experience. It is basically reimplementing what HTTP already does. Again: I strongly suggest you use plain old HTTP.
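As a rough sketch of that framing idea on the client side (the layout, field names and "SEND" command are illustrative, not an established protocol):

// Build one frame: [2-byte metadata length][metadata JSON][file bytes]
async function buildFrame(file) {
  const meta = new TextEncoder().encode(
    JSON.stringify({ cmd: 'SEND', name: file.name, size: file.size, type: file.type })
  );
  const body = new Uint8Array(await file.arrayBuffer());

  const frame = new Uint8Array(2 + meta.length + body.length);
  new DataView(frame.buffer).setUint16(0, meta.length); // metadata length prefix
  frame.set(meta, 2);                                   // metadata JSON
  frame.set(body, 2 + meta.length);                     // file content
  return frame;
}

// Usage: socket.send(await buildFrame(file));
// The server reads the first two bytes, parses that many bytes as JSON metadata,
// and treats the rest of the frame as the file content.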
Send each buffer in a separate message:
button.onclick = async function() {
  // forEach can't await; a plain loop sends the files one by one.
  for (const file of upload.files) {
    socket.send(await getAsByteArray(file));
  }
}

Setting limits on file uploads via Firebase auth and storage without server in the middle?

I'm learning about Firebase auth and storage in a web app. My idea is to ask users to log in via Firebase and then upload an image.
I can see that this is possible from Firebase auth and storage. However, I would like to put limits on the file count and file-size they can upload.
Is it possible to control uploads within the Firebase console (or somewhere else)? After reviewing the JavaScript examples, I see how I can put files in, and I can imagine writing code that queries Firebase for a user's upload count and then limits on the client side, but of course this is a completely insecure method.
If I hosted this as a single page app on, say, GitHub pages, I am wondering if I could set these limits without involving a server. Or, do I need to proxy my uploads through a server to make sure I never allow users to upload more than I intend them to?
You can limit what a user can upload through Firebase Storage's security rules.
For example this (from the linked docs) is a way to limit the size of uploaded files:
service firebase.storage {
  match /b/<your-firebase-storage-bucket>/o {
    match /images/{imageId} {
      // Only allow uploads of any image file that's less than 5MB
      allow write: if request.resource.size < 5 * 1024 * 1024
                   && request.resource.contentType.matches('image/.*');
    }
  }
}
But there is currently no way in these rules to limit the number of files a user can upload.
One approach that comes to mind would be to use fixed file names for that. For example, if you limit the allowed file names to be numbered 1..5, the user can only ever have five files in storage:
match /public/{userId}/{imageId} {
  allow write: if imageId.matches("[1-5]\.txt");
}
If you need per-user storage validation, the solution is a little more involved, but it can be done.
P.S.: You will need to generate a Firebase token with Cloud Functions, but the server won't be in the middle for the upload itself...
https://medium.com/@felipepastoree/per-user-storage-limit-validation-with-firebase-19ab3341492d
One solution may be to use the Admin SDK to change the Storage rules based on a Firestore document holding the upload count per day.
Say you have a Firestore document userUploads/uid with the fields uploadedFiles: 0 and lastUploadedOn.
Now, once the user uploads a file to Firebase Storage (assuming it is within limits and there are no errors), you can trigger a Cloud Function which reads the userUploads/uid document and checks whether the lastUploadedOn field is from an earlier date than the currently uploaded file's date. If so, reset uploadedFiles to 1 and set lastUploadedOn to the upload datetime; otherwise, increment uploadedFiles and set lastUploadedOn to the current datetime. Once uploadedFiles reaches 10 (your limit), you can change the Storage rules using the Admin SDK (see the example here), then reset the count to 0 in the userUploads/uid document.
However, there's a little caveat. The change in rules can take some time, and there should be no legitimate async work in flight that relies on that rule. From the Admin SDK docs:
Firebase security rules take a period of several minutes to fully deploy. When using the Admin SDK to deploy rules, make sure to avoid race conditions in which your app immediately relies on rules whose deployment is not yet complete
I haven't tried this myself, but it looks like it would work. On second thought, changing the rules back to allow writes could be complicated. If the user uploads on the next day (after the rules have been changed), the upload error handler could trigger another Cloud Function to check whether it is a legitimate request, change the rules back to normal, and retry the upload after some time, but that would be a very bad user experience. On the other hand, if you use a scheduled Cloud Function to check every userUploads/uid document daily and reset the values, it could be costly (~$18 per million users per month at $0.06/100K reads), it gets complicated if users are in different time zones, and it is irrelevant for most users unless they upload that frequently. Furthermore, rules have limits:
Rules must be smaller than 64 KiB of UTF-8 encoded text when serialized
A project can have at most 2500 total deployed rulesets. Once this limit is reached, you must delete some old rulesets before creating new ones.
So per-user rules for a large user base can easily hit this limit (on top of your other rules).
Perhaps the optimal solution is to use auth claims. Start with a rule that denies writes if the user has a particular auth claim (say canUpload: false). Then, in a Cloud Function triggered on upload, attach this claim when the user reaches the limit. This takes effect in real time, blocking the user immediately, as opposed to the deployment delay of Admin SDK rule changes.
To remove the auth claim:
1. Check, in another Cloud Function run from the upload error handler, whether lastUploadedOn has changed, and if so remove the claim.
2. Check, in a separate Cloud Function called before the upload, whether the user has the auth claim and lastUploadedOn is from an earlier date; if so, remove the claim.
3. Additionally, the claim can be checked and removed during login if lastUploadedOn is earlier than today, but this is less efficient than option 2 since it causes unnecessary Firestore reads while the user isn't even uploading anything.
With option 2, if the client tries to skip the call and still has the auth claim, they can never upload because the security rule blocks them. If they have no auth claim, they go through the normal process.
Note: Changing auth claims needs to be pushed to the client. See this doc.
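A hedged sketch of that claim-based flow, assuming a Cloud Functions (v1) Storage trigger and the Firestore document described above; the metadata field carrying the uid, the limit of 10, and the function name are illustrative:

const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.onUpload = functions.storage.object().onFinalize(async (object) => {
  // Assumes the client attaches its uid as custom metadata on the upload.
  const uid = object.metadata && object.metadata.uid;
  if (!uid) return;

  const ref = admin.firestore().doc(`userUploads/${uid}`);
  const count = await admin.firestore().runTransaction(async (tx) => {
    const snap = await tx.get(ref);
    const next = (snap.exists ? snap.data().uploadedFiles : 0) + 1;
    tx.set(ref, {
      uploadedFiles: next,
      lastUploadedOn: admin.firestore.FieldValue.serverTimestamp(),
    }, { merge: true });
    return next;
  });

  if (count >= 10) {
    // Takes effect as soon as the client refreshes its ID token; no rules redeploy.
    await admin.auth().setCustomUserClaims(uid, { canUpload: false });
  }
});

// The matching Storage rule (sketch) would then check the claim, e.g.:
// allow write: if request.auth != null && request.auth.token.canUpload != false;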
Following the filenames hack Frank gave us, I think we can improve on it to make it more flexible.
For example, in my case I don't want to put a hard limit on user uploads like "you can upload up to 50 files, ever", but rather "you're allowed to upload up to 20 files per day".
I just had this idea and will work on the implementation soon enough, but here it goes:
Following the same logic, we can allow only filenames like 1-07252022, 2-07252022, etc.
And since Firebase rules give us some string and timestamp methods, I think we can achieve this per-day upload limit using Storage rules alone, without user custom claims or any Cloud Function.
Although in my case, I only want to allow uploads from paying customers, so in that case I would need also a custom claim on the user's token.
I'll edit this answer when I've worked on the code snippet, but for anyone struggling, here you have an idea.
One way to limit the number of files (or total storage) a user can upload is to use signed URLs. You would need a server (Cloud Functions) to generate the signed URLs, but you can then upload large files directly to Cloud Storage without streaming them through the server. The flow would be:
1. Send the file names and sizes to your server in the request body.
2. Generate a signed URL for each file and set Content-Length equal to the size of the file, so the user can only upload a file of that size with the URL.
3. Update the user's storage usage in a database like Firestore.
4. Upload the files to Cloud Storage using the signed URLs received from the server.
You just need to ensure that the user has enough storage available by checking their Firestore document before generating the signed URLs. If not, you can return an error like this:
// storageLimit
if (storageUsed + size > storageLimit) {
  throw new functions.https.HttpsError(
    "failed-precondition",
    "Not enough storage available"
  );
}
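A hedged sketch of the signed-URL generation itself, assuming a callable Cloud Function and the @google-cloud/storage client; the function name, bucket, and path scheme are illustrative, and the x-goog-content-length-range extension header is one way to cap the accepted upload size with a V4 signed URL:

const functions = require('firebase-functions');
const { Storage } = require('@google-cloud/storage');
const storage = new Storage();

exports.getUploadUrl = functions.https.onCall(async (data, context) => {
  if (!context.auth) {
    throw new functions.https.HttpsError('unauthenticated', 'Sign in first');
  }
  const { name, size, contentType } = data;

  // ...check the user's Firestore storage usage here, as in the snippet above...

  const [url] = await storage
    .bucket('my-bucket') // illustrative bucket name
    .file(`uploads/${context.auth.uid}/${name}`)
    .getSignedUrl({
      version: 'v4',
      action: 'write',
      expires: Date.now() + 15 * 60 * 1000, // valid for 15 minutes
      contentType,
      extensionHeaders: { 'x-goog-content-length-range': `0,${size}` },
    });

  return { url };
});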
Check out "How to set maximum storage size limit per user in Google Cloud Storage?" for a detailed explanation and code snippets.

Google OAuth WildCard Domains

I am using Google auth but keep getting an origin mismatch. The project I am working on has subdomains that are generated by the user. So, for example, there can be:
john.example.com
henry.example.com
larry.example.com
In my app settings, one of my origins is http://*.example.com, but I get an origin mismatch. Is there a way to solve this? By the way, my code looks like this:
gapi.auth.authorize({
  client_id : 'xxxxx.apps.googleusercontent.com',
  scope : ['https://www.googleapis.com/auth/plus.me',
           'https://www.googleapis.com/auth/userinfo.email',
           'https://www.googleapis.com/auth/userinfo.profile'],
  state : 'http://henry.example.com',
  immediate : false
}, function(result) {
  if (result != null) {
    gapi.client.load('oauth2', 'v2', function() {
      console.log(gapi.client);
      gapi.client.oauth2.userinfo.get().execute(function(resp) {
        console.log(resp);
      });
    });
  }
});
Hooray for useful yet unnecessary workarounds (thanks for complicating yourself into a corner, Google)...
I was using Google Drive via the JavaScript API to open the file picker, retrieve the file info/URL, and then download it with curl on my server. Once I finally realized that all my wildcard domains would have to be registered, I about had a stroke.
What I do now is the following (this is my use case; adapt it to yours as needed):
On the page you are on, create an onclick event that opens a new window on a specific domain (https://googledrive.example.com/oauth/index.php?unique_token={some unique token}).
In the new popup I do all the Google Drive authentication, have a button that opens the file picker, and then retrieve at least the metadata I need from the file. I then store the token (primary key), access_token, downloadurl and filename in my database (MySQL).
Back on step one's page, I created a setTimeout() loop that runs an AJAX call every second with that same unique_token to check when it has been entered in the database. Once it finds it, I kill the loop, retrieve the contents and do with them as I will (in this case I upload them through a separate upload script that uses curl to fetch the file).
This is obviously not the best way to handle it, but it's better than entering each and every subdomain into Google's Cloud Console. I bet you could probably do this with Google's server-side OAuth libraries, but my use case was a little complicated and I was cranky from the past 4 days I'd spent on a silly little integration with Google.
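A rough sketch of that polling step, using setInterval and fetch as stand-ins for the setTimeout()/AJAX loop described above (the endpoint and response fields are illustrative):

function waitForPickedFile(uniqueToken, onReady) {
  const timer = setInterval(async () => {
    // Ask the server whether the popup has stored a row for this token yet.
    const res = await fetch(`/oauth/check.php?unique_token=${encodeURIComponent(uniqueToken)}`);
    const data = await res.json();
    if (data && data.downloadurl) {
      clearInterval(timer); // kill the loop once the row shows up
      onReady(data);        // e.g. kick off the server-side curl upload
    }
  }, 1000);
}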
Wildcard origins are not supported, same for redirect URIs.
The fact that you can register a wildcard origin is a bug.
You can use the state parameter, but be very careful with it: make sure you don't create an open redirector (an endpoint that can redirect to any arbitrary URL).
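As an illustration of that warning, here is a hedged sketch of validating the state value against your own domain before redirecting (the host check and helper name are made up for this example):

const ALLOWED_SUBDOMAIN = /^[a-z0-9-]+$/i;

function safeRedirectTarget(state) {
  let url;
  try {
    url = new URL(state);
  } catch (e) {
    return null; // state was not a valid URL
  }
  // Only accept subdomain.example.com, never arbitrary hosts.
  const [sub, ...rest] = url.hostname.split('.');
  const isOurDomain = rest.join('.') === 'example.com' && ALLOWED_SUBDOMAIN.test(sub);
  return isOurDomain ? url.href : null;
}

// Usage on the OAuth callback page:
// const target = safeRedirectTarget(params.get('state'));
// if (target) window.location.assign(target); else showError();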

Accessing IndexedDB from multiple javascript threads

Overview:
I am trying to avoid a race condition with accessing an IndexedDB from both a webpage and a web-worker.
Setup:
A webpage that saves items to the local IndexedDB as the user works with the site. Whenever the user saves data to the local DB, the record is marked as "Unsent".
A web-worker background thread that pulls data from the IndexedDB, sends it to the server and, once the server has received it, marks the data in the IndexedDB as "Sent".
Problem:
Since access to the IndexedDB is asynchronous, I cannot guarantee that the user won't update a record at the same time the web-worker is sending it to the server. The timeline is shown below:
Web-worker gets data from DB and sends it to the server
While the transfer is happening, the user updates the data saving it to the DB.
The web-worker gets the response from the server and then updates the DB to "Sent"
There is now data in the DB that hasn't been sent to the server but is marked as "Sent"
Failed Solution:
After getting the response from the server, I can recheck the row to see if anything has changed. However, I am still left with a small window in which data can be written to the DB and will never be sent to the server.
Example:
After server says data is saved, then:
IndexedDB.HasDataChanged(function(changed) {
  // Since this is async, this changed boolean could be lying.
  // The data might have been updated after I checked and before I was called.
  if (!changed) {
    IndexedDB.UpdateToSent();
  }
});
Other notes:
There is a sync API according to the W3C spec, but no one has implemented it yet, so it cannot be used (http://www.w3.org/TR/IndexedDB/#sync-database). The sync API was designed to be used by web workers, to avoid this exact situation, I would assume.
Any thoughts on this would be greatly appreciated. I have been working on it for about a week and haven't been able to come up with anything that works.
I think I found a workaround for this for now. Not really as clean as I would like, but it seems to be thread-safe.
I start by storing the datetime in a LastEdit field whenever I update the data.
From the web-worker, I post a message to the browser:
self.postMessage('UpdateDataSent#' + data.ID + '#' + data.LastEdit);
Then, in the browser, I update my Sent flag, as long as the last edit date hasn't changed:
// Get the data from the DB in a transaction
if (data.LastEdit == lastEdit)
{
  data.Sent = true;
  var saveStore = trans.objectStore("Data");
  var saveRequest = saveStore.put(data);
  console.log('Data updated to Sent');
}
Since this is all done in a transaction on the browser side, it seems to work fine. Once browsers support the Sync API I can throw it all away anyway.
Can you use a transaction?
https://developer.mozilla.org/en/IndexedDB/IDBTransaction
Old thread, but the use of a transaction would solve the Failed Solution approach. That is, the transaction only needs to span the check that the data in IndexedDB hasn't changed after the send, and the marking of it as "Sent" if there was no change. If there was a change, the transaction ends without writing.
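A minimal sketch of that idea, assuming a "Data" object store with LastEdit and Sent fields as in the workaround above (names are illustrative):

function markAsSentIfUnchanged(db, id, lastEditWhenSent) {
  const tx = db.transaction("Data", "readwrite");
  const store = tx.objectStore("Data");

  store.get(id).onsuccess = (event) => {
    const record = event.target.result;
    // Only flag as sent if the row still matches what was actually uploaded.
    if (record && record.LastEdit === lastEditWhenSent) {
      record.Sent = true;
      store.put(record);
    }
  };

  // Other writes to this store queue behind the readwrite transaction,
  // so the check and the update cannot be interleaved with a user edit.
  tx.oncomplete = () => console.log('Sent flag updated (or skipped) atomically');
}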
