I'm creating a REST backend using ExpressJS. A part of the backend allows users to upload file assets, which should only exist for 10 minutes.
Is it safe to use setTimeout to delete the file after 10 minutes, or are there better ways of doing this in NodeJS? How can I ensure the file is deleted? Here is my current solution:
router.post('/upload', fileUpload.single('asset'), (req, res) => {
// Do something with the file
res.status(201).end();
setTimeout(() => {
// Delete the file
}, 600000);
});
Your approach is not going to scale, as it will potentially create a huge number of pending callbacks. A better way is to store information about each file in an associative array (object), such as:
{ "path":"date_uploaded" }
Then check every, say, XX seconds with setInterval() whether anything needs deleting, i.e. go through the whole structure and see if anything was created more than 10 minutes ago. For each such element, delete the file and remove it from the array.
var pending = {};
router.post('/upload', fileUpload.single('asset'), (req, res) => {
// Do something with the file
pending[file_name] = (new Date()).getTime();
res.status(201).end();
});
setInterval(
function(){
// check each element of pending and see if it needs to be deleted
}, 30000);
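For completeness, here is a rough sketch of what a filled-in version of that interval could look like, assuming the keys of pending are paths that can be handed straight to fs.unlink (that detail, and the 10-minute constant, are my assumptions rather than part of the original snippet):

const fs = require('fs');

setInterval(function () {
    const now = Date.now();
    Object.keys(pending).forEach(function (filePath) {
        // anything uploaded more than 10 minutes ago gets removed
        if (now - pending[filePath] > 600000) {
            fs.unlink(filePath, function (err) {
                if (err) console.error('failed to delete', filePath, err);
            });
            delete pending[filePath];
        }
    });
}, 30000);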
Related
I wanted to swap a user's profile picture. For this, I have to check the database to see if a picture has already been saved and, if so, delete it. Then the new one should be saved and entered into the database.
Here is a simplified (pseudo) code of that:
async function changePic(user, file) {
// remove old pic
if (await database.hasPic(user)) {
let oldPath = await database.getPicOfUser(user);
filesystem.remove(oldPath);
}
// save new pic
let path = "some/new/generated/path.png";
file = await Image.modify(file);
await Promise.all([
filesystem.save(path, file),
database.saveThatUserHasNewPic(user, path)
]);
return "I'm done!";
}
I ran into the following problem with it:
If the user calls the API twice in a short time, serious errors occur. The database queries and the functions in between are asynchronous, so the changes from the first API call haven't been applied yet when the second call checks for a profile pic to delete. I'm then left with a filesystem.remove request for a file that no longer exists, and an orphaned image left over in the filesystem.
I would like to safely handle that situation by synchronizing this critical section of code. I don't want to reject requests only because the server hasn't finished the previous one and I also want to synchronize it for each user, so users aren't bothered by the actions of other users.
Is there a clean way to achieve this in JavaScript? Some sort of monitor like you know it from Java would be nice.
You could use a library like p-limit to control your concurrency. Use a map to track the active/pending requests for each user. Use their ID (which I assume exists) as the key and the limit instance as the value:
const pLimit = require('p-limit');
const limits = new Map();
function changePic(user, file) {
async function impl(user, file) {
// your implementation from above
}
const { id } = user // or similar to distinguish them
if (!limits.has(id)) {
limits.set(id, pLimit(1)); // only one active request per user
}
const limit = limits.get(id);
return limit(impl, user, file); // schedule impl for execution
}
// TODO clean up limits to prevent memory leak?
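One way to handle that TODO (my own sketch, not something the p-limit docs prescribe) is to count in-flight calls per user yourself and drop the map entry once the count reaches zero:

const pLimit = require('p-limit');
const limits = new Map(); // user id -> { limit, inFlight }

function changePic(user, file) {
    async function impl(user, file) {
        // your implementation from above
    }
    const { id } = user;
    let entry = limits.get(id);
    if (!entry) {
        entry = { limit: pLimit(1), inFlight: 0 };
        limits.set(id, entry);
    }
    entry.inFlight++;
    const cleanup = () => {
        entry.inFlight--;
        if (entry.inFlight === 0) limits.delete(id); // nothing left queued for this user
    };
    const result = entry.limit(impl, user, file);
    result.then(cleanup, cleanup); // don't leak entries on success or failure
    return result;
}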
I am creating a route where a data attribute called "active" is being set to true, but after an hour is set to false. I am wondering if it is possible or bad practice to call a settimeout function in the express callback. Such as;
app.get("/test", (req, res) => {
//Some code
setTimeout(func, 3600 * 1000); // one hour
});
Is this bad for scaling? If this route was hit many times, would it be very expensive? Thanks in advance.
If you store those values in a database, then you should not create a timer per entry in node to reset the value, especially if it is a long-lasting timer. Session-like data that has to outlive a few seconds should in general not be kept in the memory of the node process.
The more frequently your site is visited, the more likely it is that at least one timer is running at any given time. As soon as that is the case, you cannot restart the application without either losing that timer, or waiting until all timers have finished while not accepting any new ones.
And you cannot switch to clustered mode, because if one user calls that route twice, the requests might end up in two different processes, and neither process would know about the timeouts the other one has set.
So a better idea is to add a timestamp into the database, and one cleanup timer responsible for all entries.
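To illustrate what that could look like (the Item model, the field names and the Mongoose-style API below are my assumptions, not something from the question):

// one sweep timer for all entries: deactivate everything older than an hour
setInterval(function () {
    const cutoff = new Date(Date.now() - 60 * 60 * 1000);
    Item.updateMany(
        { active: true, activatedAt: { $lte: cutoff } },
        { $set: { active: false } }
    ).catch(function (err) {
        console.error('cleanup sweep failed', err);
    });
}, 60 * 1000);

The route handler then only sets active: true and activatedAt: new Date() on the entry; no per-request timers are needed, and a process restart cannot lose anything.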
It seems you only need to set 1 timer. This assumes the 'hour' starts at the first request.
let timer = null
let data = true
app.get("/test", (req, res) => {
//Some code
if (!timer) {
timer = setTimeout(() => { data = false }, 3600 * 1000); // one hour
}
});
Instead, for multiple users, you can avoid setting multiple timers by putting a timestamp in a hash and polling it on each request or from a separate interval timer.
// init
let timers = {}
// in request
if (!timers[user]) {
timers[user] = new Date().getTime() / 1000 + 3600
}
else if (timers[user] <= new Date().getTime() / 1000)
{
// update db, etc
}
// or poll for expirations in separate single timer routine
let now = new Date().getTime() / 1000
Object.keys(timers).forEach(user => {
if (timers[user] <= now) {
// update db, etc
}
})
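For completeness, a sketch (my addition) of running that polling routine from one interval timer and forgetting entries once they have expired:

setInterval(function () {
    let now = new Date().getTime() / 1000;
    Object.keys(timers).forEach(user => {
        if (timers[user] <= now) {
            // update db, etc
            delete timers[user]; // entry handled, forget it
        }
    });
}, 60 * 1000);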
The problem:
I want to keep track of my uploaded files by writing the file information of each uploaded file in a multi-upload to my database. However, when I upload 2 files it usually creates 3 entries in the database, and when I upload 6 files it creates a lot more than 6 entries.
My db function:
function saveAssetInDatabase(project, fileInformation) {
return new Promise((reject, resolve) => {
let uploaded_file = {}
uploaded_file = fileInformation
uploaded_file.file_type = 'asset'
uploaded_file.display_name = fileInformation.originalname
project.uploaded_files.push(uploaded_file)
project.save()
})
}
The simplified code which calls the function:
for(var i=0; i<req.files["sourceStrings"].length; i++) {
// Unknown file format, let's save it as asset
saveAssetInDatabase(project, fileInformation).then(result => {
return res.status(200).send()
}).catch(err => {
logger.error(err)
return res.status(500).send()
})
}
I guess that there is something wrong with my db function as it leads to duplicate file entries. What am I doing wrong here? One file should get one entry.
If I read the specs for model.save on the mongoose website correctly, the problem with your save is rather that you are always reusing the original project, and not the newly saved project that contains the latest state.
So what you are essentially doing is:
project.files.push(file1);
// file1 is marked as new
project.save();
project.files.push(file2);
// file1 & file2 are marked as new
// (the project doesn't know file1 has been saved already)
// ...
Now, that actually brings an advantage, since you are currently doing one save per file, while you could save all files at once ;)
I guess the easiest way would be to move the project.save call outside of your for loop and change your first function to
function saveAssetInDatabase(project, fileInformation) {
let uploaded_file = {};
uploaded_file = fileInformation;
uploaded_file.file_type = 'asset';
uploaded_file.display_name = fileInformation.originalname;
project.uploaded_files.push(uploaded_file);
}
with the for loop changed to
function saveSourceString(project, req, res) {
for(var i=0; i<req.files["sourceStrings"].length; i++) {
// Unknown file format, let's save it as asset
saveAssetInDatabase(project, req.files["sourceStrings"][i]);
}
// save after all files were added
return project.save().then(result => {
return res.status(200).send()
}).catch(err => {
logger.error(err)
return res.status(500).send()
});
}
Note that project.save() returns a promise that resolves with the newly saved project. If you wish to manipulate this object at a later time, make sure you use that saved document, and not, as you have done until now, the unsaved model.
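For illustration, a minimal sketch (my addition) of picking the saved document up from that promise:

project.save().then(savedProject => {
    // work with savedProject from here on, not the original `project` reference
    console.log('stored %d files', savedProject.uploaded_files.length);
});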
Problem
Each iteration of your for loop creates a promise and sends the project object as it exists at that moment, which is not correct: every time one of those promises runs, the current state of the project object is written to the DB.
For example, say you have 3 asset details.
On the first iteration, the first asset's data is pushed onto the project object and the project is saved, so that save contains the first asset.
On the second iteration, the second asset's data is pushed onto the project object, which still contains the first asset, and the project is saved again, so that save contains the first and second assets.
On the third iteration, the third asset's data is pushed onto the project object, which still contains the first and second assets, and the project is saved again, so that save contains all three assets.
So you end up storing the same data in your DB several times.
Solution
Use Promise.all, and resolve each asset's promise only after the project data has been stored in your DB.
// your DB function
function saveAssetInDatabase(project, fileInformation) {
return new Promise((resolve, reject) => {
let uploaded_file = {}
uploaded_file = fileInformation
uploaded_file.file_type = 'asset'
uploaded_file.display_name = fileInformation.originalname
project.uploaded_files.push(uploaded_file)
project.save().then(() => resolve()).catch(reject)
})
}
// calls function
let promiseArray = [];
for(var i=0; i<req.files["sourceStrings"].length; i++) {
promiseArray.push(saveAssetInDatabase(project, req.files["sourceStrings"][i]));
}
Promise.all(promiseArray).then(result => {
return res.status(200).send();
}).catch(err => {
logger.error(err)
return res.status(500).send()
})
I'm looking to implement a solution where I can query the Mongoose Database on a regular interval and then store the results to serve to my clients.
I'm assuming this will reduce my response time when my users pull the collection.
I attempted to implement this plan by creating an empty global object and then writing a function that queries the db and stores the results in that global object. At the end of the function I set a timeout for 60 seconds and then run the function again. I call this function the first time the server controller gets called when the app first runs.
I then set my clients up so that when they request the collection, the server first looks to see if the global object exists and, if so, returns it as the response. I figured this would cut my 7-10 second queries down to < 1 sec.
In my novice thinking I assumed that, Node.js being 'single-threaded', something like this could work quite well - but it just seemed to eat up all my RAM and cause fatal errors.
Am I on the right track with my thinking or is it better to query the db every time people pull the collection?
Here is the code in question:
var allLeads = {};
var getAllLeads = function(){
allLeads = {};
console.log('Getting All Leads...');
Lead.find().sort('-lastCalled').exec(function(err, leads) {
if (err) {
console.log('Error getting leads');
} else {
allLeads = leads;
}
});
setTimeout(function(){
getAllLeads();
}, 60000);
};
getAllLeads();
Thanks in advance for your assistance.
I have a set of records that I would like to update sequentially in perpetuity. Basically:
Get least recently updated record
Update record
Set date of record to now (aka. send it to the back of the list)
Back to step 1
Here is what I was thinking using Firebase:
// update record function
var updateRecord = function() {
// get least recently updated record
firebaseOOO.limit(1).once('value', function(snapshot) {
key = _.keys(snapshot.val())[0];
/*
* do 1-5 seconds of non-Firebase processing here
*/
snapshot.ref().child(key).transaction(
// update record
function(data) {
return updatedData;
},
// update priority after commit (would like to do it in transaction)
function(error, committed, snap2) {
snap2.ref().setPriority(snap2.dateUpdated);
}
);
});
};
// listen whenever priority changes (aka. new item needs processing)
firebaseOOO.on('child_moved', function(snapshot) {
updateRecord();
});
// kick off the whole thing
updateRecord();
Is this a reasonable thing to do?
In general, this type of daemon is precisely what was envisioned for use with the Firebase NodeJS client. So, the approach looks good.
However, in the on() call it looks like you're dropping the snapshot that's passed in on the floor. This might be specific to your application, but it would be more efficient to consume that snapshot rather than re-fetching the record with the once() inside updateRecord().
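A rough sketch of consuming that snapshot (my own illustration, reusing the legacy Firebase API and the _ helper from the question; processRecord is a hypothetical helper standing in for the "non-Firebase processing" plus the transaction and setPriority logic above):

var updateRecord = function(snapshot) {
    if (!snapshot) {
        // first run: fetch the least recently updated record once
        firebaseOOO.limit(1).once('value', function(snap) {
            var key = _.keys(snap.val())[0];
            processRecord(snap.ref().child(key));
        });
        return;
    }
    // child_moved hands us the child snapshot directly, so no extra read is needed
    processRecord(snapshot.ref());
};

function processRecord(ref) {
    /*
     * do 1-5 seconds of non-Firebase processing here,
     * then run the same transaction + setPriority as above
     */
}

firebaseOOO.on('child_moved', function(snapshot) {
    updateRecord(snapshot);
});

// kick off the whole thing
updateRecord();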