I have 20k+ rows in a CSV file, and I'm trying to update them based on documents with a matching field in a MongoDB collection that contains 350k docs.
The trick is that I need to perform some logic on the matches and then update the CSV again.
I'm using PapaParse to parse/unparse the CSV file.
Doing something like this works to get all my matches:
const fs = require('fs');
const Papa = require('papaparse');

const file = fs.createReadStream('INFO.csv');
Papa.parse(file, {
  header: true,
  complete: function(row) {
    getMatchesAndSave(row.data.map(el => { return el.fieldToMatchOn }));
  }
});
function getMatchesAndSave(fields) {
  Order.find({fieldToMatchOn: { $in: fields}}, (err, results) => {
    if (err) return console.error(err);
    console.log(results);
  });
}
That gets me matches fast. However, I can't really merge my data back into the CSV because the CSV has a unique key column that Mongo has no idea about.
So all the data really depends on what's in the CSV.
Therefore I thought of doing something like this:
const jsonToCSV = [];
for (let row of csvRows) {
  db.Collection.find({fieldToMatchOn: row.fieldToMatchOn}, (err, result) => {
    //Add extra data to row based on result
    row.foo = result.foo;
    //push to final output
    jsonToCSV.push(row);
  });
}
papa.unparse(jsonToCSV);
//save csv to file
The issue with the above implementation (as terribly inefficient as it may seem) is that the find calls are asynchronous, so nothing has been pushed to jsonToCSV by the time papa.unparse runs.
Any tips? Solving this with $in would be ideal. Is there any way to access the current element in the $in (something like an iterator) so that I could process it?
You can use async/await to iterate over the csvRows array, like this:
const search = async () => {
  // An async callback makes map return an array of promises,
  // so Promise.all is needed to wait for all of them.
  const jsonToCSV = await Promise.all(csvRows.map(async row => {
    try {
      // findOne resolves to a single document (or null);
      // find would give back a list of documents instead.
      const result = await db.Collection.findOne({ fieldToMatchOn: row.fieldToMatchOn });
      if (result) row.foo = result.foo;
    } catch (e) {
      // handle the error here; the row is returned unchanged
    }
    return row;
  }));
  return papa.unparse(jsonToCSV);
};
// call the search function
search();
See https://flaviocopes.com/javascript-async-await-array-map for a better understanding.
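Since the question asks for a way to do this with $in, here is a minimal sketch of that idea: run the single $in query first, index the returned documents by the shared field, and then walk the CSV rows. The helper name mergeMatches and the sample data are made up for illustration; only fieldToMatchOn and foo come from the question.

```javascript
// Hypothetical helper: merges matched documents back into the parsed CSV rows.
// `rows` is the array from Papa.parse; `docs` is the result of one $in query.
function mergeMatches(rows, docs, key = 'fieldToMatchOn') {
  // Index the documents by the shared field so each row lookup is cheap.
  // If several documents share a key, the last one wins.
  const byKey = new Map(docs.map(doc => [doc[key], doc]));
  return rows.map(row => {
    const match = byKey.get(row[key]);
    // Keep every CSV column (including the unique key Mongo does not know
    // about) and add the extra data only when a match exists.
    return match ? { ...row, foo: match.foo } : row;
  });
}

// In-memory data standing in for the CSV rows and the query results.
const rows = [
  { id: '1', fieldToMatchOn: 'a' },
  { id: '2', fieldToMatchOn: 'b' },
];
const docs = [{ fieldToMatchOn: 'a', foo: 42 }];
const merged = mergeMatches(rows, docs);
console.log(merged);
```

In the real code, docs would presumably come from something like `await Order.find({ fieldToMatchOn: { $in: fields } })`, and merged would be passed to Papa.unparse. This keeps the database round trips at one instead of one per row.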
Related
I'm new to the MEAN stack and also to JS. What I'm trying to accomplish is to adapt the response that I get from the DB by adding another field to it.
I have a Mongoose method that gives me all the Courses that exist, and I want to add the number of Inscriptions for each one. So I'm trying this:
exports.getAllCourses = async(req, res) => {
  try {
    const rawCourses = await Course.find();
    const courses = await courseAdapter.apply(rawCourses)
    await res.json({courses});
  } catch (error) {
    console.log(error);
    res.status(500).send("An unexpected error occurred :/");
  }
};
My courseAdapter:
exports.apply = (courses) => {
  return courses.map(async course=> (
    {
      ...course._doc,
      number: await courseUtils.getNumberOfInscriptions(course._doc._id)
    }
  ));
}
And my courseUtils:
exports.getNumberOfInscriptions = async courseId => {
  return await CourseInscription.countDocuments({courseId: courseId});
}
I think my problem is with the async/await function, because with this code I get this:
{"courses":[
  {},
  {}
]}
or, after changing some things, I get this:
{"courses":[
  {"courseInfo":{...},
   "number":{}
  },
  {"courseInfo":{...},
   "number":{}
  }
]}
But I never get the number of inscriptions in the response. By the way, I use getNumberOfInscriptions() in another part of my code for a validation, and it works there.
After trying a lot of things, I got to this:
I changed the way I process the data from the DB in the apply function and treat it as an array.
exports.apply = async (courses) => {
  var response = [];
  for (let c of courses) {
    var doc = c._doc;
    var tmp = [{course: doc, inscriptionNumber: await courseUtils.getNumberOfInscriptions(c._doc._id)}];
    response = response.concat(tmp);
  }
  return response;
}
I don't think it's a very good way to accomplish my goal, but it works. If I find something better in terms of performance or cleanliness, I will post it.
Anyway, I still don't know what I was doing wrong in my previous map function when calling my async function. If anybody knows, please let me know.
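A likely explanation for the map version: an async callback always returns a promise, so courses.map(async ...) produces an array of pending promises. Awaiting an array does not wait for the promises inside it, and JSON serializes each promise as {}. The sketch below reproduces that and shows the Promise.all variant; getNumberOfInscriptions and the sample courses here are stand-ins, not the real Mongoose code.

```javascript
// Hypothetical stand-in for CourseInscription.countDocuments(...).
const getNumberOfInscriptions = async courseId => courseId * 10;

// Same shape as the original adapter, but the array of promises
// produced by map is passed through Promise.all.
const apply = courses =>
  Promise.all(
    courses.map(async course => ({
      ...course._doc,
      number: await getNumberOfInscriptions(course._doc._id),
    }))
  );

// Stand-in documents that mimic the _doc property used in the question.
const rawCourses = [
  { _doc: { _id: 1, name: 'A' } },
  { _doc: { _id: 2, name: 'B' } },
];

// Without Promise.all, map returns promises, and JSON.stringify turns each into {}.
const pending = rawCourses.map(async course => course._doc);
console.log(JSON.stringify(pending)); // [{},{}]

// With Promise.all, the resolved objects are available.
const coursesPromise = apply(rawCourses);
coursesPromise.then(courses => console.log(JSON.stringify(courses)));
```

Compared with the for-of version, this starts all the count queries at once instead of one after another, which may or may not be what you want for a large number of courses.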
I am trying to seed the following data to my MongoDB server:
const userRole = {
  role: 'user',
  permissions: ['readPost', 'commentPost', 'votePost']
}
const authorRole = {
  role: 'author',
  permissions: ['readPost', 'createPost', 'editPostSelf', 'commentPost',
    'votePost']
}
const adminRole = {
  role: 'admin',
  permissions: ['readPost', 'createPost', 'editPost', 'commentPost',
    'votePost', 'approvePost', 'approveAccount']
}
const data = [
  {
    model: 'roles',
    documents: [
      userRole, authorRole, adminRole
    ]
  }
]
When I try to iterate through this object / array, and to insert this data into the database, I end up with three copies of 'adminRole', instead of the three individual roles. I feel very foolish for being unable to figure out why this is happening.
My code to actually iterate through the object and seed it is the following, and I know it's actually getting every value, since I've done the console.log testing and can get all the data properly:
for (i in data) {
  m = data[i]
  const Model = mongoose.model(m.model)
  for (j in m.documents) {
    var obj = m.documents[j]
    Model.findOne({'role':obj.role}, (error, result) => {
      if (error) console.error('An error occurred.')
      else if (!result) {
        Model.create(obj, (error) => {
          if (error) console.error('Error seeding. ' + error)
          console.log('Data has been seeded: ' + obj)
        })
      }
    })
  }
}
Update:
Here is the solution I came up with after reading everyone's responses. Two private functions generate Promise objects for both checking if the data exists, and inserting the data, and then all Promises are fulfilled with Promise.all.
// Stores all promises to be resolved
var deletionPromises = []
var insertionPromises = []
// Fetch the model via its name string from mongoose
const Model = mongoose.model(data.model)
// For each object in the 'documents' field of the main object
data.documents.forEach((item) => {
  deletionPromises.push(promiseDeletion(Model, item))
  insertionPromises.push(promiseInsertion(Model, item))
})
console.log('Promises have been pushed.')
// We need to fulfil the deletion promises before the insertion promises.
Promise.all(deletionPromises).then(()=> {
  return Promise.all(insertionPromises).catch(()=>{})
}).catch(()=>{})
I won't include both promiseDeletion and promiseInsertion as they're functionally the same.
const promiseDeletion = function (model, item) {
  console.log('Promise Deletion ' + item.role)
  return new Promise((resolve, reject) => {
    model.findOneAndDelete(item, (error) => {
      if (error) reject()
      else resolve()
    })
  })
}
Update 2: You should ignore my most recent update. I've modified the result I posted a bit, but even then, half of the time the roles are deleted and not inserted. It's very random as to when it will actually insert the roles into the server. I'm very confused and frustrated at this point.
You ran into a very common problem in JavaScript: you shouldn't define asynchronous callbacks that close over a var variable inside a regular for (or for-in) loop. While you loop through the three values, the first async findOne is started. Because it is asynchronous, Node.js does not wait for it to finish before it continues to the next iteration, so the loop runs on to the third value, here the admin role.
Since obj is declared with var, every callback shares the same variable. By the time the first callback runs, the loop has already reached the last value, which is why admin is inserted three times.
To avoid this, you can move the async code out of the loop into a function that receives the item as a parameter, so each call works with its own value (declaring obj with let or const inside the loop has the same effect). Still, callbacks in loops can bring up a lot of other problems, so I'd recommend looking at promises and how to chain them (e.g. put all the Mongoose promises in an array and then await them using Promise.all), or using the more modern async/await syntax together with a for-of loop, which gives you both easy readability and sequential execution of the async commands.
Check this very similar question: Calling an asynchronous function within a for loop in JavaScript
Note: for-of is sometimes said to be too performance-heavy, so check whether this applies to your use case.
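To make the shared-variable effect concrete, here is a small sketch that needs no database. fakeFindOne is a made-up stand-in that, like a real query callback, only runs after the loop has finished; the rest mirrors the var obj pattern from the question.

```javascript
const roles = ['user', 'author', 'admin'];

// Hypothetical stand-in for an async lookup: the callback runs later,
// after the synchronous loop has already completed.
const fakeFindOne = callback => Promise.resolve().then(callback);

const seenWithVar = [];
const seenWithConst = [];

for (var i = 0; i < roles.length; i++) {
  var shared = roles[i]; // one function-scoped variable, reused by every iteration
  fakeFindOne(() => seenWithVar.push(shared));
}

for (let j = 0; j < roles.length; j++) {
  const own = roles[j]; // a fresh block-scoped binding per iteration
  fakeFindOne(() => seenWithConst.push(own));
}

// A timer runs after all pending promise callbacks, so both arrays are filled here.
setTimeout(() => {
  console.log(seenWithVar);   // [ 'admin', 'admin', 'admin' ]
  console.log(seenWithConst); // [ 'user', 'author', 'admin' ]
}, 0);
```

So in the seeding loop, changing var obj to const obj would already stop the three copies of admin, although the callbacks would still run without any guaranteed order.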
Using async callbacks inside loops can cause problems.
You should change the way you call findOne so that the code waits for each result before moving on.
First, make your function async, and then use findOne like so:
async function myFunction(obj) {
  // exec() runs the query and returns a promise that await can handle.
  const res = await Model.findOne({ role: obj.role }).exec();
  // do what you need to do with the result here
}
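Putting the for-of and await suggestions together, a sequential seeding loop could look roughly like the sketch below. FakeModel is an in-memory stand-in invented for this example so that it runs without a database; with Mongoose you would pass the real model, whose findOne and create can be awaited when no callback is given.

```javascript
// Hypothetical in-memory stand-in for a Mongoose model, for illustration only.
const store = [];
const FakeModel = {
  findOne: async query => store.find(doc => doc.role === query.role) || null,
  create: async doc => {
    store.push({ ...doc });
    return doc;
  },
};

const roles = [
  { role: 'user', permissions: ['readPost'] },
  { role: 'author', permissions: ['readPost', 'createPost'] },
  { role: 'admin', permissions: ['readPost', 'approvePost'] },
];

// Seeds one document at a time; each iteration finishes before the next starts.
async function seed(Model, documents) {
  for (const doc of documents) {
    const existing = await Model.findOne({ role: doc.role });
    if (!existing) {
      await Model.create(doc);
    }
  }
}

const seeded = seed(FakeModel, roles).then(() => store.map(doc => doc.role));
seeded.then(names => console.log(names));
```

This may also be related to the random behaviour described in Update 2: a promise's work starts when the promise is created, not when it is handed to Promise.all, so pushing promiseDeletion(...) and promiseInsertion(...) in the same forEach starts both at once. Awaiting the deletion before creating the insertion promise keeps the order fixed.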
I receive an object bigListFromClient that includes an arbitrary number of objects each of which may have an arbitrary number of children. Every object needs to be entered into my database, but the DB needs to assign each of them a unique ID and child objects need to have the unique ID of their parents attached to them before they are sent off to the DB.
I want to create some sort of Promise or other calling structure that would call itself asynchronously until it reached the last object in bigListFromClient but I'm having trouble figuring out how to write it.
for(let i = 0; i < bigListFromClient.length; i++){
  //I'm not just accepting anything from a user here, but how I get my queryString is kind of out of scope for this question
  makeDbCallAsPromise(bigListFromClient[i].queryString, console.log);
  for(let j = 0; j < bigListFromClient[i].children.length; j++){
    //the line below obviously doesn't work, I'm trying to figure out how to do this with something other than a for loop
    makeDbCallAsPromise(bigListFromClient[i].children[j].queryString + [the uniqueID from the DB to insert this correctly as a child], console.log);
  }
}
//this promise works great
makeDbCallAsPromise = function(queryString){
  return new Promise((resolve, reject) => {
    connection = mysql.createConnection(connectionCredentials);
    connection.connect();
    query = queryString;
    connection.query(query, function (err, rows, fields) {
      if (!err) {
        resolve(rows);
      } else {
        console.log('Error while performing Query.');
        console.log(err.code);
        console.log(err.message);
        reject(err);
      }
    });
    connection.end();
  })
};
My attempts at solving this on my own are so embarrassingly bad that even describing them to you would be awful.
While I could defer all the calls to creating children until the parents have been created in the DB, I wonder if the approach I've described is possible.
There are essentially two ways to do this. One is making the database calls sequential and the other one is making the calls parallel.
JavaScript has a built-in function for running promises in parallel called Promise.all: you pass it an array of Promise instances, and it returns a Promise that resolves to an array of their results.
In your case your code would look like this:
const result = Promise.all(
  bigListFromClient.map(item =>
    makeDbCallAsPromise(item.queryString).then(result =>
      Promise.all(
        item.children.map(child =>
          makeDbCallAsPromise(child.queryString + [result.someId])
        )
      )
    )
  )
)
result will now contain a Promise that resolves to an array of arrays. These arrays contain the results of inserting the children.
Using a more modern approach (with async/await), sequential and with all results in a flat array:
const result = await bigListFromClient.reduce(
  async (previous, item) => {
    const previousResults = await previous
    const result = await makeDbCallAsPromise(item.queryString)
    // the accumulator has its own name so that result.someId still refers to the parent
    const childResults = await item.children.reduce(
      async (previousChildren, child) =>
        [...(await previousChildren), await makeDbCallAsPromise(child.queryString + [result.someId])],
      []
    )
    return [...previousResults, result, ...childResults]
  },
  []
)
Depending on what you want to achieve and how you want to structure your code you can pick and choose from the different approaches.
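For comparison, the sequential variant can also be written with plain for-of loops, which some find easier to follow than reduce. This is only a sketch: makeDbCallAsPromise is replaced by a fake that records queries and hands out ids, the insertId property is an assumption about what the real call resolves with, and the way the parent id is appended to the child query is purely illustrative.

```javascript
// Hypothetical stand-in for the real makeDbCallAsPromise: it records the
// query and resolves with a generated id, as an INSERT result might.
let nextId = 1;
const executed = [];
const makeDbCallAsPromise = async queryString => {
  executed.push(queryString);
  return { insertId: nextId++ };
};

// Inserts each parent, then its children, strictly one after another.
async function insertAll(list) {
  const results = [];
  for (const parent of list) {
    const parentResult = await makeDbCallAsPromise(parent.queryString);
    results.push(parentResult);
    for (const child of parent.children || []) {
      // The parent's id is known here because the parent insert has finished.
      const childQuery = `${child.queryString} /* parent ${parentResult.insertId} */`;
      results.push(await makeDbCallAsPromise(childQuery));
    }
  }
  return results;
}

const bigListFromClient = [
  { queryString: 'parent A', children: [{ queryString: 'child A1' }] },
  { queryString: 'parent B', children: [] },
];
const inserted = insertAll(bigListFromClient);
inserted.then(results => console.log(executed, results));
```

The trade-off is the same as with the reduce version: it is simple to reason about, but every query waits for the previous one.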
For this sort of operation, try looking into bulk inserting. If you are intent on performing a single DB query/transaction per iteration, loop recursively over each parent and/or execute the same procedure for each child.
const dbCall = async (elm) => {
  // placeholder for the real DB call: a random string stands in for the assigned ID
  elm.id = Math.random().toString(36).substring(7)
  if (elm.children) {
    await Promise.all(elm.children.map(child => {
      child.parentId = elm.id
      return dbCall(child)
    }))
  }
  return elm
}
const elms = [
  {
    queryString: '',
    children: [
      {
        queryString: ''
      }
    ]
  }
]
Promise.all(elms.map(dbCall)).then(results => { /* ... */ })
This is my second Node project. I am using Node, Express, and MySQL.
I have an array of names of users that have posted something. I loop over those names, and for each of them I do a connection.query to get their posts and store those in an array (after that I do some other data manipulation on it, but that's not the important part).
The problem is: my code tries to do that data manipulation before it even receives the data from the connection.query!
I googled around and it seems async/await is the thing I need. The problem is, I couldn't fit it into my code properly.
// namesOfPeopleImFollowing is the array with the names
namesOfPeopleImFollowing.forEach(function(ele){
  connection.query(`SELECT * FROM user${ele}posts`, function(error,resultOfThis){
    if(error){
      console.log("Error found" + error)
    } else {
      allPostsWithUsername.push([{username:ele},resultOfThis])
    }
  })
})
console.log(JSON.stringify(allPostsWithUsername)) // This is EMPTY, it mustn't be empty.
So, how do I convert that into something that will work properly?
(In case you need the entire code, here it is: https://pastebin.com/dDEJbPfP though I forgot to uncomment the code)
Thank you for your time.
There are many ways to solve this. A simple one would be to wrap your function inside a promise and resolve when the callback is complete.
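const allPromises = [];
namesOfPeopleImFollowing.forEach((ele) => {
  const myPromise = new Promise((resolve, reject) => {
    connection.query(`SELECT * FROM user${ele}posts`, (error, resultOfThis) => {
      if (error) {
        console.log(`Error found: ${error}`);
        reject(error);
      } else {
        // resolve with the username and its posts so the query result is not lost
        resolve([{ username: ele }, resultOfThis]);
      }
    });
  });
  allPromises.push(myPromise);
});
Promise.all(allPromises).then((allPostsWithUsername) => {
  // all queries have finished; your data manipulation goes here
  console.log(JSON.stringify(allPostsWithUsername));
}).catch((error) => {
  // at least one query failed
});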
You can read more about Promise.all here.
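Since the question mentions async/await, here is a rough sketch of the same idea using Node's util.promisify instead of a hand-written Promise wrapper. The connection object below is a fake that only models query(sql, callback) so the example runs on its own; with the real mysql connection the promisify line should work the same way, but treat that as an assumption to verify.

```javascript
const util = require('util');

// Hypothetical stand-in for a mysql connection; only query(sql, callback) is modelled.
const connection = {
  query(sql, callback) {
    setImmediate(() => callback(null, [{ sql }]));
  },
};

// promisify turns the (error, result) callback style into a promise.
const query = util.promisify(connection.query).bind(connection);

async function getAllPosts(names) {
  // Start all queries, then wait until every one has finished.
  return Promise.all(
    names.map(async name => [
      { username: name },
      await query(`SELECT * FROM user${name}posts`),
    ])
  );
}

const allPosts = getAllPosts(['alice', 'bob']);
allPosts.then(posts => console.log(JSON.stringify(posts)));
```

Any data manipulation that depends on the posts then goes after the await (or inside the then), where the array is guaranteed to be filled.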
I am working with Node. I have an array of IDs, and I want to filter them based on the response of a call to another API. So I want to look up each ID and find out whether it passes the filter, which depends on the API's response.
I am using async/await. I found that the best approach is to use Promise.all, but it is not working as expected. What am I doing wrong?
static async processCSGOUsers (groupId, parsedData) {
  let steamIdsArr = [];
  const usersSteamIds = parsedData.memberList.members.steamID64;
  const filteredUsers = await Promise.all(usersSteamIds.map(async (userId) => {
    return csGoBackpack(userId).then( (response) => {
      return response.value > 40;
    })
    .catch((err) => {
      return err;
    });
  }));
  Object.keys(usersSteamIds).forEach(key => {
    steamIdsArr.push({
      steam_group_id_64: groupId,
      steam_id_64: usersSteamIds[key]
    });
  });
  return UsersDao.saveUsers(steamIdsArr);
}
Apart from that, something weird is happening. When I debug this, the parameters of this method come in fine. When I reach the Promise.all line, I get a "reference error" on each parameter. I do not know why.
Wait for all responses, then filter based on the results:
const responses = await Promise.all(usersSteamIds.map(csGoBackpack));
// responses now contains the array of responses for each user ID
// filter the user IDs based on the corresponding result
const filteredUsers = usersSteamIds.filter((_, index) => responses[index].value > 40);
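As a fuller sketch of the same approach, the version below also decides what to do when a lookup fails, which the original catch handled by returning the error object (a truthy value). Here csGoBackpack is replaced by a fake with made-up values, and dropping failed IDs is just one possible policy.

```javascript
// Hypothetical stand-in for csGoBackpack: resolves with a value per user ID.
const values = { a: 55, b: 12, c: 80 };
const csGoBackpack = async userId => {
  if (!(userId in values)) throw new Error(`unknown user ${userId}`);
  return { value: values[userId] };
};

// Keeps the IDs whose backpack value exceeds the threshold.
// IDs whose lookup fails are dropped instead of failing the whole batch.
async function filterUsers(userIds, threshold = 40) {
  const checks = await Promise.all(
    userIds.map(id =>
      csGoBackpack(id)
        .then(response => response.value > threshold)
        .catch(() => false)
    )
  );
  return userIds.filter((_, index) => checks[index]);
}

const filtered = filterUsers(['a', 'b', 'c', 'missing']);
filtered.then(ids => console.log(ids));
```

In processCSGOUsers, steamIdsArr would then be built from the filtered list rather than from usersSteamIds, otherwise the filter result is never used.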
If you don't mind using a module, you can do this kind of stuff in a straightforward way using these utilities