Firestore - get large collection and parse. Request is aborted - javascript

I've had a look around but can't find an obvious solution.
I have a collection with 130k documents. I need to export these as a CSV file. (The CSV part I have sorted, I think.)
My code works fine with smaller collections, but when trying it on the collection with 130k documents it hangs and I get "Request Aborted". What would be the best way to handle this?
My code:
db.collection("games")
.doc(req.params.docid)
.collection("players")
.onSnapshot(snapshot => {
console.log("On Snapshot")
snapshot.docs.forEach(data => {
const doc = data.data();
downloadArray.push(doc);
});
jsonexport(downloadArray, function(err, csv) {
if (err) return console.log(err);
fs.writeFile("out.csv", csv, function() {
res.sendFile(path.join(__dirname, "../out.csv"), err => {
console.log(err);
});
});
});
});
I'm trying out pagination as suggested, but I'm having trouble understanding how to keep requesting the next batch until the collection is exhausted, since sometimes I won't know the collection size, and querying the size of such a large collection takes 1-2 minutes.
let first = db
  .collection("games")
  .doc(req.params.docid)
  .collection("players")
  .orderBy("uid")
  .limit(500);
let paginate = first.get().then(snapshot => {
  // ...
  snapshot.docs.map(doc => {
    console.log(doc.data());
  });
  // Get the last document
  let last = snapshot.docs[snapshot.docs.length - 1];
  // Construct a new query starting at this document.
  // Note: pass the snapshot itself to startAfter, not last.data().
  let next = db
    .collection("games")
    .doc(req.params.docid)
    .collection("players")
    .orderBy("uid")
    .startAfter(last)
    .limit(500);
});

You could paginate your query with cursors to reduce the size of the result set to something more manageable, and keep paging forward until the collection is fully iterated.
Also, you will want to use get() instead of onSnapshot(), as an export process is probably not interested in receiving updates for any document in the set that might be added, changed, or deleted.
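As a rough sketch of that approach, assuming the same games/players structure, jsonexport, and the Express req/res from your code (the 500-per-page size is just illustrative):
// Hedged sketch: read the collection in pages of 500 using get() + startAfter,
// then export once everything has been collected.
function exportPlayers(db, docid, res) {
  const downloadArray = [];
  const baseQuery = db
    .collection("games")
    .doc(docid)
    .collection("players")
    .orderBy("uid")
    .limit(500);

  function fetchPage(query) {
    return query.get().then(snapshot => {
      snapshot.docs.forEach(doc => downloadArray.push(doc.data()));
      // Fewer than 500 docs means we just read the last page.
      if (snapshot.docs.length < 500) return;
      const last = snapshot.docs[snapshot.docs.length - 1];
      // Pass the snapshot itself to startAfter, not last.data().
      return fetchPage(baseQuery.startAfter(last));
    });
  }

  return fetchPage(baseQuery).then(() => {
    jsonexport(downloadArray, function(err, csv) {
      if (err) return console.log(err);
      fs.writeFile(path.join(__dirname, "../out.csv"), csv, function() {
        res.sendFile(path.join(__dirname, "../out.csv"), err => {
          if (err) console.log(err);
        });
      });
    });
  });
}
Because each recursive call happens inside a then callback, the pages are fetched one after another and the stack never grows, however many pages there are.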

Related

How to use Promise.all with multiple Firestore queries

I know there are similar questions to this on Stack Overflow, but thus far none have been able to help me get my code working.
I have a function that takes an id, and makes a call to firebase firestore to get all the documents in a "feedItems" collection. Each document contains two fields, a timestamp and a post ID. The function returns an array with each post object. This part of the code (getFeedItems below) works as expected.
The problem occurs in the next step. Once I have the array of post IDs, I then loop over the array and make a Firestore query for each one, to get the actual post information. I know these queries are asynchronous, so I use Promise.all to wait for each promise to resolve before using the final array of post information.
However, I continue to receive "undefined" as a result of these looped queries. Why?
const useUpdateFeed = (uid) => {
  const [feed, setFeed] = useState([]);

  useEffect(() => {
    // getFeedItems returns an array of postIDs, and works as expected
    async function getFeedItems(uid) {
      const docRef = firestore
        .collection("feeds")
        .doc(uid)
        .collection("feedItems");
      const doc = await docRef.get();
      const feedItems = [];
      doc.forEach((item) => {
        feedItems.push({
          ...item.data(),
          id: item.id,
        });
      });
      return feedItems;
    }

    // getPosts is meant to take the array of post IDs, and return an array of the post objects
    async function getPosts(items) {
      console.log(items);
      const promises = [];
      items.forEach((item) => {
        const promise = firestore.collection("posts").doc(item.id).get();
        promises.push(promise);
      });
      const posts = [];
      await Promise.all(promises).then((results) => {
        results.forEach((result) => {
          const post = result.data();
          console.log(post); // this continues to log as "undefined". Why?
          posts.push(post);
        });
      });
      return posts;
    }

    (async () => {
      if (uid) {
        const feedItems = await getFeedItems(uid);
        const posts = await getPosts(feedItems);
        setFeed(posts);
      }
    })();
  }, []);

  return feed; // The final result is an array with a single "undefined" element
};
There are a few things I have already verified on my own:
My firestore queries work as expected when done one at a time (so there are not any bugs with the query structures themselves).
This is a custom hook for React. I don't think my use of useState/useEffect is having any issue here, and I have tested the implementation of this hook with mock data.
EDIT: A console.log() of items was requested and has been added to the code snippet. I can confirm that the Firestore documents that I am trying to access do exist and have been successfully retrieved when called in individual queries (not in a loop).
Also, for simplicity, the collection on Firestore currently only includes one post (with an ID of "ANkRFz2L7WQzA3ehcpDz"), which can be seen in the console log output below.
EDIT TWO: To make the output clearer I have pasted it as an image below.
Turns out, this was human error. Looking at the console log output I realised there is a space in front of the document ID. Removing that on the backend made my code work.
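If you want to guard against that class of error in the future, a small defensive sketch (the .trim() is purely illustrative, not part of the original code):
// Strip stray whitespace from stored IDs before building the reference,
// so a leading space doesn't silently produce a missing document.
const promise = firestore.collection("posts").doc(item.id.trim()).get();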

Firebase Firestore: How to update or access and update a field value, in a map, in an array, in a document, that is in a collection

Sorry for the long title. More precisely: I would like to update the stock value after a payment is made. However, I get stuck after querying the entire document (e.g. the selected one with title sneakers). Is there a way to query and update, for example, the Timberlands stock value to its value - 1? Or do you have to get all the data from the entire document, modify the desired part in JavaScript, and then update the entire document?
Here is a little snippet of a solution I came up with so far. However, this approach hurts my soul as it seems very inefficient.
const updateFirebaseStock = (orders) => {
  orders.forEach(async (order) => {
    try {
      const collRef = db.doc(`collections/${order.collectionid}`);
      const doc = await collRef.get();
      const data = doc.data();
      // Build the desired new array of objects: if it's the matching name,
      // update the value, else just return the object unchanged.
      // (The `name` and `stock` field names here are assumptions.)
      const newItems = data.items.map((item) =>
        item.name === order.name ? { ...item, stock: item.stock - 1 } : item
      );
      // Then update the entire document with the new array.
      await collRef.update({ items: newItems });
    } catch (error) {
      console.error(error);
    }
  });
};
You don't need to get the document at all for that; all you have to do is use FieldValue.increment(). Using your code as a starting point, it could look like this:
collRef = db.doc(`collections/${order.collectionid}`);
collRef.update({
  Price: firebase.firestore.FieldValue.increment(-1)
});
You can increment/decrement with any numeric value using that function.
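One caveat worth hedging: increment() targets a field path, and Firestore field paths can't address individual array elements. If the items were restructured as a map keyed by item name (an assumption, not how the question's data is currently shaped), the same trick would reach the nested stock value:
// Assumes `items` is a map like { Timberlands: { stock: 10 } }, not an array.
const collRef = db.doc(`collections/${order.collectionid}`);
collRef.update({
  "items.Timberlands.stock": firebase.firestore.FieldValue.increment(-1)
});
With the array-of-maps layout as it stands, the read-modify-write from the question is the usual fallback.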

Using React JS (web) and Firestore, how can you find out when a chatRoom (on the Firestore database) receives new messages?

I am trying to build an app using Firestore and React JS (web).
My Firestore database basically has:
A collection of chat rooms, ChatRooms
Every chat-room has many messages in a messages subcollection, for example:
this.db.collection("ChatRooms").doc(phone-number-here).collection("messages")
Also, every chat-room has some client info like first-name, last-name etc., and one field that's very important:
lastVisited, which is a timestamp (or firestamp, whatever).
I figured I would put in a React Hook that updates the lastVisited field every second, i.e. tries to record on Firestore, as accurately as possible, the last time I left a chat-room.
Based on that, I want to retrieve all the messages for every customer (chat-room) that came in after the last visit (the lastVisited field), and show a notification. :)
I have tried an .onSnapshot listener on the messages subcollection and a combination of Firestore transactions, but I haven't been lucky. My app is buggy: it keeps showing two, then one, then nothing, then back to two, etc., and I am suffering much.
Here's my code!
Please I appreciate ANY help!!!
unread_messages = currentUser => {
  const chatRoomsQuery = this.db.collection("ChatRooms");
  // const messagesQuery = this.db.collection("ChatRooms");
  return chatRoomsQuery.get().then(snapshot => {
    return snapshot.forEach(chatRoom => {
      const mess = chatRoomsQuery
        .doc(chatRoom.id)
        .collection("messages")
        .where("from", "==", chatRoom.id)
        .orderBy("firestamp", "desc")
        .limit(5);
      // the limit of the messages could change to 10 on production
      return mess.onSnapshot(snapshot => {
        console.log("snapshot SIZE: ", snapshot.size);
        return snapshot.forEach(message => {
          // console.log(message.data());
          const chatRef = this.db
            .collection("ChatRooms")
            .doc(message.data().from);
          // run transaction
          return this.db
            .runTransaction(transaction => {
              return transaction.get(chatRef).then(doc => {
                // console.log("currentUser: ", currentUser);
                // console.log("doc: ", doc.data());
                if (!doc.exists) return;
                if (
                  currentUser !== null &&
                  message.data().from === currentUser.phone
                ) {
                  // then update it
                  transaction.update(chatRef, {
                    unread_messages: []
                  });
                }
                // else
                else if (
                  new Date(message.data().timestamp).getTime() >
                  new Date(doc.data().lastVisited).getTime()
                ) {
                  console.log("THIS IS/ARE THE ONES:", message.data());
                  // newMessages.push(message.data().customer_response);
                  // then update it
                  transaction.update(chatRef, {
                    unread_messages: Array.from(
                      new Set([
                        ...doc.data().unread_messages,
                        message.data().customer_response
                      ])
                    )
                  });
                }
              });
            })
            .then(function() {
              console.log("Transaction successfully committed!");
            })
            .catch(function(error) {
              console.log("Transaction failed: ", error);
            });
        });
      });
    });
  });
};
Searching around, it seems the best option for achieving that comparison would be to convert your timestamps to milliseconds using the toMillis() method. This way, you should be able to compare the timestamps of the last message and the last access better and more easily - more information on the method can be found in the official documentation here.
I believe this would be your best option, as it's mentioned in this Community post here that this is the only way to compare timestamps on Firestore - there is a method called isEqual(), but it doesn't make sense for your use case.
I would recommend you give this a try for comparing the timestamps in your application. Besides that, there is another question from the Community - accessible here: How to compare firebase timestamps? - where the user has a similar use case and purpose to yours, which I believe might give you some ideas and thoughts as well.
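As a minimal sketch of that comparison (the firestamp and lastVisited field names are taken from your code; everything else is illustrative):
// Convert both Firestore Timestamps to milliseconds before comparing.
const messageMillis = message.data().firestamp.toMillis();
const lastVisitedMillis = doc.data().lastVisited.toMillis();
if (messageMillis > lastVisitedMillis) {
  // the message arrived after the last visit, so treat it as unread
}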
Let me know if the information helped you!

Iterating cursor on big collection in mongo and node doesn't return all results?

I have a collection that has 500k documents (the collection takes about 130mb).
I'm using the standard mongodb driver:
var mongodb = require('mongodb');
I'm trying to iterate through this collection in node.js using a cursor (because .toArray takes too long to load the entire dataset into memory).
var cursor = db.collection('test').find({});
cursor.each(function(err, doc) {
// only does this 1000 times
});
I found that it only did it 1000 times, so I looked at the documentation https://mongodb.github.io/node-mongodb-native/api-generated/cursor.html and, under the "each" section, it said to increase the batch size.
So I set an extremely large batch size; I didn't find a way to make it unlimited. If you know a way, let me know.
var cursor = db.collection('test').find({}).batchSize(1000000000000);
cursor.each(function(err, doc) {
// only does this 30382 times
});
And increasing the batch size any further doesn't make it iterate over more than 30382 elements.
How can I make cursor.each() iterate 500,000 times?
You can track the index and, on error, continue from where you left off:
const iterateCollection = (skip) => {
  const cursor = db.collection('test').find({}).skip(skip);
  cursor.each(function(err, doc) {
    if (err) {
      // if the cursor dies (e.g. due to overflow), restart from the last
      // successfully processed document
      return iterateCollection(skip);
    }
    if (doc === null) return; // cursor exhausted, we're done
    skip++;
    // process doc here
  });
};
iterateCollection(0);
I managed to solve this by using "forEach" instead of "each"... I have no idea what the difference is; all I know is it works so far. So:
var cursor = db.collection('test').find();
cursor.forEach(function(doc) {
  // do stuff, does it 500,000 times for my collection...
}, function(err) {
  // finished
  db.close();
});
The only problem now is that forEach is slow as molasses in January, so I would be interested in hearing other solutions.
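For what it's worth, in newer releases of the Node driver (3.x and later) a cursor is an async iterable, so a for await...of loop is a simpler alternative that may also perform better; a sketch assuming such a driver version:
// Assumes node-mongodb-native 3.x+, where cursors implement the async iterator protocol.
async function iterateAll(db) {
  const cursor = db.collection('test').find({});
  let count = 0;
  for await (const doc of cursor) {
    // do stuff with doc
    count++;
  }
  console.log('iterated', count, 'documents');
}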

Node js - Request Error transaction was deadlocked

I'm having problems when I insert several records using promises; sometimes it works, but other times it gives me this error:
And my code is this:
return Promise.all([
  Promise.all(createBistamp),
  Promise.all(createSlstamp),
  listOfResults,
  i
]).then(function(listOfResults2) {
  for (var j = 0; j < resultArticle.length; j++) {
    if (arm === 'Arm-1') {
      // (insert logic removed for brevity)
    }
    if (arm === 'Arm-1-11') {
      // (insert logic removed for brevity)
    }
  }
  if (arm === 'Arm-1') {
    console.log("PROMISE ARM-1");
    return Promise.all([insertBi, insertBi2, insertSl]).then(function(insertEnd) {
      res.send("true");
    }).catch(function(err) {
      console.log(err);
    });
  }
  if (arm === 'Arm-1-11') {
    console.log("PROMISE ARM-1-11");
    return Promise.all([insertBi, insertBi2, insertSl, insertSlSaida]).then(function(insertEnd) {
      res.send("true");
    }).catch(function(err) {
      console.log(err);
    });
  }
}).catch(function(err) {
  console.log(err);
});
I removed the code lines inside the ifs and the for loop above, but they were the inserts into the database.
Example of an insert:
var insertBi2 = request.query("INSERT INTO bi2 (bi2stamp,alvstamp1,identificacao1,szzstamp1,zona1,bostamp,ousrinis,ousrdata,ousrhora,usrinis,usrdata,usrhora)" +
  " VALUES ('"+bistamp+"','AB16083056009,454383576','2','Adm13010764745,450449475','1','"+bostamp+"','WWW','"+data+"','"+time+"','WWW','"+data+"','"+time+"')");
Full Code:
http://pastebin.com/DTjtXvDt
This is my structure, and I don't know if I'm working well with promises.
Thank you
I have also faced this problem recently.
error: RequestError: Transaction (Process ID 72) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Solution -
There was not a single index on the table. So, I created a non-clustered unique index on the unique identifier column.
I was surprised when this solution worked.
There was a single update operation in the code and no select operation, so it made me curious to do some research. I came across the lock granularity mechanism for locking resources. In my case, locking has to be at row level instead of page level.
Note:
For clustered tables, the data pages are stored at the leaf level of the (clustered) index structure and are therefore locked with index key locks instead of row locks.
Further Reading
https://www.sqlshack.com/locking-sql-server/
If you are inserting or updating data in a loop, it's better to build all the queries in the loop, store them, and then execute them all at once in a single transaction. You will save yourself a lot of issues.
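A minimal sketch of that idea with the mssql package (the pool, table, and column names are placeholders; adjust to your schema and driver version):
const sql = require("mssql");

// Build parameterized statements in the loop and run them inside one
// transaction, so the whole batch either commits or rolls back together.
async function insertOrders(pool, orders) {
  const transaction = new sql.Transaction(pool);
  await transaction.begin();
  try {
    for (const order of orders) {
      await new sql.Request(transaction)
        .input("bistamp", sql.VarChar, order.bistamp)
        .input("bostamp", sql.VarChar, order.bostamp)
        .query("INSERT INTO bi2 (bi2stamp, bostamp) VALUES (@bistamp, @bostamp)");
    }
    await transaction.commit();
  } catch (err) {
    await transaction.rollback();
    throw err;
  }
}
Parameterized inputs also avoid the string-concatenation style of the original insert, which is vulnerable to SQL injection.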
