Firestore get docs based off value existing in array - javascript

I am facing a little bit of a mental block in terms of how to do some relational queries with firestore while adhering to the best practices. I am creating a feed feature where you can see a feed of posts from your friends. Essentially my data structure is as follows:
Friends (collection)
-friend_doc
...data
friends_uid: [uid1, uid2]
Posts (collection)
-post_doc
...data
posted_by: uid2
Basically I am making a query to get all of the friends where the friends_uid contains my uid (uid1 in this case). And then once I mapped all of the friends uid's to an array, I want to make a firestore query to get posts where the posted_by field is equal to any of the uid's in that array of friends uid's. I haven't been able to make something that does anything like that yet.
I know that it seems most convenient to loop through the string array of friends uid's and make a query for each one like:
listOfUids.forEach(async (item) => {
const postQuerySnapshot = await firestore()
.collection('posts')
.where('uid', '==', item)
.get();
results.push(postQuerySnapshot.docs);
});
but this is extremely problematic for paging and limiting data as I could possibly receive tons of posts. I may just be too deep into this code and missing an obvious solution or maybe my data structure is somewhat flawed. Any insight would be greatly appreciated.
TLDR - how can I make a firestore query that gets all docs that have a value that exists in an array of strings?

You can use an "in" query for this:
firestore()
.collection('posts')
.where('uid', 'in', [uid1, uid2, ...])
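But at the time of this answer you were limited to 10 elements in that array (the Firestore documentation has since raised the limit to 30). With a longer friends list you will probably have to stick with what you have now, and because the results of several queries are merged on the client, you will not be able to use Firestore's pagination API across the combined result.
Your only real alternative in this case is to create a new collection that contains all of the data you want to query in one place, as there are no real join operations. Duplicating data like this is common for NoSQL-type databases.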
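A minimal sketch of the batching workaround, assuming a hypothetical `runInQuery(uids)` callback that wraps the `in` query above and resolves to an array of plain post objects, each with a numeric `createdAt` field (both the callback and the field are assumptions, not part of the question). The batch size is a parameter because the `in` limit depends on the Firestore version.

```javascript
// Split a list into consecutive batches of at most `size` elements.
function chunk(list, size) {
  const batches = [];
  for (let i = 0; i < list.length; i += size) {
    batches.push(list.slice(i, i + size));
  }
  return batches;
}

// Hypothetical helper: `runInQuery(uids)` is assumed to run
// .where('uid', 'in', uids).get() and resolve to an array of post objects.
async function fetchFeed(friendUids, runInQuery, batchSize = 10) {
  const batches = chunk(friendUids, batchSize);
  // One query per batch, all started in parallel.
  const pages = await Promise.all(batches.map((batch) => runInQuery(batch)));
  // Merge every batch and sort newest first (assumes a numeric `createdAt`).
  return pages.flat().sort((a, b) => b.createdAt - a.createdAt);
}
```

This reduces the number of queries from one per friend to one per batch, but the merged list is still sorted and trimmed on the client, so it does not restore server-side pagination.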

Related

How to implement pagination in a merged set of queries when implementing a logical OR

In the Firestore documentation, it states clearly the limitations of support for query filters with logical OR.
For example:
const userPostsQuery = query(postsRef, where("author", "==", uid));
const publicPostsQuery = query(postsRef, where("public", "==", true));
If, as in the above example, we need a list of both user posts and public posts, sorted together by date (i.e. the two queries OR-ed together), Firestore offers no such feature, so we have to run both queries separately and then merge and sort the results on the client side.
I'm fine with such a sad workaround, but what if the total number of posts is huge? We would then need a pagination system where each page shows at most 50 posts. How can this be done with such a workaround?
Firestore has very limited operators and aggregation options. However, it has limited OR support with an Array type.
A solution that could simplify your use case is to introduce a new field of type array in your post document. Let's say this field is named a. When you create your document, a is equal to [authorId, 'public'] if the post is public, [authorId] otherwise.
Then, you can express the query you need with the array-contains-any operator:
const q = query(postsRef, where('a', 'array-contains-any', [authorId, 'public']));
You can easily add pagination with limit, orderBy, startAt, and startAfter functions.
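To make the array field concrete, here is a small sketch of how `a` could be built at write time, plus an in-memory stand-in for the matching rule of array-contains-any. The function names `buildAudience` and `arrayContainsAny` are hypothetical and only illustrate the idea; the real matching is done by Firestore on the server.

```javascript
// Build the array field `a` described above: the author ID is always present,
// and the 'public' marker is added only for public posts.
function buildAudience(authorId, isPublic) {
  return isPublic ? [authorId, 'public'] : [authorId];
}

// In-memory stand-in for array-contains-any: true when the document's array
// shares at least one element with the requested values.
function arrayContainsAny(docArray, values) {
  return values.some((value) => docArray.includes(value));
}

// Which posts would a query for [currentUser, 'public'] return?
const posts = [
  { id: 'p1', a: buildAudience('alice', false) }, // alice's private post
  { id: 'p2', a: buildAudience('bob', true) },    // bob's public post
  { id: 'p3', a: buildAudience('bob', false) },   // bob's private post
];
const visibleToAlice = posts
  .filter((post) => arrayContainsAny(post.a, ['alice', 'public']))
  .map((post) => post.id);
```

Because a single query now covers both cases, the usual cursor functions apply to it directly instead of to two result sets merged on the client.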

How to limit collectionGroup query results to an ancestor?

I've been reading the blog post https://firebase.googleblog.com/2019/06/understanding-collection-group-queries.html to better understand the collectionGroup queries.
Although, I still have one question: how can I limit the results to a specific ancestor. Let me explain myself.
Imagine I have companies that manufacture cars that have tyres. We have different brands of tyres, used in different cars. In the end, we have a many-to-many relationship. I know I should not use this term in the NoSQL world, but I call a dog a dog :-)
Anyway, my question is the following: if company A has a shortage of a specific tyre brand (let's say Michelin), it needs to flag this tyre as out of stock. I would think to run a collectionGroup query such as:
db.collectionGroup("tyre")
  .where("brand", "==", "Michelin")
  .get()
  .then(function (querySnapshot) {
    // update flag accordingly
  })
But that would update the stock of other companies.
My question is: how would you narrow the collectionGroup query results so you only update the tyres info from company A?
I could include the company A docRef in the tyres collection and use where() to narrow the results. It seems like a valid approach. Although, it would be a mix between a top-level collection and a subcollection. Is it best practice?
UPDATE
Actually, I'm following the example of the restaurants to put my hands on firebase/firestore. A restaurant can have multiple menus. A menu can have multiple items. Items can be reused and therefore present in multiple menus.
collection('restaurants').doc(..).collection('menus').doc(..).collection('items')
I like to think that's the best way to structure the data (vs. a top-level collection for the items). But items like Coffee can easily be found in multiple menus of multiple restaurants. If one restaurant is short on coffee, how can I update the coffee items for that specific restaurant using something like:
db.collectionGroup("items")
  .where("name", "==", "Coffee")
  .get()
  .then(function (querySnapshot) {
    // set available = false
  })
If one restaurant is short on coffee, how can I update the coffee items for that specific restaurant?
With a collectionGroup query you could do it like this:
db.collectionGroup('items')
  .where('name', '==', 'Coffee')
  .get()
  .then(function (querySnapshot) {
    querySnapshot.forEach(function (doc) {
      const itemQuantity = doc.data().itemQuantity;
      if (itemQuantity === 0) {
        // items collection -> menu doc -> menus collection -> restaurant doc
        const restaurantRef = doc.ref.parent.parent.parent.parent;
        return restaurantRef.update({....})
      }
    });
  });
This works by alternately using the parent properties of DocumentReference and CollectionReference.
However, this may not be the most efficient and affordable way if you have a lot of restaurants, because your collectionGroup query will return a lot of records.
A more efficient way would be to keep a set of counters and watch them, through either Firestore listeners or Cloud Functions.
Finally, note an important point: you write "A menu can have multiple items. Items can be reused and therefore present in multiple menus". Note that items documents in
collection('restaurants').doc('r1').collection('menus').doc('m1').collection('items')
and in
collection('restaurants').doc('r1').collection('menus').doc('m2').collection('items')
are totally different documents. This is different from the SQL world where different records from one table can point to the same record of another table.
Conclusion: You should most probably have one itemsStock collection per restaurant, and each time one of the items is "consumed/ordered" you decrease its count by using FieldValue.increment(-1).
In other words, I advise separating the collections of items that compose a menu from the one that holds the item counters (i.e. the itemsStock collection). The former are dedicated to menu item selection and the latter to managing the restaurant's stock. When a customer orders an item, you only decrement the counter in the collection holding the item counters.
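A minimal sketch of that decrement, assuming the itemsStock collection is a subcollection of each restaurant document and that each stock document has a numeric `count` field (the path layout and the field name are assumptions). The Firestore instance `db` and the `FieldValue.increment` function are passed in as parameters so the sketch does not depend on how the SDK is initialised.

```javascript
// Decrease the stock counter of one item in one restaurant by 1.
// `db` is assumed to be a Firestore instance (namespaced API) and
// `increment` the SDK's FieldValue.increment function.
function decrementStock(db, increment, restaurantId, itemId) {
  return db
    .collection('restaurants').doc(restaurantId)
    .collection('itemsStock').doc(itemId)      // assumed location of counters
    .update({ count: increment(-1) });         // `count` is an assumed field
}
```

A call could look like `decrementStock(db, firebase.firestore.FieldValue.increment, 'r1', 'coffee')`. Only the single stock document is written; the item documents inside the menus are left untouched.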
Update following your comment:
If you want to update all the "lasagna" items in all the menus of a restaurant (for example to add an ingredient, as you mentioned in your comment), a very common approach is indeed to modify all the corresponding docs (this is called data duplication in the NoSQL world).
You would use the exact code at the top of my answer: you query all the "lasagna" items documents in all the menus of the restaurant and update them. You could trigger this process by a Cloud Function that would "watch" a master collection in which you have reference items: each time you change a doc of this collection (i.e. an item) you update all the similar/corresponding items doc in the menus subcollections.
I could include the company A docRef in the tyres collection and use where() to narrow the results. It seems like a valid approach. Although, it would be a mix between a top-level collection and a subcollection. Is it best practice?
This is a common approach, since the only way to filter documents in a collection group query is using the fields of the documents. You can't use anything in the path of the document as a filter. It's common to duplicate data in NoSQL type databases in order to facilitate queries.
However, you probably don't want to have a top-level collection with the same name as child collections, if you want to limit the queries to just the child collections.
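Since the path cannot be used as a filter, the ancestor's ID has to be copied into each document. As a sketch, the hypothetical helper below derives that ID from a document path string such as the one exposed by `doc.ref.path`, so it can be stored in a field (here assumed to be called `restaurantId`) and later used in a `where()` clause on the collection group query.

```javascript
// Return the document ID that follows `collectionName` in a document path,
// e.g. the restaurant ID in 'restaurants/r1/menus/m1/items/i1'.
function ancestorId(path, collectionName) {
  const segments = path.split('/');
  // Collection names sit at even positions, document IDs at odd positions.
  for (let i = 0; i < segments.length - 1; i += 2) {
    if (segments[i] === collectionName) {
      return segments[i + 1];
    }
  }
  return null; // the path has no such ancestor collection
}

// Example: the value to store in the item's assumed `restaurantId` field.
const restaurantId = ancestorId('restaurants/r1/menus/m1/items/i1', 'restaurants');
```

With that field in place, the collection group query can add a condition such as `.where('restaurantId', '==', 'r1')` next to the name filter, which typically requires a composite index.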

Querying for object key in Firestore

I currently have a few issues with my Firestore querying technique, following on from a Stack Overflow post I made recently: Querying with two array with firestore security rules
The answer there proposed adding the "ids" to an object, with the id as the key and the value simply being "true". I have done this, and now my structure looks like so:
This leaves me with this query:
db.collection('Depots')
.where(`products.${productId}`, '==', true)
.where(`users.${userId}`, '==', true)
.where('created', '>', 1585998560500)
.orderBy('created', 'asc')
.get();
This query throws an error asking me to create an index:
The query requires an index. You can create it here: ...
However, this tries to index the specific object key, e.g. QXooVYGBIFWKo6C, so products.QXooVYGBIFWKo6C. That is certainly not what I want: the key changes from query to query and can take an unbounded number of values, which means I would have to create another index for every key in order to query it.
Is there any way to solve this issue? I am assuming it needs to index this query due to the different operators used in the query, so I was wondering if there were any workarounds to this issue.
Thank you very much in advance.
What you have here is a map field, for which indexes should usually be created automatically.
That indeed means that you'll have as many indexes as you have products, which means:
- You are limited in how many products you can have, as there is a maximum of 40,000 index entries per document.
- You pay more per document, as you pay for the storage of each index.
If these are not what you want, you'll have to switch back to your original model, with the query limitations you had there. There doesn't seem to be a solution that fits both of your requirements.
After our discussion in chat, this is the starting point I would suggest. Who knows what the end architecture will look like, but I think it will be this or something very close to it. You say that a user can exist in multiple depots at the same time and that multiple depots can contain the same products, also at the same time. You also said that a depot can never have more than 40 users at a given time, so an array of 40 users would certainly not encroach on Firestore's document limit of 1,048,576 bytes.
[collection]
<documentId>
- field: value
[depots]
<UUID>
- depotId: string "depot456"
- productCount: num 5,000
<UUID>
- depotId: string "depot789"
- productCount: num 4,500
[products]
<UUID>
- productId: string "lotion123"
- depotId: string "depot456"
- users: [string] ["user10", "user27", "user33"]
<UUID>
- productId: string "lotion123"
- depotId: string "depot789"
- users: [string] ["user10", "user17", "user50"]
[users]
<userId>
- depots: [string] ["depot456", "depot999"]
<userId>
- depots: [string] ["depot333", "depot999"]
In NoSQL, storage is cheap and computation isn't, so denormalize your data as much as you need to make your queries possible and efficient (fast and cheap).
To find, in a single query, all depots where user10 and lotion123 are both present, query the products collection where productId equals x and users array-contains y, and collect the depotId values from those results. If you want to preserve the array-contains operation for something else, you'd have to denormalize your data further (replace the array with a single user per document). Or you could split this query into two separate queries.
With this model, when a user leaves a depot, get all products where users array-contains that user and remove that userId from the array. And when a user joins a depot, get all products where depotId equals x and append that userId to the array.
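The lookup described above can be illustrated in memory with the two sample product documents from the model. This is only a sketch of the filter's semantics (equality on productId plus array-contains on users); in Firestore the same two conditions would be expressed as `where` clauses on the products collection, and the helper name `depotsFor` is hypothetical.

```javascript
// The two sample documents from the [products] collection above.
const productDocs = [
  { productId: 'lotion123', depotId: 'depot456', users: ['user10', 'user27', 'user33'] },
  { productId: 'lotion123', depotId: 'depot789', users: ['user10', 'user17', 'user50'] },
];

// In-memory equivalent of: productId == x AND users array-contains y,
// followed by collecting the depotId of every match.
function depotsFor(docs, productId, userId) {
  return docs
    .filter((doc) => doc.productId === productId && doc.users.includes(userId))
    .map((doc) => doc.depotId);
}

const sharedDepots = depotsFor(productDocs, 'lotion123', 'user10');
const user17Depots = depotsFor(productDocs, 'lotion123', 'user17');
```

Here user10 appears in both documents, so both depots are returned, while user17 only matches depot789.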
Watch this video, and others by Rick, to get a solid handle on NoSQL: https://www.youtube.com/watch?v=HaEPXoXVf2k
@danwillm If you are not sure about the number of users and products, then your DB structure seems unfit for this situation, because there are size and length limitations on a Firestore document.
You should rather create separate collections for products and users, i.e. normalize your data, and keep a reference to the user in the product collection.
User:
{
  userId: documentId,
  name: "John",
  ...otherInfo
}
Product:
{
  productId: documentId,
  createdBy: userId,
  createdOn: date,
  productName: "exa",
  ...otherInfo
}
This way the size of each document stays bounded; try to avoid maps and arrays in Firestore if you are not sure about their size.
The number of queries will increase in this case, but you won't need many indexes.

Meteor MongoDB Filter Parent Records by Child Fields

How would I go about filtering a set of records based on their child records?
Let's say I have a collection Item with a field, bagId, that references another collection, Bag. I'd like to find all Items where a field on the Bag matches some clause.
I.e. db.Items.find( { "where bag.type:'Paper' " }). How would I go about doing this in MongoDB? I understand I'd have to join on Bags and then link where Item.bagId == Bag._id.
I used Studio3T to convert a SQL GROUP BY to a Mongo aggregate. I'm just wondering if there's any de facto way to do this. The options I see:
- Should I perform a data migration to simply include Bag.type on every Item document? (I don't want to get into the habit of making schema changes every time I want to sort/filter Items by Bag fields.)
- Use something like https://github.com/meteorhacks/meteor-aggregate (no luck with that syntax yet).
- Grapher, https://github.com/cult-of-coders/grapher. I played around with this briefly, and while it's cool I'm not sure it will actually solve my problem. I can use it to add Bag.type to every Item returned, but I don't see how that could help me filter every Item by Bag.type.
Is this just one of the tradeoffs of using a NoSQL dbms? What option above is recommended or are there any other ideas?
Thanks
You could use the $in functionality of MongoDB. It would look something like this:
const bagsIds = Bags.find({type: 'Paper'}, {fields: {"_id": 1}}).map(function(bag) { return bag._id; });
const items = Items.find( { bagId: { $in: bagsIds } } ).fetch();
It would take some testing to see whether this solution stays as reactive as you expect and whether it remains suitable for larger collections, compared with going for your first option and performing the migration.

Is MongooseJS Populate an Anti Pattern?

I’ve found multiple posts and guides praising the ability to do joins in Mongoose and MongoDB using the populate() method.
This makes me confused. If you want to do joins, shouldn’t you use an SQL database? Shouldn’t joins in MongoDB be a last resort?
Each object that is using populate() is required to do a second query to fetch that data. So if you fetch 100 items in a query, you need to do another 100 queries to fetch that data. It sounds like storing it as nested schemas is a way better idea where possible.
Am I wrong? Is populate() actually a great method that makes sense? Or am I right that it's a last-resort option for cases that should otherwise be avoided?
populate() doesn't send a find request for every child document per parent document.
It sends a single find with all child ObjectIds (of all parents!) in the filter.
Example (console output with mongoose.set('debug', true)):
Mongoose: parent.find({}, { fields: {} }) // was called with populate()
Mongoose: child.find({ _id: { '$in': [ ObjectId(A), ObjectId(B), ...] }})
It then presumably "joins" parents to children in Node.
So essentially, only one round trip (RTT) is added. To avoid even that where possible, I've denormalized some of my schemas for common use cases.
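The batching pattern visible in the debug output can be sketched in plain JavaScript as follows. This is not Mongoose's actual implementation, only an illustration of the idea: `populateBatched` and `findChildrenByIds` are hypothetical names, and the lookup function stands in for the single `$in` query.

```javascript
// Resolve the `children` id arrays of many parents with ONE lookup.
// `findChildrenByIds(ids)` stands in for child.find({ _id: { $in: ids } }).
function populateBatched(parents, findChildrenByIds) {
  // Collect every distinct child id referenced by any parent.
  const ids = [...new Set(parents.flatMap((parent) => parent.children))];
  // One lookup for all parents, mirroring the single `$in` query in the log.
  const byId = new Map(findChildrenByIds(ids).map((child) => [child._id, child]));
  // Stitch the children back onto copies of the parents, in memory.
  return parents.map((parent) => ({
    ...parent,
    children: parent.children.map((id) => byId.get(id)),
  }));
}
```

However many parents are passed in, the lookup function is called once, which is why the cost is one extra round trip rather than one query per parent.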
