Which is better in MongoDB: multiple indexes or multiple collections


I'm working on an application that authenticates users using 3rd party services (Facebook, Google, etc.). I give each user an internal id (uuid v4) which is associated with their 3rd party ids. Right now, my (mongoose) user document model looks something like this:
var user = new mongoose.Schema({
uuid: {type: String, required: true, unique: true, index: true, alias: 'userId'},
fbid: {type: String, required: false, index: true, alias: 'facebookId'},
gid: {type: String, required: false, index: true, alias: 'googleId'}
});
Because I can query on any of the IDs, I need indexes on all of them. I'm thinking that this can become an issue with a large number of users (or if I add more 3rd party logins such as Twitter, LinkedIn, etc.). Now, my question is whether this is the correct way to do this, or if there is a better solution.
One idea I had is having multiple collections, one per ID type. Something like this:
var user = new mongoose.Schema({
uuid: {type: String, required: true, unique: true, index: true, alias: 'userId'},
});
var facebookUser = new mongoose.Schema({
fbid: {type: String, required: false, index: true, alias: 'facebookId'},
userId: {type: mongoose.Schema.Types.ObjectId, ref: 'user'}
});
This has the advantage of not cluttering the user model and of making sharding easier. However, it means more queries to retrieve a user and even more to create one: check the facebookUser collection for an existing user; if there is none, create a new user and save it, then create a new facebookUser linked to that new user and save that too.
Which way is "better" (scales well, handles load, etc.)?

The main thing to consider with indexes is that they need to fit in memory. Whether you have three indexes in one collection or three collections with one index each is irrelevant (as far as the index is concerned). I would lean towards putting them all into one collection for ease of use.
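To illustrate the "ease of use" point, here is a minimal sketch of a single-collection lookup, where one query finds the user no matter which third-party ID you hold. The field names fbid and gid come from the question; the provider keys, the helper name buildProviderFilter and the model name User are assumptions made for illustration.

```javascript
// Hypothetical mapping from a login provider name to the schema field
// that stores that provider's ID (field names taken from the question).
const PROVIDER_FIELDS = new Map([
  ['facebook', 'fbid'],
  ['google', 'gid'],
]);

// Build the filter for a single-collection lookup by third-party ID.
function buildProviderFilter(provider, externalId) {
  const field = PROVIDER_FIELDS.get(provider);
  if (!field) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  // The schema stores these IDs as strings, so normalise the value.
  return { [field]: String(externalId) };
}

// Intended usage with a Mongoose model named User (an assumption):
//   const user = await User.findOne(buildProviderFilter('facebook', '12345'));
console.log(buildProviderFilter('google', 'abc'));
```

Adding another provider then only means one more indexed field and one more entry in the map, with no extra collection or second query.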

Related

MongoDB: How can populate reference, and delete element in array after based on the ID of the reference

So I have a situation where I need to delete elements in an array of reference / ObjectIds, but the delete condition will be based on a field in the reference.
For example, I have schemas like the following:
const UserSchema = new mongoose.Schema({
firstName: String,
lastName: String,
homeFeeds: [{type: mongoose.Schema.Types.ObjectId, required: true, ref: "Activity"}]
}); // "User" is the reference name
const ActivitySchema = new mongoose.Schema({
requester: {type: mongoose.Schema.Types.ObjectId, required: true, ref: "User"},
message: String,
recipient: {type: mongoose.Schema.Types.ObjectId, required: true, ref: "User"},
}); // "Activity" is the reference name
Now I need to delete some of the homeFeeds for a user, and the ones that should be deleted are those from a certain requester. That would require the homeFeeds field (an array of Activity references) to be populated first, and then updated with the $pull operator, with a condition that the Activity requester matches a certain user.
I do not want to read the data first and do the filtering in Nodejs/backend code, since the array can be very long.
Ideally I need something like:
await User.find({_id: ID})
.populate("homeFeeds", "requester")
.updateMany({
$pull: {
homeFeeds.requester: ID
}
});
But it does not work. I'd really appreciate it if anyone could help me out with this.
Thanks

MongoDB doesn't support $lookup in an update as of version 6.0.1:
MongoServerError: $lookup is not allowed to be used within an update.
That said, this has nothing to do with Mongoose's populate: populate doesn't depend on $lookup and instead fires additional queries to get the results. Therefore, even if you could achieve what you intend with populate, Mongoose would still fetch the large array into Node.js behind the scenes, which defeats your purpose.
However, you could raise an issue on Mongoose's official GitHub page and expect a response.
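One possible workaround, which is not part of the answer above and is only a sketch under assumptions: first fetch just the _id values of the Activity documents created by that requester, then $pull those ids from homeFeeds in a single update. The function names are hypothetical, and the User and Activity models are passed in as parameters so the sketch does not assume how they are exported. Model.distinct and Model.updateOne are regular Mongoose model methods.

```javascript
// Build the update that removes the given Activity ids from homeFeeds.
function buildHomeFeedsPull(activityIds) {
  return { $pull: { homeFeeds: { $in: activityIds } } };
}

// User and Activity are the Mongoose models from the question, passed in
// as parameters (how they are created and exported is an assumption).
async function removeFeedsByRequester(User, Activity, userId, requesterId) {
  // Fetch only the _id values of activities created by this requester.
  const activityIds = await Activity.distinct('_id', { requester: requesterId });
  if (activityIds.length === 0) {
    return null; // nothing to remove
  }
  // The server filters the array; homeFeeds is never loaded into Node.js.
  return User.updateOne({ _id: userId }, buildHomeFeedsPull(activityIds));
}
```

The trade-off is that the list of activity ids for the requester is read into the application instead of the homeFeeds array, so this only helps when that list is the smaller of the two.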

Biased Random w/ MongoDB and Javascript

Currently I am making a system where users submit images, which get put into a database through a Schema. Users can then use a command (through discord/discordjs) to pull a random biased/weighted image (and therefore document) from MongoDB and have it sent to them, with options to vote on the image or report it.
That is the idea. Here is where I am so far:
Users can submit images using a command through Discord, and it works.
The MongoDB document is made with these values:
const imageSchema = new Schema({
imageId: { type: Number, required: true, index: { unique: true } },
imageLocation: reqString,
votes: {type: Number, required: false, default: 0},
verified: {type: Number, required: false, default: 0},
})
Image of what it looks like in mongodb compass for more context
What I am completely stumped on is how to scan all the documents and pick one of the images at random using JavaScript, weighted by votes: not always THE highest voted image, but a higher number of votes on a document should make it show up more often.
Any suggestions on this would be great, code snippets explaining concepts would be good too.

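You can use the aggregation pipeline with the $sample stage.
Something like this should do the trick:
// Image is assumed to be the model compiled from imageSchema; aggregate() is a model method, not a schema method
Image.aggregate([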
{
$sample: {
size: 1 // <- how many random documents you want
}
}
])
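Note that $sample picks documents uniformly at random, so on its own it does not favour highly voted images. If you want the vote count to bias the pick, one option is to do the weighting in application code. The following is a minimal sketch, assuming the imageId and votes of the candidate documents have already been loaded into an array; the function name pickWeighted and the "votes + 1" weighting are assumptions, not something from the question.

```javascript
// Pick one image at random, with probability proportional to (votes + 1).
// The +1 lets images with zero votes still show up occasionally.
// `random` is injectable so the function can be tested deterministically.
function pickWeighted(images, random = Math.random) {
  if (images.length === 0) {
    return null;
  }
  // Negative vote counts are treated as zero.
  const weights = images.map((image) => Math.max(image.votes, 0) + 1);
  const total = weights.reduce((sum, weight) => sum + weight, 0);
  // Walk the list, subtracting each weight until the threshold is used up.
  let threshold = random() * total;
  for (let i = 0; i < images.length; i++) {
    threshold -= weights[i];
    if (threshold < 0) {
      return images[i];
    }
  }
  // Guard against floating-point rounding at the upper edge.
  return images[images.length - 1];
}

const sampleImages = [
  { imageId: 1, votes: 0 },
  { imageId: 2, votes: 8 },
];
console.log(pickWeighted(sampleImages).imageId);
```

With the sample data above, image 2 has weight 9 out of a total of 10, so it is picked far more often than image 1, yet image 1 can still appear. This approach reads every candidate into memory, so it suits moderately sized collections.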

If I want certain workflow governance for different users, do I have to incorporate that in my data model?

I'm new to coding and am running into an issue that conceptually has my brain in a pretzel. I'm going to try my best to explain it and will link to my github as well, but here it is.. The music app I'm building will have three different user "classifications" or ("Class" in my code):
Artists
Venue owners
Fans
The intention is that when a user signs up they will select a persona (similar to Bandcamp), which then determines their functionality.
Here's the workflow:
Everyone is a "user"
Users either
a) register their venue, or
b) register their band, or
c) sign up as a fan
Then, venues can create "events" for bands to play but bands cannot create events. Users can browse both venues and bands to see which events they've hosted/played. Fans can attend events.
I can figure out the controls with authorization but I want to make sure I'm setting up my data model correctly -- and this is where I get confused.
Using my artist model as an example, if an artist is also a user, how do I incorporate the user into my artist model? Would adding a "user_id" schema below create a circular reference?
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
const artistSchema = new Schema({
name: {
type: String,
required: [true, 'Artist must have a name']
},
genre: {
type: String
},
email: {
type: String,
required: [true, 'Contact email required']
},
location: {
type: String,
required: [true, 'Hometown (so you can be paired with local venues)']
},
})
module.exports = mongoose.model('Artist', artistSchema);
Here's a link to my github if you need more context on the four models. Thank you!!

MongoDB best practise - One to many relation

According to this post I should embed a "reference": MongoDB relationships: embed or reference?
User.js
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
const userSchema = new Schema({
email: {
type: String,
required: true
},
password: {
type: String,
required: true
},
createdEvents: ['Event']
});
module.exports = mongoose.model('User', userSchema);
Event.js
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
const eventSchema = new Schema({
title: {
type: String,
required: true
},
description: {
type: String,
required: true
},
price: {
type: Number,
required: true
},
date: {
type: Date,
required: true
}
});
module.exports = mongoose.model('Event', eventSchema);
So an embedded event looks like this in the database:
My code works, but I'm curious whether this is the right way to embed the event, because every example of one-to-many relations I find uses references rather than embedding.

From my experience, using embedding or referencing depends on how much data we are dealing with.
When deciding which approach to pick, you should always consider:
1- One-To-Few: if only a small number of events will be added to a user over time, I recommend the embedding approach, as it is simpler to deal with.
2- One-To-Many: if new events are added frequently, you should go for referencing to avoid future performance issues.
Why?
When you are frequently adding new events and they are embedded inside a user document, that document will grow larger and larger over time. In the future you will probably face issues like I/O overhead. You can catch a glimpse of the evils of large arrays in the article Why shouldn't I embed large arrays in my documents?. Although it's hard to find it written anywhere, large arrays in MongoDB are considered a performance anti-pattern.
If you decide to go for referencing, I suggest reading Building with Patterns: The Bucket Pattern. The article can give you an idea on how to design your user_events collection in a non-relational way.
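To make the two options concrete, here is a rough sketch of the document shapes using plain JavaScript objects rather than real database calls. The creator field on the event and the helper eventsCreatedBy are hypothetical names; with Mongoose the helper would correspond to a query along the lines of Event.find({ creator: userId }).

```javascript
// Embedded (one-to-few): the events live inside the user document,
// so the user document grows with every new event.
const embeddedUser = {
  _id: 'u1',
  email: 'a@example.com',
  createdEvents: [
    { title: 'Launch party', price: 10, date: new Date('2024-01-01') },
  ],
};

// Referenced (one-to-many): each event is its own document and points
// back to its creator. The `creator` field is a hypothetical addition.
const referencedUser = { _id: 'u1', email: 'a@example.com' };
const events = [
  { _id: 'e1', creator: 'u1', title: 'Launch party', price: 10 },
  { _id: 'e2', creator: 'u1', title: 'After party', price: 5 },
  { _id: 'e3', creator: 'u2', title: 'Other gig', price: 8 },
];

// In-memory stand-in for a query such as Event.find({ creator: userId }).
function eventsCreatedBy(allEvents, userId) {
  return allEvents.filter((event) => event.creator === userId);
}

console.log(eventsCreatedBy(events, referencedUser._id).map((event) => event.title));
```

In the referenced shape the user document stays the same size no matter how many events are created, which is the point of the One-To-Many advice above.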

Understanding Many-to-Many relationships in MongoDB and how to dereference collections

I've spent some time researching MongoDB alternatives for implementing many-to-many relationships, including several Stack Overflow articles (here and here) and these slides.
I am creating an app using the MEAN stack and I'm trying to get confirmation on my schema setup and best practices in dereferencing a collection of objects.
I have a basic many-to-many relationship between users and meetings (think scheduling meetings for users where a user can be in many meetings and a meeting contains several users).
Given my use case I think it's best that I use referencing rather than embedding. I believe (from what I've read) that it would be better to use embedding only if my meetings had users unique to a single meeting. In my case these same users are shared across meetings. Also, although updating users would be infrequent (e.g., change username, password) I still feel that using a reference feels right - although I'm open to opinions.
Assuming I went with references I have the following (simplified) schema:
var MeetingSchema = new Schema({
description: {
type: String,
default: '',
required: 'Please fill in a description for the meeting',
trim: true
},
location: {
type: String,
default: '',
required: 'Please fill in a location for the meeting',
trim: true
},
users: [ {
type: Schema.ObjectId,
ref: 'User'
} ]
});
var UserSchema = new Schema({
firstName: {
type: String,
trim: true,
default: '',
validate: [validateLocalStrategyProperty, 'Please fill in your first name']
},
lastName: {
type: String,
trim: true,
default: '',
validate: [validateLocalStrategyProperty, 'Please fill in your last name']
},
email: {
type: String,
trim: true,
default: '',
validate: [validateLocalStrategyProperty, 'Please fill in your email'],
match: [/.+\@.+\..+/, 'Please fill a valid email address']
},
username: {
type: String,
unique: true,
required: 'Please fill in a username',
trim: true
},
password: {
type: String,
default: '',
validate: [validateLocalStrategyPassword, 'Password should be longer']
}
});
First, you will notice that I don't have a collection of meetings in users. I decided not to add this collection because I believe I could use the power of a MongoDB find to obtain all meetings associated with a specific user - i.e.,
db.meetings.find({users:ObjectId('x123')});
Of course I would need to add some indexes.
Now if I'm looking to dereference my users for a specific meeting, how do I do that? For those who understand Rails and know the difference between :include and :join, I'm looking for a similar concept. I understand we are not dealing with joins in MongoDB, but in order to dereference the users collection from the meeting to get a user's first and last name, it seems I would need to cycle through the collection of ids and perform some sort of db.users.find() for each id. I assume there's some easy MongoDB call I can make to get this to occur in a performant way.

For a discussion of schema design in MongoDB, covering exactly this topic, I refer you to these postings on the MongoDB blog:
Part 1
Part 2
Part 3
In particular, look at the sample JavaScript code showing you how to do the application-level joins.
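As a rough idea of what such an application-level join can look like for the schemas in the question (this is a sketch, not the code from the linked posts): load the meeting, then fetch all referenced users with a single $in query instead of one find() per id. The function names are hypothetical, and the Meeting and User models are passed in as parameters so nothing is assumed about how they are exported.

```javascript
// Build the filter that fetches every user referenced by a meeting.
function buildAttendeesFilter(meeting) {
  return { _id: { $in: meeting.users } };
}

// Meeting and User are the Mongoose models for the schemas in the
// question, passed in as parameters (how they are exported is assumed).
async function loadMeetingWithUsers(Meeting, User, meetingId) {
  const meeting = await Meeting.findById(meetingId);
  if (!meeting) {
    return null;
  }
  // One query for all attendees instead of one find() per id,
  // projecting only the name fields.
  const attendees = await User.find(
    buildAttendeesFilter(meeting),
    'firstName lastName'
  );
  return { meeting, attendees };
}
```

With Mongoose, Meeting.findById(id).populate('users', 'firstName lastName') should give a similar result, since populate also issues a second query behind the scenes.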
