Biased Random w/ MongoDB and Javascript - javascript

I am building a system where users submit images, which are stored in a database via a Schema. Users can then run a command (through discord/discordjs) to pull a random biased/weighted image (and therefore document) from MongoDB and have it sent to them, with options to vote on the image or report it.
That is the idea. Here is where I am so far:
Users can submit images using a command through Discord, and it works.
The MongoDB document is made with these values:
const imageSchema = new Schema({
  imageId: { type: Number, required: true, index: { unique: true } },
  imageLocation: reqString,
  votes: { type: Number, required: false, default: 0 },
  verified: { type: Number, required: false, default: 0 },
})
Image of what it looks like in mongodb compass for more context
What I am completely stumped on is how to scan all the documents and pick one of the images at random using JavaScript, weighted by votes: not simply THE most highly voted image, but a pick where a higher number of votes on a document makes it show up more often.
Any suggestions on this would be great, code snippets explaining concepts would be good too.

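You can use the aggregation pipeline with the $sample operator. Note that $sample picks documents uniformly at random, so on its own it does not weight the pick by votes.
Something like this should do the trick:
// aggregate() is called on the model compiled from imageSchema, not on the schema itself.
// `Image` is an assumed model name, e.g. mongoose.model('Image', imageSchema).
Image.aggregate([
  {
    $sample: {
      size: 1 // <- how many random documents you want
    }
  }
])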
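Since $sample alone is uniform, one possible way to get the vote-weighted pick the question asks for is to fetch only the ids and vote counts and do the weighted draw in JavaScript. The following is a minimal sketch, not a definitive implementation: `pickWeighted`, `voteWeight`, and the `votes + 1` weighting (chosen so that images with zero votes can still appear) are assumptions made for illustration, as is the `Image` model name in the usage comment.

```javascript
// Pick one item with probability proportional to its weight.
// `weightOf` and `rng` are illustrative parameters, not part of any library.
function pickWeighted(items, weightOf, rng = Math.random) {
  const weights = items.map(weightOf);
  const total = weights.reduce((sum, w) => sum + w, 0);
  if (items.length === 0 || total <= 0) return null;

  let threshold = rng() * total; // uniform point in [0, total)
  for (let i = 0; i < items.length; i++) {
    threshold -= weights[i];
    if (threshold < 0) return items[i];
  }
  return items[items.length - 1]; // guard against floating-point drift
}

// Assumed weighting: votes + 1, so images with zero votes can still be drawn.
// Negative vote totals are clamped to zero before adding 1.
const voteWeight = (doc) => Math.max(doc.votes, 0) + 1;

// Hypothetical usage with a Mongoose model named `Image`:
//   const docs = await Image.find({}, 'imageId votes').lean();
//   const chosen = pickWeighted(docs, voteWeight);
```

This loads one small projection per image, which is fine for modest collections; for very large collections a server-side approach would probably be preferable.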

Related

MongoDB: How can populate reference, and delete element in array after based on the ID of the reference

So I have a situation where I need to delete elements in an array of reference / ObjectIds, but the delete condition will be based on a field in the reference.
For example, I have schemas like the following:
const UserSchema = new mongoose.Schema({
firstName: String,
lastName: String,
homeFeeds: [{type: Schema.Types.ObjectId, required: true, ref: "Activity"}]
}); // User is the reference name
const ActivitySchema = new mongoose.Schema({
requester: {type: Schema.Types.ObjectId, required: true, ref: "User"},
message: String,
recipient: {type: Schema.Types.ObjectId, required: true, ref: "User"},
}) // Activity is the reference name
Now I need to delete some of the homeFeeds for a user, and the ones that should be deleted need to be by certain requester. That'll require the homeFeeds (array of 'Activity's) field to be populated first, and then update it with the $pull operator, with a condition that the Activity requester matches a certain user.
I do not want to read the data first and do the filtering in Nodejs/backend code, since the array can be very long.
Ideally I need something like:
await User.find({_id: ID})
.populate("homeFeeds", "requester")
.updateMany({
$pull: {
homeFeeds.requester: ID
}
});
But it does not work. I'd really appreciate it if anyone could help me out with this.
Thanks
MongoDB doesn't support $lookup in update as of version v6.0.1.
MongoServerError: $lookup is not allowed to be used within an update.
This has nothing to do with Mongoose's populate, though: populate doesn't depend on $lookup and instead fires additional queries to get the results. Therefore, even if you could achieve what you intend (avoiding fetching a large array in the Node.js backend), Mongoose would do the same fetching for you behind the scenes, which defeats your purpose.
You could, however, raise an issue on Mongoose's official GitHub page and ask there.
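One workaround that avoids loading the homeFeeds array is to split the work into two queries: first collect the ids of the activities created by the requester, then $pull those ids from the user's array on the server. The sketch below assumes the `User` and `Activity` models from the question; the function names and the idea of passing the models in as parameters are illustrative only.

```javascript
// Build the $pull update that removes the given Activity ids from homeFeeds.
function buildHomeFeedPull(activityIds) {
  return { $pull: { homeFeeds: { $in: activityIds } } };
}

// `User` and `Activity` are the Mongoose models from the question, passed in
// so this sketch does not depend on how the application exports them.
async function removeFeedsByRequester(User, Activity, userId, requesterId) {
  // Query 1: ids of all activities created by the requester.
  const activityIds = await Activity.distinct('_id', { requester: requesterId });
  // Query 2: the server removes the matching references from the array.
  return User.updateOne({ _id: userId }, buildHomeFeedPull(activityIds));
}
```

The user's homeFeeds array never travels to the application, but the list of the requester's activity ids does, so this only helps if that list is reasonably small.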

Mongoose, $pull an element from nested array and update document based on the presence of the element

I am working on a mongoose schema similar to this:
const actionSchema = {
actions: {
type: [{
actionName: {
type: String,
required: true
},
count: {
type: Number,
default: 0,
required: true
},
users: [{
type: Schema.Types.ObjectId,
ref: 'User'
}]
}]
}};
It is nested inside a post schema.
Here, actions are generated dynamically; the number of people who performed each action is tracked by count, and their identities are tracked by the users array.
As you can see, actions is an array of objects, each of which contains a users array.
I want to check whether a given user id is present in any of the action objects and, if so, remove it from that array and also decrement the count.
Being totally new to mongoose and mongodb, the one simple way I see is to find the post that has to be updated using Post.findById(), run JS loops, update the post and call .save(). But that can be very costly when the users array has thousands of user ids.
I tried .update() but can't understand how to use it in this case.
How about adding a method to the Post model (like postSchema.methods.removeUserAction)? This gives access to the document through this and allows me to update the document and then call .save(). Does it load the full document into the client Node application?
So please suggest the right way.
Thank you.
You should simplify your model, for example:
// Model - Actions Model
const actionSchema = {
actionName: {
type: String,
required: true
},
user: {
type: Schema.Types.ObjectId,
ref: 'User'
}
};
Then you can easily get the total number of actions via Model.countDocuments(), get the count for a specific action with Model.countDocuments({ actionName: 'action name' }), and remove entries with Model.deleteMany(condition). That is, unless there's a reason why you have it modeled this way.
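If the nested schema from the question has to stay, a single server-side update can remove the user and decrement the count without loading the document. The sketch below assumes MongoDB 3.6 or later (for arrayFilters and the filtered positional operator) and a model named `Post`; the helper name `buildRemoveUserAction` and the identifier `hit` are made up for illustration.

```javascript
// Build the pieces of an update that removes `userId` from every action that
// contains it and decrements that action's count by one.
function buildRemoveUserAction(postId, userId) {
  return {
    // Only match the post if the user appears in at least one action.
    filter: { _id: postId, 'actions.users': userId },
    update: {
      $pull: { 'actions.$[hit].users': userId },
      $inc: { 'actions.$[hit].count': -1 },
    },
    // `hit` selects only the action elements whose users array contains the user.
    options: { arrayFilters: [{ 'hit.users': userId }] },
  };
}

// Hypothetical usage with a Mongoose model named `Post`:
//   const { filter, update, options } = buildRemoveUserAction(postId, userId);
//   await Post.updateOne(filter, update, options);
```

Because the array filter only touches actions that actually contain the user, counts of other actions stay unchanged.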

MongoDB best practise - One to many relation

According to this post I should embed a "reference": MongoDB relationships: embed or reference?
User.js
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
const userSchema = new Schema({
email: {
type: String,
required: true
},
password: {
type: String,
required: true
},
createdEvents: ['Event']
});
module.exports = mongoose.model('User', userSchema);
Event.js
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
const eventSchema = new Schema({
title: {
type: String,
required: true
},
description: {
type: String,
required: true
},
price: {
type: Number,
required: true
},
date: {
type: Date,
required: true
}
});
module.exports = mongoose.model('Event', eventSchema);
So an embedded event looks like this in the database:
My code works, but I'm curious whether this is the right way to embed the event, because every example of a one-to-many relation I have seen uses references rather than embedding.
From my experience, the choice between embedding and referencing depends on how much data we are dealing with.
When deciding which approach to pick, you should always consider:
1- One-To-Few: if only a small number of events will be added to a user over time, I recommend the embedding approach, as it is simpler to deal with.
2- One-To-Many: if new events are added frequently, you should go for referencing to avoid future performance issues.
Why?
When you frequently add new events and they are embedded inside a user document, that document grows larger and larger over time. In the future you will probably face issues like I/O overhead. You can catch a glimpse of the evils of large arrays in the article Why shouldn't I embed large arrays in my documents?. Although it's hard to find it written anywhere, large arrays in MongoDB are considered a performance anti-pattern.
If you decide to go for referencing, I suggest reading Building with Patterns: The Bucket Pattern. The article can give you an idea on how to design your user_events collection in a non-relational way.
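For comparison with the embedded version in the question, a referencing layout could look roughly like the sketch below, where each event points back to its creator instead of the user holding an ever-growing array. The `creator` field, the `defineReferencedModels` helper, and passing `mongoose` in as a parameter are assumptions for illustration; the remaining event fields from the question are abbreviated to `title`.

```javascript
// `mongoose` is passed in so the sketch stays independent of how the app loads it.
function defineReferencedModels(mongoose) {
  const { Schema } = mongoose;

  // The user document no longer carries a createdEvents array.
  const userSchema = new Schema({
    email: { type: String, required: true },
    password: { type: String, required: true },
  });

  // Each event stores a reference to the user who created it (assumed field name).
  const eventSchema = new Schema({
    title: { type: String, required: true },
    creator: { type: Schema.Types.ObjectId, ref: 'User', required: true, index: true },
  });

  return {
    User: mongoose.model('User', userSchema),
    Event: mongoose.model('Event', eventSchema),
  };
}

// Hypothetical usage:
//   const { User, Event } = defineReferencedModels(require('mongoose'));
//   const events = await Event.find({ creator: user._id });
```

With this layout the user document stays a fixed size, and the index on `creator` keeps the per-user lookup cheap.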

Which is better in MongoDB: multiple indexes or multiple collections

I'm working on an application that authenticates users using 3rd party services (Facebook, Google, etc.). I give each user an internal id (uuid v4) which is associated with their 3rd party ids. Right now, my (mongoose) user document model looks something like this:
var user = new mongoose.Schema({
uuid: {type: String, required: true, unique: true, index: true, alias: 'userId'},
fbid: {type: String, required: false, index: true, alias: 'facebookId'},
gid: {type: String, required: false, index: true, alias: 'googleId'}
});
Because I can query on any of the IDs, I need indexes on all of them. I'm thinking that this can become an issue with a large number of users, or if I add more 3rd party logins (Twitter, LinkedIn, etc.). Now, my question is whether this is the correct way to do this, or if there is a better solution.
One idea I had is having multiple collections, one per ID type. Something like this:
var user = new mongoose.Schema({
uuid: {type: String, required: true, unique: true, index: true, alias: 'userId'},
});
var facebookUser = new mongoose.Schema({
fbid: {type: String, required: false, index: true, alias: 'facebookId'},
userId: {type: Schema.Types.ObjectId, ref: 'user'}
});
This has the advantage of not cluttering the user model and easier sharding, however it means more queries to retrieve a user and even more to create a new user (1. check in facebookUser collection if a user exists, if not, create a new user, save it, then create a new facebookUser with a link towards that new user and then save that).
Which way is "better" (scales well, handles load, etc.)?
The main thing to consider with indexes is that they need to fit in memory. Whether you have three indexes in one collection or three collections with one index each is irrelevant (as far as the indexes are concerned). I would lean towards putting them all into one collection for ease of use.

Use Date as the ID in MongoDB?

I intend to log use of the API (user/route/params/time) in a Heroku/Node/Express/MongoDB web app, to allow various analytics (who/what/when/how often). One way I can think of is to push those records to MongoDB.
Mongo will generate an ID automatically, and I see that it's possible to extract the created time from the autogenerated ID, but since the time stamp is all I want, now I wonder if I can use the date as the ID?
This seems to work, and the timestamps seem granular enough ("_id" : ISODate("2012-11-30T21:18:24.484Z")) that they'll be unique. Is this okay or just asking for an "ID not unique" error just when things get going?
var apilogSchema = new mongoose.Schema({
_id: {type: Date, default: Date.now},
userId: {type: mongoose.Schema.Types.ObjectId, required: false},
route: {type: String, required: false}
})
Get date and time from mongodb document _id field
As Johnny said, I wouldn't recommend that, especially when you are using it to log user actions: it is possible for two users to perform an action in the same millisecond, and then your ID is no longer unique.
If you are on Node.js and want your ID to contain a timestamp, check out node-uuid.
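If the default ObjectId is kept as the _id, the creation time mentioned in the question can still be recovered from it, since the first 4 bytes of an ObjectId hold the creation time in seconds since the Unix epoch. The helper below is a minimal sketch working on the 24-character hex string; `objectIdToDate` is a made-up name, and the driver's own ObjectId objects already offer getTimestamp() for the same purpose.

```javascript
// Recover the creation time embedded in an ObjectId's 24-character hex string.
function objectIdToDate(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new TypeError('expected a 24-character hex ObjectId string');
  }
  // The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Hypothetical usage with a document loaded through Mongoose:
//   const createdAt = objectIdToDate(doc._id.toString());
```

The embedded time has one-second resolution only, so if millisecond precision matters for the analytics, a separate Date field alongside the default _id is probably the safer choice.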
