MongoDB/Mongoose query response is too slow - javascript

I'm new to MongoDB/Mongoose and I'm working with a large database (more than 25,000 docs). I need to configure different queries: by fields, the first 10 docs, and one by id. The problem is performance: the server response is too slow (about 10-15 seconds).
Please tell me how to configure this so that the server responds quickly.
Does it depend only on the schema settings, or can it also depend on other things, such as the database connection parameters or the query parameters?
P.S. Queries should be by 'district' and 'locality'.
Thanks for any help!
Here is the schema:
const mongoose = require('mongoose');
const Schema = mongoose.Schema;

const houseSchema = new Schema({
  code: {
    type: String,
    required: false
  },
  name: {
    type: String,
    required: true
  },
  district: {
    type: String,
    required: true
  },
  locality: {
    type: String,
    required: false
  },
  recountDate: {
    type: Date,
    default: Date.now
  },
  eventDate: {
    type: Date,
    default: Date.now
  },
  events: {
    type: Array,
    default: []
  }
});

module.exports = mongoose.model('House', houseSchema);
Connection parameters:
mongoose.connect(
  `mongodb+srv://${process.env.MONGO_USER}:${process.env.MONGO_PASSWORD}@cluster0-vuauc.mongodb.net/${process.env.MONGO_DB}?retryWrites=true&w=majority`,
  {
    useNewUrlParser: true,
    useUnifiedTopology: true
  }
).then(() => {
  console.log('Connection to database established...');
  app.listen(5555);
}).catch(err => {
  console.log(err);
});
Queries are performed using Relay:
query {
  viewer {
    allPosts(first: 10) {
      edges {
        node {
          id
          code
          district
          locality
          recountDate
          eventDate
          events
        }
      }
    }
  }
}

MongoDB is very fast at executing queries, but performance also depends on how you write them. To get the first 10 documents from a collection, sorted in descending order by _id, you need to use limit and sort in your query:
db.collectionName.find({}).limit(10).sort({_id:-1})
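Also, the question says the filters will be on 'district' and 'locality', but the schema above defines no indexes, so every such query scans all 25,000+ documents. A minimal sketch, assuming the schema file shown in the question, of a compound index on those two fields:

// In the schema file, before mongoose.model('House', houseSchema) is called:
// a compound index covering the fields the queries filter on.
houseSchema.index({ district: 1, locality: 1 });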

Make sure it's not a connection issue. Try running your query from the MongoDB shell:
mongo mongodb+srv://${process.env.MONGO_USER}:${process.env.MONGO_PASSWORD}@cluster0-vuauc.mongodb.net/${process.env.MONGO_DB}?retryWrites=true&w=majority
db.collection.find({condition}).limit(10)
If the MongoDB shell responds faster than Mongoose does:
There is a known issue with the Node.js driver: it uses a pure JavaScript BSON serializer, which can be slow at deserializing BSON into JavaScript objects.
Try installing bson-ext.
The bson-ext module is an alternative BSON parser written in C++. It delivers better deserialization performance, and similar or somewhat better serialization performance, than the pure JavaScript parser.
https://mongodb.github.io/node-mongodb-native/3.5/installation-guide/installation-guide/#bson-ext-module
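Installing it is a single command (the linked guide covers how the driver picks it up):

npm install bson-ext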

Use Projections to Return Only Necessary Data
When you need only a subset of fields from documents, you can achieve better performance by returning only the fields you need:
For example, if in your query on the posts collection you need only the timestamp, title, author, and abstract fields, you would issue the following command:
db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1} ).sort( { timestamp : -1 } ).limit(10)
You can read more about query optimization in the MongoDB documentation.
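In Mongoose, the same projection idea can be expressed with .select(), and .lean() skips hydrating full Mongoose documents, which also helps with large result sets. A sketch against the House model from the question (the filter values are placeholders):

House.find({ district: 'Some district', locality: 'Some locality' }) // placeholder values
  .select('code name district locality') // return only these fields (plus _id)
  .sort({ _id: -1 })
  .limit(10)
  .lean(); // plain objects instead of Mongoose documents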

Related

Dynamoose - cannot query and where with 2 keys

I have opened a related issue on GitHub, but maybe someone here will be able to help quicker.
Summary:
ValidationException: Query key condition not supported
I need to find records from the last (amount) seconds at a given location.
Pretty simple, but already related to other issues:
One and another one
WORKS:
Activity.query('locationId').eq(locationId).exec();
DOES NOT WORK:
Activity.query('locationId').eq(locationId).where('createdAt').ge(date).exec();
Code sample:
Schema
const Activity = (dynamoose: typeof dynamooseType) => dynamoose.model<ActivityType, {}>('Activity',
  new Schema({
    id: {
      type: String,
      default: () => {
        return uuid();
      },
      hashKey: true,
    },
    userId: {
      type: String,
    },
    locationId: {
      type: String,
      rangeKey: true,
      index: {
        global: true,
      },
    },
    createdAt: { type: Number, rangeKey: true, required: true, default: Date.now },
    action: {
      type: Number,
    },
  }, {
    expires: 60 * 60 * 24 * 30 * 3, // activity logs expire after 3 months
  }));
Code which executes the function
The funny part is that I found this proposed as a workaround to be used until they merge the PR that adds the ability to specify timestamps as keys, but unfortunately it does not work.
async getActivitiesInLastSeconds(locationId: string, timeoutInSeconds: number) {
  const Activity = schema.Activity(this.dynamoose);
  const date = moment().subtract(timeoutInSeconds, 'seconds').valueOf();
  return await Activity.query('locationId').eq(locationId)
    .where('createdAt').ge(date).exec();
}
I suspect createdAt is not a range key of your table / index. You need to either do .filter('createdAt').ge(date) or modify your table / index schema.
I'm pretty sure the problem is that by specifying rangeKey: true on the createdAt property, you are telling Dynamoose to use it as the table's range key (I don't think that is the correct term), so it gets linked to the id property instead of your global index.
I believe the easiest solution would be to change your locationId index to be something like the following:
index: {
  global: true,
  rangeKey: 'createdAt',
},
That way you are being very explicit about which index you want to set createdAt as the rangeKey for.
After making that change please remember to sync your changes with either your local DynamoDB server or the actual DynamoDB service, so that the changes get populated both in your code and on the database system.
Hopefully this helps! If it doesn't fix your problem please feel free to comment and I'll help you further.
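Putting that together, a sketch of the revised attributes (note that rangeKey: true moves off createdAt, since it now belongs to the index definition):

locationId: {
  type: String,
  index: {
    global: true,
    rangeKey: 'createdAt', // the index's sort key, which .where('createdAt') queries against
  },
},
createdAt: { type: Number, required: true, default: Date.now },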

Mongoose's lean usage with populate and nested queries

I'm coding an app in Node.js which uses MongoDB, and I chose MongooseJS to handle my DB queries.
I have two collections that reference each other: Room, which is the 'superior' collection, and DeviceGroup, which is referenced from the Room collection.
I have a query that gets a list of all of the rooms from the Room collection and populates the deviceGroups field (the Room's reference to the DeviceGroup collection). Inside it there is a map that goes through every room found and, for every room, makes another query: it looks for any device groups in the DeviceGroup collection that reference the current room.
My goal here is to return a list of all of the rooms with the deviceGroups field filled in with actual data, not just references.
What I am getting after the queries (inside the then method) is a Mongoose document. The whole algorithm is used as the handler of a GET method, so I need a pure JavaScript object as the response.
The main goal I want to achieve is to get the result of all of the queries, and the population inside them, as a pure JavaScript object, so I can create a response object and send it (I don't want to send everything the DB returns, because not all of the data is needed in this case).
EDIT:
I am so sorry, I had deleted my code and didn't realize it.
My current code is below:
Schema:
const roomSchema = Schema({
  name: {
    type: String,
    required: [true, 'Room name not provided']
  },
  deviceGroups: [{
    type: Schema.Types.ObjectId,
    ref: 'DeviceGroup'
  }]
}, { collection: 'rooms' });

const deviceGroupSchema = Schema({
  parentRoomId: {
    type: Schema.Types.ObjectId,
    ref: 'Room'
  },
  groupType: {
    type: String,
    enum: ['LIGHTS', 'BLINDS', 'ALARM_SENSORS', 'WEATHER_SENSORS']
  },
  devices: [
    {
      type: Schema.Types.ObjectId,
      ref: 'LightBulb'
    }
  ]
}, { collection: 'deviceGroups' });
Queries:
app.get('/api/id/rooms', function(req, res) {
  Room.find({}).populate('deviceGroups').lean().exec(function(err, parentRoom) {
    parentRoom.map(function(currentRoom) {
      DeviceGroup.findOne({ parentRoomId: currentRoom._id }, function(err, devices) {
        return devices;
      });
    });
  }).then(function(roomList) {
    res.send(roomList);
  });
});
Where are you confused? Here is a simple and effective code snippet:
Room.findById(req.params.id)
  .select('roomname')
  .populate({
    path: 'deviceGroup',
    select: 'devicename',
    model: 'DeviceGroups',
    populate: {
      path: 'device',
      select: 'devicename',
      model: 'Device'
    }
  })
  .lean()
  .exec((err, data) => {
    console.log(data);
  });
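Adapted to the question's own schemas (the path and model names below come from roomSchema and deviceGroupSchema above), a sketch that fetches every room with its device groups, and their devices, populated in one query, returned as plain objects thanks to .lean():

app.get('/api/id/rooms', function(req, res) {
  Room.find({})
    .populate({
      path: 'deviceGroups', // ref: 'DeviceGroup' in roomSchema
      populate: { path: 'devices', model: 'LightBulb' } // nested ref in deviceGroupSchema
    })
    .lean() // plain JavaScript objects, not Mongoose documents
    .exec(function(err, rooms) {
      if (err) return res.status(500).send(err.message);
      res.send(rooms); // trim unneeded fields here before sending, if desired
    });
});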

Conditionally Update/Insert and Add To Array

I have a .tsv file with some order information. After transforming it in my script, I got this:
[{"order":"5974842dfb458819244adbf7","name":"Сергей Климов","email":"wordkontent#gmail.com"},
{"order":"5974842dfb458819244adbf8","name":"Сушков А.В.","email":"mail#wwwcenter.ru"},
{"order":"5974842dfb458819244adbf9","name":"Виталий","email":"wawe2012#mail.ru"},
...
and so on
I have a Mongoose schema:
var ClientSchema = mongoose.Schema({
  name: {
    type: String
  },
  email: {
    type: String,
    unique: true,
    required: true,
    index: true
  },
  forums: {
    type: String
  },
  other: {
    type: String
  },
  status: {
    type: Number,
    default: 3
  },
  subscribed: {
    type: Boolean,
    default: true
  },
  clienturl: {
    type: String
  },
  orders: {
    type: [String]
  }
});
clienturl is an 8-character password generated by a function.
module.exports.arrayClientSave = function(clientsArray, callback) {
  let newClientsArray = clientsArray
    .map(function(x) {
      var randomstring = Math.random().toString(36).slice(-8);
      x.clienturl = randomstring;
      return x;
    });
  console.log(newClientsArray);
  Client.update( ??? , callback );
}
But I don't understand how to build the update. If the email already exists, push to the orders array but don't rewrite the other fields; if the email doesn't exist, save a new user with clienturl and so on. Thanks!
Probably the best way to handle this is via .bulkWrite(), which is a MongoDB method for sending "multiple operations" in a "single" request with a "single" response. This removes the need to manage a separate async call and response for each "looped" item.
module.exports.arrayClientSave = function(clientsArray, callback) {
  let newClientsArray = clientsArray
    .map(x => {
      var randomstring = Math.random().toString(36).slice(-8);
      x.clienturl = randomstring;
      return x;
    });
  console.log(newClientsArray);

  let ops = newClientsArray.map(x => (
    { "updateOne": {
      "filter": { "email": x.email },
      "update": {
        "$addToSet": { "orders": x.order },
        "$setOnInsert": {
          "name": x.name,
          "clientUrl": x.clienturl
        }
      },
      "upsert": true
    }}
  ));

  Client.bulkWrite(ops, callback);
};
The main idea is to use MongoDB's "upsert" functionality to drive the "create or update" behavior: $addToSet only appends the order to the "orders" array when it is not already present, and $setOnInsert only takes effect when the operation actually results in an "upsert", not when it matches an existing document.
Also, applied within .bulkWrite(), this becomes a "single async call" when talking to a MongoDB server that supports it, which is any version from MongoDB 2.6 onward.
However, the main point of the .bulkWrite() API is that it "detects" whether the connected server actually supports "Bulk" operations. When it does not, it "downgrades" to individual "async" calls instead of one batch, but this is controlled by the "driver", which still interacts with your code as if it were one request and response.
This means all the difficulty of dealing with the "async loop" is handled in the driver software itself, either negated by the supported method or "emulated" in a way that keeps your code simple.
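A hypothetical usage sketch (parsedRows stands in for the array built from the .tsv file; the counter fields come from the driver's BulkWriteResult and may vary by driver version):

arrayClientSave(parsedRows, function(err, result) {
  if (err) return console.error(err);
  // upsertedCount = new clients created; modifiedCount = existing clients
  // whose orders array gained a new entry.
  console.log('upserted:', result.upsertedCount, 'modified:', result.modifiedCount);
});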

Sails.js not applying model schema when using MongoDB

I'm going through the (excellent) Sails.js book, which discusses creating a User model User.js in Chapter 6 like so:
module.exports = {
  connection: "needaword_postgresql",
  migrate: 'drop',
  attributes: {
    email: {
      type: 'string',
      email: "true",
      unique: 'string'
    },
    username: {
      type: 'string',
      unique: 'string'
    },
    encryptedPassword: {
      type: 'string'
    },
    gravatarURL: {
      type: 'string'
    },
    deleted: {
      type: 'boolean'
    },
    admin: {
      type: 'boolean'
    },
    banned: {
      type: 'boolean'
    }
  },
  toJSON: function() {
    var modelAttributes = this.toObject();
    delete modelAttributes.password;
    delete modelAttributes.confirmation;
    delete modelAttributes.encryptedPassword;
    return modelAttributes;
  }
};
Using Postgres, a new record correctly populates the boolean fields not submitted by the login form as null, as the book suggests should be the case (screenshot omitted).
But I want to use MongoDB instead of PostgreSQL. I had no problem switching the adapter. But now, when I create a new record, it appears to ignore the schema in User.js and just puts the literal POST data into the DB (screenshot omitted).
I understand that MongoDB is NoSQL and can take any parameters, but I was under the impression that using a schema in Users.js would apply to a POST request to the /user endpoint (via the blueprint routes for now) regardless of what database was sitting at the bottom. Do I need to somehow explicitly tie the model to the endpoint for NoSQL databases?
(I've checked the records that are created in Postgres and MongoDB, and they match the responses from localhost:1337/user posted above)
I understand that MongoDB is NoSQL
Good! In Sails, the sails-mongo Waterline module is responsible for everything regarding MongoDB. I think I found the relevant code: https://github.com/balderdashy/sails-mongo/blob/master/lib/document.js#L95 So sails-mongo simply does not care about nonexistent values. If you think this is bad, feel free to create an issue on the GitHub page.
A possible workaround might be using defaultsTo:
banned: {
  type: "boolean",
  defaultsTo: false
}
You can configure your model to strictly use the schema with this flag:
module.exports = {
  schema: true,
  attributes: {
    ...
  }
}
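Combining both answers, a sketch of the model with a strict schema plus explicit defaults, so MongoDB records come out with the same shape Postgres produced (the connection name here is hypothetical):

module.exports = {
  connection: "needaword_mongodb", // hypothetical MongoDB connection
  migrate: 'drop',
  schema: true, // strip attributes not declared below
  attributes: {
    // ... email, username, etc. as in the question ...
    deleted: { type: 'boolean', defaultsTo: false },
    admin: { type: 'boolean', defaultsTo: false },
    banned: { type: 'boolean', defaultsTo: false }
  }
};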
I eventually settled on performing the validations inside my controller.
// a signup form
create: async (req, res) => {
  const { name, email, password } = req.body;
  try {
    const userExists = await sails.models.user.findOne({ email });
    if (userExists) {
      throw 'That email address is already in use.';
    }
    // ... proceed with creating the user here ...
  } catch (err) {
    return res.badRequest(err); // reject the signup with a 400 if validation fails
  }
}

MongoDB query on populated fields

I have models called "Activities" that I am querying for (using Mongoose). Their schema looks like this:
var activitySchema = new mongoose.Schema({
  actor: {
    type: mongoose.Schema.ObjectId,
    ref: 'User',
    required: true
  },
  recipient: {
    type: mongoose.Schema.ObjectId,
    ref: 'User'
  },
  timestamp: {
    type: Date,
    default: Date.now
  },
  activity: {
    type: String,
    required: true
  },
  event: {
    type: mongoose.Schema.ObjectId,
    ref: 'Event'
  },
  comment: {
    type: mongoose.Schema.ObjectId,
    ref: 'Comment'
  }
});
When I query for them, I am populating the actor, recipient, event, and comment fields (all the references). After that, I also deep-populate the event field to get event.creator. Here is my code for the query:
var activityPopulateObj = [
      { path: 'event' },
      { path: 'event.creator' },
      { path: 'comment' },
      { path: 'actor' },
      { path: 'recipient' },
      { path: 'event.creator' }
    ],
    eventPopulateObj = {
      path: 'event.creator',
      model: User
    };
Activity.find({
    $or: [
      { recipient: user._id },
      { actor: { $in: user.subscriptions } },
      { event: { $in: user.attending } }
    ],
    actor: { $ne: user._id }
  })
  .sort({ _id: -1 })
  .populate(activityPopulateObj)
  .exec(function(err, retrievedActivities) {
    if (err || !retrievedActivities) {
      deferred.reject(new Error("No events found."));
    }
    else {
      User.populate(retrievedActivities, eventPopulateObj, function(err, data) {
        if (err) {
          deferred.reject(err.message);
        }
        else {
          deferred.resolve(retrievedActivities);
        }
      });
    }
  });
This is already a relatively complex query, but I need to do even more. If it hits the part of the $or statement that says {actor: {$in: user.subscriptions}}, I also need to make sure that the event's privacy field is equal to the string public. I tried using $elemMatch, but since the event has to be populated first, I couldn't query any of its fields. I need to achieve this same goal in multiple other queries, as well.
Is there any way for me to achieve this further filtering like I have described?
The answer is to change your schema.
You've fallen into the trap that many devs have before you when coming into document database development from a history of using relational databases: MongoDB is not a relational database and should not be treated like one.
You need to stop thinking about foreign keys and perfectly normalized data and instead, keep each document as self-contained as possible, thinking about how to best embed relevant associated data within your documents.
This doesn't mean you can't maintain associations as well. It might mean a structure like this, where you embed only necessary details, and query for the full record when needed:
var activitySchema = new mongoose.Schema({
  event: {
    _id: { type: mongoose.Schema.Types.ObjectId, ref: "Event" },
    name: String,
    private: String
  },
  // ... other fields
});
Rethinking your embed strategy will greatly simplify your queries and keep the query count to a minimum. populate will blow your count up quickly, and as your dataset grows this will very likely become a problem.
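With the embedded shape above, the privacy check the question asks for becomes a plain field match in the same query, with no populate step; a sketch reusing the original $or conditions ('event.private' is the embedded field from the schema sketch above):

Activity.find({
  $or: [
    { recipient: user._id },
    { actor: { $in: user.subscriptions }, 'event.private': 'public' },
    { 'event._id': { $in: user.attending } }
  ],
  actor: { $ne: user._id }
}).sort({ _id: -1 });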
You can try an aggregation instead. Look at this answer: https://stackoverflow.com/a/49329687/12729769
Then you can use the fields created by $addFields in your query, like:
{ score: { $gte: 5 } }
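For completeness, a sketch of the $lookup approach, which joins the referenced events server-side so the privacy filter can be applied in one round trip ('events' assumes Mongoose's default collection name for the Event model; note that aggregate() does not cast strings to ObjectId, so pass real ObjectIds):

Activity.aggregate([
  { $match: { actor: { $ne: user._id } } },
  { $lookup: { from: 'events', localField: 'event', foreignField: '_id', as: 'eventDoc' } },
  { $unwind: '$eventDoc' },
  { $match: { $or: [
    { recipient: user._id },
    { actor: { $in: user.subscriptions }, 'eventDoc.privacy': 'public' },
    { 'eventDoc._id': { $in: user.attending } }
  ] } },
  { $sort: { _id: -1 } }
]);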
but since the event has to be populated first, I couldn't query any of its fields.
No can do. MongoDB cannot join collections within a find(): each query works with exactly one collection at a time. And FYI, all those Mongoose populate calls are additional, separate database queries that load those records.
I don't have time to dive into the details of your schema and application, but most likely you will need to denormalize your data and store a copy of whatever event fields you need to join on in the primary collection.
