I'm using MongoDB with a Node.js REST service that exposes the data stored inside. I have a question about how to query data that uses $ref.
Here is a sample of an object which contains a reference to another object (detail) in another collection:
{
    "_id" : ObjectId("5962c7b53b6a02100a000085"),
    "Title" : "test",
    "detail" : {
        "$ref" : "ObjDetail",
        "$id" : ObjectId("5270c7b11f6a02100a000001")
    },
    "foo" : "bar"
}
Currently, using Node.js and the mongodb module, I do the following:
db.collection("Obj").findOne({"_id" : new ObjectID("5962c7b53b6a02100a000085"},
function(err, item) {
db.collection(item.$ref).findOne({"_id" : item.$id}, function(err,subItem){
...
});
});
In fact I make two queries and get two objects. It's a kind of "lazy loading" (not exactly, but almost).
My question is simple : is it possible to retrieve the whole object graph in one query ?
Thank you
No, you can't.
To resolve DBRefs, your application must perform additional queries to return the referenced documents. Many drivers have helper methods that form the query for the DBRef automatically. The drivers do not automatically resolve DBRefs into documents.
From the MongoDB docs http://docs.mongodb.org/manual/reference/database-references/.
Is it possible to fetch the parent object along with its $ref using a single MongoDB query?
No, it's not possible.
MongoDB has no built-in support for refs, so it's up to your application to populate them (see Brett's answer).
But is it possible to fetch the parent object with all its refs with a single Node.js command?
Yes, it's possible. You can do it with Mongoose. It has built-in ref population support. You'll need to change your data model a little bit to make it work, but it's pretty much what you're looking for. Of course, to do so Mongoose will make the same two MongoDB queries that you did.
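For illustration, a minimal sketch of such a model change, assuming Mongoose models named Obj and ObjDetail (the field names mirror the question's document; the rest is illustrative):

const mongoose = require('mongoose');

const objDetailSchema = new mongoose.Schema({
    description: String // assumed field
});

const objSchema = new mongoose.Schema({
    Title: String,
    // a plain ObjectId with a ref replaces the DBRef
    detail: { type: mongoose.Schema.Types.ObjectId, ref: 'ObjDetail' },
    foo: String
});

const ObjDetail = mongoose.model('ObjDetail', objDetailSchema);
const Obj = mongoose.model('Obj', objSchema);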
Vladimir's answer is no longer valid, as the db.dereference method was removed from the MongoDB Node.js driver API:
https://www.mongodb.com/blog/post/introducing-nodejs-mongodb-20-driver
The db instance object has been simplified. We've removed the following methods:
db.dereference due to db references being deprecated in the server
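If you still need that behaviour on the 2.x driver, a manual replacement is only a few lines. This is a sketch, assuming the deserialized DBRef exposes the referenced collection name as namespace (older bson versions) or collection (newer ones), along with oid:

// minimal stand-in for the removed db.dereference
function dereference(db, dbRef, callback) {
    const collectionName = dbRef.namespace || dbRef.collection;
    db.collection(collectionName).findOne({ _id: dbRef.oid }, callback);
}

// usage: dereference(db, item.detail, function (err, detail) { ... });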
No, very few drivers for MongoDb include special support for a DBRef. There are two reasons:
MongoDb doesn't have any special commands to make retrieval of referenced documents possible. So, drivers that do add support are artificially populating the resulting objects.
The more, "bare metal" the API, the less it makes sense. In fact, as. MongoDb collections are schema-less, if the NodeJs driver brought back the primary document with all references realized, if the code then saved the document without breaking the references, it would result in an embedded subdocument. Of course, that would be a mess.
Unless your field values vary, I wouldn't bother with a DBRef type and would instead just store the ObjectId directly. As you can see, a DBRef really offers no benefit except to require lots of duplicate disk space for each reference, as a richer object must stored along with its type information. Either way, you should consider the potentially unnecessary overhead of storing a string containing the referenced collection's documents.
Many developers and MongoDb, Inc. have added an object document mapping layer on top of the existing base drivers. One popular option for MongoDb and Nodejs is Mongoose. As the MongoDb server has no real awareness of referenced documents, the responsibility of the references moves to the client. As it's more common to consistently reference a particular collection from a given document, Mongoose makes it possible to define the reference as a Schema. Mongoose is not schema-less.
If you accept having and using a Schema is useful, then Mongoose is definitely worth looking at. It can efficiently fetch a batch of related documents (from a single collection) from a set of documents. It always is using the native driver, but it generally does operations extremely efficiently and takes some of the drudgery out of more complex application architectures.
I would strongly suggest you have a look at the populate method (here) to see what it's capable of doing.
Demo /* Demo would be a Mongoose Model that you've defined */
    .findById(theObjectId)
    .populate('detail')
    .exec(function (err, doc) {
        if (err) return handleError(err);
        // do something with the single doc that was returned
    });
If find were used instead of findById (which always returns a single document), then with populate, the detail property of all returned documents would be populated automatically. It's smart, too, in that it won't request the same referenced document multiple times.
If you don't use Mongoose, I'd suggest you consider a caching layer to avoid doing client-side reference joins when possible, and use the $in query operator to batch as much as possible.
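For example, here is a sketch of that batching, assuming the parents store plain ObjectIds in detail (collection names are taken from the question):

// resolve many references with a single $in query
db.collection('Obj').find({}).toArray(function (err, objs) {
    if (err) return handleError(err);
    var detailIds = objs.map(function (o) { return o.detail; });
    db.collection('ObjDetail').find({ _id: { $in: detailIds } }).toArray(function (err, details) {
        if (err) return handleError(err);
        // index the details by _id for O(1) lookup, then attach them
        var byId = {};
        details.forEach(function (d) { byId[d._id] = d; });
        objs.forEach(function (o) { o.detail = byId[o.detail]; });
        // objs now has its references realized
    });
});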
I reached the desired result with the following example:
collection.find({}, function (err, cursor) {
    cursor.toArray(function (err, docs) {
        var count = docs.length - 1;
        for (var i in docs) {
            (function (docs, i) {
                // note: db.dereference was removed in the 2.0 driver (see above)
                db.dereference(docs[i].ref, function (err, doc) {
                    // replace the reference with the resolved document
                    docs[i].ref = doc;
                    if (i == count) {
                        console.log(docs);
                    }
                });
            })(docs, i);
        }
    });
});
I'm not sure this solution is the best of the best, but it's the simplest one I found.
Related
I have db structure like this:
datas
-data1
--name
--city
--date
--logs
---log1
---log2
---log3
-data2
--name
...
Now, I realized that putting 'logs' inside the 'data' parent was a huge mistake, because it's a user-generated child that grows fast (so much data under it), and it naturally delays the download of the 'data1' parent.
Normally I am pulling 'data1' with this:
database().ref('datas/' + this.state.dataID).on('value', function(snapshot) {
    // ...
});
I hope I explained my problem clearly: I basically want to ignore the 'logs' child (I only need name, city, and date).
As the project has launched and users are already on it, I need a proper way to do this.
Is there a way to do this on the Firebase side?
I don't think you'll have an easy way out of this one...
Queries are deep by default: they always return the entire subtree.
https://firebase.google.com/docs/firestore/rtdb-vs-firestore#querying
I can see only two options:
Migrate the logs to a different location (if it's really a huge amount of data, you could use something like BigQuery https://cloud.google.com/bigquery, or if it's events, you could store them in Google Analytics; it really depends on the volume and type of logs)
Attach multiple listeners instead of a single one (depending on the number of entries, that might be a viable interim solution):
const response = {
    name: null,
    city: null,
    date: null
};

// one shallow listener per field, so the large 'logs' child is never downloaded
['name', 'city', 'date'].forEach(key => {
    database().ref(`datas/${this.state.dataID}/${key}`).on('value', snapshot => {
        response[key] = snapshot.val();
    });
});
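If you only need the values once rather than realtime updates, a sketch with once('value') and Promise.all is a bit tidier (the field names are from the question):

// fetch only the shallow fields, once, in parallel
const keys = ['name', 'city', 'date'];
Promise.all(
    keys.map(key => database().ref(`datas/${this.state.dataID}/${key}`).once('value'))
).then(snapshots => {
    const response = {};
    snapshots.forEach((snapshot, i) => { response[keys[i]] = snapshot.val(); });
    // response now holds name, city and date without ever touching logs
});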
I'm doing a project where I call one API; this API service will handle all the given data and split it into different collections.
A simplified example:
service.js
async function create(params, origin) {
    const { param1, param2, param3 } = params;

    const collection1 = new db.Collection1(param1);
    await collection1.save();

    const collection2 = new db.Collection2(param2);
    await collection2.save();

    const collection3 = new db.Collection3(param3);
    await collection3.save();
}
My questions are:
What is the best practice? Should I create a general model schema that groups all the collections' parameters (param1, param2, param3), insert it into one collection, and then call another function that splits all the values into Collection1, Collection2, and so on?
How can I handle the case where one insert throws an error, so that all the previously inserted documents are deleted? For example, if the values of param2 are invalid but param1 and param3 are valid, how can I delete documents 1 and 3 and throw an error? What's the best way to achieve this?
All the above examples are simplified; we are talking about at least 10 collections and more than 15 parameters.
Basically you are talking about having multiple route handlers for a single path.
Generally you should handle server-side validation & sanitization of the input data before inserting into the db, and throw errors right away if the rules don't match, so having to delete the previous two inserts in case the third one fails is no longer needed.
Check out express-validator middleware for this aspect.
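A minimal sketch of that middleware, with the question's param names and made-up rules (adjust both to your actual schema):

const express = require('express');
const { body, validationResult } = require('express-validator');

const app = express();
app.use(express.json());

app.post(
    '/things', // hypothetical path
    body('param1').notEmpty().trim(),
    body('param2').isEmail(),        // whatever rule param2 actually requires
    body('param3').isInt({ min: 0 }),
    (req, res) => {
        const errors = validationResult(req);
        if (!errors.isEmpty()) {
            // nothing has been written to the db yet, so nothing to roll back
            return res.status(400).json({ errors: errors.array() });
        }
        // safe to call create(req.body, origin) here
    }
);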
Ideally you should have one route handler per path, for several reasons, but I think the most common one is ease of maintenance and debugging (conceptually like separation of concerns). It's easier to execute a sequence of requests to different paths, awaiting the response of the first request so it can be used in the next one, and so on (if that's the case). In my opinion you're just adding a layer of complexity that isn't needed.
It might work if you develop alone as a full-stack, but if you have a team where you do the back-end and somebody else does the requests from the front-end, and they encounter a problem, it'll be much harder for them to tell you which path => handler failed, because you're basically hiding multiple handlers behind a single path => [handler1, handler2, handler3]. If you think about it, this behaviour is what causes your second question.
Another thing: what do you do if somebody needs to perform just a single insert from that array of inserts? You'll probably end up creating separate paths/routes, meaning you are duplicating existing code.
I think it's better to chain/sequence different requests from the front-end. It's more elegant, follows DRY, makes validation and sanitization easier to code, and gives the consumer of your API freedom of composition.
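That said, if you truly need the multi-collection insert to be all-or-nothing, MongoDB's multi-document transactions cover it. A sketch, assuming Mongoose models like those in the question and a MongoDB 4.0+ replica set:

const mongoose = require('mongoose');

async function create(params, origin) {
    const { param1, param2, param3 } = params;
    const session = await mongoose.startSession();
    try {
        // if any save throws, the whole transaction is aborted
        await session.withTransaction(async () => {
            await new db.Collection1(param1).save({ session });
            await new db.Collection2(param2).save({ session });
            await new db.Collection3(param3).save({ session });
        });
    } finally {
        session.endSession();
    }
}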
My Meteor client receives data from the server and stores it in minimongo. This data is guaranteed not to change during their session, so I don't need Meteor's reactivity. The static data just happens to arrive by that route; let's just take that as a given.
The data looks like this:
{_id: 'abc...', val: {...}}
On the client, is it more efficient for me to look up values using:
val_I_need = Collection.findOne({_id: id})
or to create a JavaScript object:
const data = {};
Collection.find().fetch().forEach((x) => { data[x._id] = x.val; });
and use it for look ups:
val_I_need = data[id]
Is there a tipping point, either in terms of the size of the data or the number of look ups, where the more efficient method changes, or outweighs the initial cost of building the object?
findOne may be more efficient on larger datasets because it looks the document up via the index on _id, while your find().fetch() approach has to fetch all docs and then iterate over them manually.
Note that findOne could also be replaced by .find({_id: desiredId}).fetch()[0] (assuming it returns the desired doc).
More on this in the MongoDB documentation on query performance.
However, if it concerns only one object that afterwards isn't reactively tracked, I would rather load it via a findOne-returning method from the server:
import { ValidatedMethod } from 'meteor/mdg:validated-method';

export const getOne = new ValidatedMethod({
    name: "getOne",
    validate(query) {
        // validate query schema
        // ...
    },
    run(query) {
        // CHECK PERMISSIONS
        // ...
        return MyCollection.findOne(query);
    }
});
This avoids using publications/subscriptions, and thus minimongo, for this collection on the current client template. Keep in mind that pub/sub already sets up some reactivity to observe the collection, and so eats up computation somewhere.
My gut feeling is that you'll never hit a point where the performance gain of putting it in an object makes a noticeable difference.
It's more likely that your bottleneck will be in the pub/sub mechanism, as it can take a while to send all documents to the client.
You'll see a much more noticeable difference for a large dataset by retrieving the data using a Meteor method.
At that point you've got it in a plain old JavaScript object anyway, so you end up with the small performance gain of native object lookups as well.
I'm trying to load includes on an existing model in Sequelize. In Express, we pre-check in middleware that the models exist.
So once we're in the actual "controller" we want to run some includes on that existing model that is passed in.
req.models.item.include([
    {model: Post, as: 'posts'}
])
Is there any way to accomplish this?
EDIT:
I know we can do something like this.
return req.models.item.getThing()
    .then(function (thing) {
        req.models.item.thing = thing;
        return req.models.item;
    });
But:
My expansions for includes are a dynamic property that comes via URL parameters, so they are not known ahead of time.
If I return the above, you will not see the "thing" in the response. I need it nicely built as part of the original instance.
Something like a .with('thing', 'other.thing'); notation would be nice. Or in the case of sequelize .with({include: ...}); or .include([{model: ...}]);
If the variable req.models.item is already an Instance but without its other related instances ("includes"), then you could include them using something like the following code:
Item.findAll({
    where: req.models.item.where(),
    include: [{
        model: SomeAssociateModel,
    }]
})
.then(function(itemWithAssoc) {
    // itemWithAssoc is an Instance for the same DB record as item, but with its associations
});
See here for some documentation. See here for a script demo'ing this.
Update: Given the instance, how do I just get the associated models?
To do this just use the automatically generated "getAssociation" getter functions, e.g.:
function find_associations_of_instance(instance) {
    return instance.getDetails();
}
I've updated the script to include this as an example. For more information on these functions, see the SequelizeJS docs.
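If you'd rather keep everything on the original instance, reloading it with dynamic includes may also work; whether reload accepts an include option depends on your Sequelize version, so treat this as a sketch to verify (the Post association is from the question):

// rebuild the includes on the existing instance;
// the include array could be assembled from URL parameters at runtime
const include = [{ model: Post, as: 'posts' }];

req.models.item.reload({ include })
    .then(function (item) {
        // item === req.models.item, now with item.posts populated
    });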
Despite being a programmer for decades, I am really struggling with this asynchronous, nested-callback style of programming.
I have a schema like this (Rails parlance):
Property has many Photos
Property has many Suites
Suites has many Photos
When a user visits a property, I can get the Property and Suites just fine.
Property.findOne({ key: locals.filters.property }).exec(function (err, property) {
    locals.data.property = property;
    Photo.find({ property: property._id }).exec(function (err, photos) {
        locals.data.photos = photos;
        Suites.find({ property: property._id }).exec(function (err, suites) {
            locals.data.suites = suites;
            next(err);
        });
    });
});
This loads into req arrays fine right now.
My preferred method would be to have document methods resolve to the correct related documents, so in the template I could just do property.photos.each instead of using separate, unrelated arrays. Yet I'm not sure if that's "the Node way". I need to keep the Mongoose objects intact.
But how do I iterate over the suites and populate their photos? Mongoose populate is not an option because there are no references on both sides. Mongoose methods and virtuals are synchronous.
I know I am kind of forcing relational-style relationships here, but I am working within the constraints of keystone.js, so embedding is not an option.
The dataset is very small. Plan B is to load everything up in the middleware and slice and dice it with Underscore in the route; a sketch of that follows.
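One way to do the slicing with a single extra query: fetch all photos for the loaded suites with $in and group them in memory. This sketch assumes each Photo stores its owning suite's _id in a suite field, which may differ in your schema:

// inside the innermost callback, after suites have loaded
var suiteIds = suites.map(function (s) { return s._id; });
Photo.find({ suite: { $in: suiteIds } }).exec(function (err, photos) {
    if (err) return next(err);
    suites.forEach(function (suite) {
        // plain JS property on the mongoose doc; fine for templates,
        // but call .toObject() first if you need it serialized
        suite.photos = photos.filter(function (p) {
            return String(p.suite) === String(suite._id);
        });
    });
    next();
});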