Trying to merge two collections together in meteor - javascript

I have two collections in my application, each parsed from a separate JSON file and inserted into its own collection. The collections have corresponding numerical IDs, and I want to match them up in a new collection. For example, the postsmeta collection has a post_id value and the posts collection has a corresponding ID.
To explain this further, here is a simple example of the collections. Note that there are over 730 posts, and although there are matching IDs, the documents are not sorted, so when I view them side by side they don't line up.
The posts collection example:
{
"_id": "kTeQxenYZcQfPiaYv",
"ID": "44",
"post_content": "Today we talked about the letter Hh..."
}
The postsmeta collection example:
{
"_id": "otEGQYxvv6MkCABST",
"post_id": "44",
"meta_value": "http://www.mrskitson.ca/wp-content/uploads/2010/11/snackTime.jpg"
}
What I would like to do is go through the posts collection and, for each post, find the postsmeta document whose post_id matches its ID. Once I find a match, I want to insert the content of both (post_content and meta_value) into a new collection.
Here is all my code so far.
lib/collections/posts.js
Postsmeta = new Mongo.Collection('postsmeta');
Posts = new Mongo.Collection('posts');
server/publications.js
Meteor.publish('postsmeta', function() {
return Postsmeta.find();
});
Meteor.publish('posts', function() {
return Posts.find();
});
server/main.js
Meteor.startup(() => {
var postsmeta = JSON.parse(Assets.getText('postsmeta.json'));
var posts = JSON.parse(Assets.getText('posts.json'));
var length = postsmeta.length;
for(x=0; x < length; x++){
Posts.insert({
ID: posts[x].ID,
post_content: posts[x].post_content
});
Postsmeta.insert({
post_id: postsmeta[x].post_id,
meta_value: postsmeta[x].meta_value
});
}
});

Let's refactor your code a bit. We'll build the Postsmeta collection first and then jointly create the Posts and PostsCombined collections. Since Postsmeta will already exist we can just search inside it to find matching documents.
Meteor.startup(() => {
  // Assumes PostsCombined is declared alongside Posts and Postsmeta in lib/collections/posts.js
  const postsmeta = JSON.parse(Assets.getText('postsmeta.json'));
  postsmeta.forEach(doc => {
    Postsmeta.insert({ post_id: doc.post_id, meta_value: doc.meta_value });
  });
  const posts = JSON.parse(Assets.getText('posts.json'));
  posts.forEach(doc => {
    const post = { ID: doc.ID, post_content: doc.post_content };
    Posts.insert(post); // omit if you don't need the uncombined collection
    const metadoc = Postsmeta.findOne({ post_id: doc.ID }); // essentially a JOIN
    if (metadoc) post.meta_value = metadoc.meta_value; // guard against no matching meta
    PostsCombined.insert(post);
  });
});
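The join in the startup code can also be done in memory before anything is inserted, which avoids one findOne per post. A minimal sketch in plain JavaScript, assuming both JSON files are already parsed into arrays; the combinePosts helper and the second sample post are hypothetical and not part of the original code:

```javascript
// Hypothetical helper: joins posts to postsmeta on ID === post_id.
function combinePosts(posts, postsmeta) {
  // Index meta_value by post_id so each lookup is a constant-time Map access.
  const metaById = new Map();
  postsmeta.forEach(meta => {
    // Keep the first meta row seen for each post_id.
    if (!metaById.has(meta.post_id)) metaById.set(meta.post_id, meta.meta_value);
  });
  return posts.map(doc => {
    const combined = { ID: doc.ID, post_content: doc.post_content };
    // Guard against posts that have no matching meta row.
    if (metaById.has(doc.ID)) combined.meta_value = metaById.get(doc.ID);
    return combined;
  });
}

// Sample data in the shape shown in the question (the second post is made up).
const samplePosts = [
  { ID: "44", post_content: "Today we talked about the letter Hh..." },
  { ID: "56", post_content: "A post without a meta row" }
];
const sampleMeta = [
  { post_id: "44", meta_value: "http://www.mrskitson.ca/wp-content/uploads/2010/11/snackTime.jpg" }
];

const combinedPosts = combinePosts(samplePosts, sampleMeta);
console.log(combinedPosts);
```

Each element of the returned array could then be passed to PostsCombined.insert() inside Meteor.startup.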
The following IDs are not present in your postsmeta data:
["56", "322", "521", "563", "583", "608", "625", "671", "707", "708",
"711", "713", "754", "758", "930", "1068", "1126", "1235", "1237", "1238",
"1239", "1246", "1249", "1256", "1263", "1355", "1375", "1678", "1680", "1763",
"1956", "2107", "2121", "2148", "2197", "2249"]

Do you want to put the collections together for querying? I ask because the insertion is already correct for two separate collections.
Tip one
If it is only for querying, use find().map(). Inside the map function you receive each document of the first collection; there you can look up the other collection by ID and return a JSON object or array with just what you need. I would not do it that way myself, but it is one way of putting the two collections together.
Best solution
The better approach is not to treat a NoSQL store as if it were a relational database like Postgres or MySQL. Think of it as a flexible database where a single collection can hold everything you need at a given moment. So I suggest creating a new collection that is the junction of the two: whenever you save, also write the data into this combined collection and use it as your query collection. Queries against it are lighter and return faster; my guess is around 5x faster than the approach above.
I hope this helps. If you have any questions, I will be here. Hugs!

Related

mongoose find() sorting and organizing returned results from a product database in js

I have a problem organizing my MongoDB data to send to my page in my res, and I can't figure out how to write the correct JS. Here is a simplified version of my schema:
var productSchema = new mongoose.Schema({
medium: String,
brand: String,
group: String
});
Here is what a typical entry looks like
medium :"Acrylic",
brand :"liquitex",
group :"heavy body"
There are many more fields in the schema, but these are the only ones I need in order to sort and organize the returned results. I have a route that returns all colors in my database, and I want to display them in sections on my page that are grouped under brand, with the individual colors listed under the correct group.
The problem is that paints from other brands also fall into the heavy body group, so when I use a filter function to sort my data by group, some brands get mixed together. I can't filter by brand, because some brands have both acrylic and watercolor, so those get lumped together.
I need some way to filter the returned results of a
mongoose.find({})
that can use the group data as a filter, but then filter those results by the brands so they get separated into the correct brand categories.
I have this so far. It is a stripped-down version of my app.js file:
// finds all colors in the DB
Color.find({}).lean().exec(function(err, colors) {
  var groups = [];
  // find all groups in the database
  colors.forEach(function(color) {
    groups.push(color["group"]);
  });
  // keep only unique names to filter out duplicates
  var groupTypes = Array.from(new Set(groups));
  var tempVariable = [];
  // this sorts all returned paints into their respective group, but we get paints from multiple brands under the same group and that is not good
  groupTypes.forEach(function(group) {
    var name = group;
    var result = colors.filter(obj => { return obj.group === group });
    tempVariable.push({ group: name, result });
  });
  // the tempVariable gets sent to my page like so
  res.render("landing", { colorEntry: tempVariable });
});
This works fine for displaying each paint by its group, but it fails when paints from more than one manufacturer share the same group, such as "heavy body". This is the EJS on my page that works fine:
<% colorEntry.forEach( function(entry){ %>
<div class="brandBlock">
<div class="brandTitle">
<span><%=entry.result[0].brand%> - <%=entry.result[0].group%></span>
For the life of me, I can't figure out the combination of filter() and maybe map() that would allow this kind of processing.
My database has about 600 documents, with colors from a number of different manufacturers, and I don't know how to get the structure I want. Let's say these are a few colors in the DB that get returned from a mongoose find:
[{ medium: "Oil",
brand: "Gamblin",
group: "Artists oil colors"},
{ medium: "Acrylic",
brand: "Liquitex",
group: "Heavy Body"},
{ medium: "Acrylic",
brand: "Golden",
group: "Heavy Body"}
]
I need to organize it like this or something similar. It can be anything that sorts the data into a basic structure like this; I am not confined to any set standard. This is just for personal use on a site I am building to learn more.
returnedColors = [
  { brand: "Gamblin", group: "Artists oil colors", result: [ /* 50 paint colors */ ] },
  { brand: "liquitex", group: "heavy body", result: [ /* 20 paint colors */ ] },
  { brand: "golden", group: "heavy body", result: [ /* 60 paint colors */ ] }
];
I am not a web developer and only write web code every six months or so, and I have been trying to figure this out for the last 2 days. I can't wrap my head around some of the awesome filter and map combos I have seen, and I can't get this to work.
Any help or advice would be great. I am sure there are many areas for improvement in this code, but everything was working until I entered paints from different brands that had the same group type, and I had to try to rewrite this sorting code to deal with it.
It boils down to needing to iterate over the entire set of documents returned from the DB and then sort them based on 2 values.
UPDATE:
I was able to get something that works and returns the data in the format that I need to be able to send it to my ejs file and display it properly. The code is rather ugly and probably very redundant, but it technically works. It starts off by using the group value to run over paints since each set of paints will have a group name, but can sometimes share a group name with a paint from another brand like "heavy body".
groupTypes.forEach( function(group){
var name = group;
var result = colors.filter(obj => { return obj.group === group });
// this gets brand names per iteration of this loop so that we will know if more than one brand of paint
// has the same group identity.
var brands = [];
result.forEach( function(color){
brands.push(color["brand"]);
});
// This filters the brand names down to a unique list of brands
var brandNames = Array.from(new Set(brands));
// if there is more than one brand, we need to filter this into two separate groups
if( brandNames.length > 1){
//console.log("You have duplicates");
brandNames.forEach( x => {
var tmpResult = [...result];
var resultTmp = result.filter(obj => { return obj.brand === x });
result = resultTmp;
//console.log("FILTERED RESULT IS: ", result);
tempVariable.push( {brand: x ,group : name, result } );
result = [...tmpResult];
});
}else{
tempVariable.push( {brand: result[0].brand ,group : name, result } );
}
});
If anyone can reduce this to something more efficient, I would love to see the "better" or "right" way of doing something like this.
UPDATE2
Thanks to the answer below, I was put on the right track and was able to rewrite a bunch of that long code with this:
Color.aggregate([
  {
    $sort: { name: 1 }
  },
  {
    $group: {
      _id: { brand: '$brand', group: '$group' },
      result: { $push: '$$ROOT' }
    }
  },
  { $sort: { '_id.brand': 1 } }
], function(err, colors) {
  if (err) {
    console.log(err);
  } else {
    res.render("landing", { colorEntry: colors, isSearch: 1, codes: userCodes, currentUser: req.user, ads: vs.randomAds() });
  }
});
Much cleaner and appears to achieve the same result.
Since you're using MongoDB, the "right" way is to use the aggregation framework, specifically the $group stage.
Product.aggregate([{
  $group: {
    _id: { group: '$group', brand: '$brand' },
    products: { $push: '$$ROOT' }
  }
}])
This will output an array of objects, one for every combination of brand and group, with all matching products pushed into the corresponding products array.
Combine it with $project and $sort stages to shape your data further.
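To see the shape this $group stage produces, the same grouping can be sketched in plain JavaScript on an in-memory array. This only illustrates the output structure and is not how MongoDB implements it; the groupByGroupAndBrand helper and the sample documents are hypothetical:

```javascript
// Hypothetical helper: buckets documents by the (group, brand) pair,
// producing objects shaped like the output of the $group stage above.
function groupByGroupAndBrand(products) {
  const buckets = new Map();
  for (const product of products) {
    // Serializing the pair as a JSON array gives an unambiguous composite key.
    const key = JSON.stringify([product.group, product.brand]);
    if (!buckets.has(key)) {
      buckets.set(key, {
        _id: { group: product.group, brand: product.brand },
        products: []
      });
    }
    buckets.get(key).products.push(product);
  }
  return Array.from(buckets.values());
}

// Made-up sample documents in the shape from the question.
const sampleProducts = [
  { medium: "Oil", brand: "Gamblin", group: "Artists oil colors" },
  { medium: "Acrylic", brand: "Liquitex", group: "Heavy Body" },
  { medium: "Acrylic", brand: "Golden", group: "Heavy Body" },
  { medium: "Acrylic", brand: "Golden", group: "Heavy Body" }
];

const grouped = groupByGroupAndBrand(sampleProducts);
console.log(JSON.stringify(grouped, null, 2));
```

The two brands that share the "Heavy Body" group end up in separate buckets because the key combines both fields.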

mongoose mongodb - remove all where condition is true except one

If a collection has a list of dogs, and there are duplicate entries for some races, how do I remove all but one of them (either a specific one or any one) with just one query?
I guess it would be possible to get all from a Model.find(), loop through every index except the first one and call Model.remove(), but I would rather have the database handle the logic through the query. How would this be possible?
Pseudocode example of what I want:
Model.remove({race:"pitbull"}).where(notFirstOne);
To remove all but one, you need a way to get all the filtered documents, group them by the identifier, create a list of ids for the group and remove a single id from this list. Armed with this info, you can then run another operation to remove the documents with those ids. Essentially you will be running two queries.
The first query is an aggregate operation that gets the list of ids of the documents to be removed:
(async () => {
  // Get the ids of the duplicate entries, minus the one to keep
  const [doc] = await Model.aggregate([
    { '$match': { 'race': 'pitbull' } },
    { '$group': {
      '_id': '$race',
      'ids': { '$push': '$_id' },
      'id': { '$first': '$_id' }
    } },
    // $setDifference keeps the elements of the first array that are not in the second
    { '$project': { 'idsToRemove': { '$setDifference': [ '$ids', ['$id'] ] } } }
  ]);
  if (!doc) return; // nothing matched the filter
  const { idsToRemove } = doc;
  // Remove the duplicate documents (deleteMany in newer Mongoose versions)
  await Model.remove({ '_id': { '$in': idsToRemove } });
})();
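The argument order of $setDifference matters: it returns the elements of the first array that do not appear in the second. A plain JavaScript sketch of that semantics, using made-up string ids and a hypothetical setDifference helper rather than the MongoDB operator itself:

```javascript
// Hypothetical stand-in for $setDifference: elements of `first` not present in `second`.
function setDifference(first, second) {
  const exclude = new Set(second);
  // De-duplicate `first` as well, since the operator treats its inputs as sets.
  return [...new Set(first)].filter(item => !exclude.has(item));
}

// Made-up ids for three duplicate documents; the first one is kept.
const ids = ["a1", "b2", "c3"];
const keptId = ids[0];

const idsToRemove = setDifference(ids, [keptId]);
const reversedArgs = setDifference([keptId], ids);

console.log(idsToRemove);  // [ 'b2', 'c3' ]
console.log(reversedArgs); // []
```

With the arguments reversed the result is empty, so nothing would be removed.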
If the purpose is simply to keep only one (for example, after concurrent writes created duplicates), you may as well just write:
Model.findOne({race:'pitbull'}).select('_id')
// ... take the _id of the returned document as idReturned
Model.remove({race:'pitbull', _id:{$ne:idReturned}})
If the goal is to keep the very first one, note that MongoDB does not guarantee results will be sorted by increasing _id (natural order refers to the order on disk);
see "Does default find() implicitly sort by _id?"
So instead, fetch the document to keep with:
Model.find({race:'pitbull'}).sort({_id:1}).limit(1)

Mongodb sort by in array

Is there a way to sort by a given array?
Something like this:
const somes = await SomeModel.find({}).sort({'_id': {'$in': [ObjectId('sdasdsd'), ObjectId('sdasdsd'), ObjectId('sdasdsd')]}}).exec()
What I am looking for is a way to get all documents of the collection, sorted by whether the document's _id matches one of the ids in the given array.
An example:
We have an albums collection and a songs collection. In the albums collection we store the ids of the songs that belong to each album.
I want to get the songs, but songs that are in the album should be moved to the front of the array.
I solved this as follows, but it looks a bit hacky:
const songs = await SongModel.find({}).skip(limit * page).limit(limit).exec();
const album = await AlbumModel.findById(id).exec();
if(album) {
songArr = album.songs.slice(limit * page);
for(let song of album.songs) {
songs.unshift(song);
songs.pop();
}
}
This cannot be accomplished using an ordinary .find().sort(). Instead, you will need to use the MongoDB aggregation pipeline (.aggregate()). Specifically, you will need to do the following:
1. Perform a $project such that if the _id is $in the array, a new sort_field is given the value 1; otherwise it is given the value 0.
2. Perform a $sort that does a descending sort on the new sort_field.
If you're using MongoDB version 3.4 or greater, this is easy because of the $addFields stage:
const your_array_of_ids = [
  ObjectId('objectid1'),
  ObjectId('objectid2'),
  ObjectId('objectid3')
];
SomeModel.aggregate([
  { '$addFields': {
    'sort_field': { '$cond': {
      'if': { '$in': [ '$_id', your_array_of_ids ] },
      'then': 1,
      'else': 0
    }}
  }},
  { '$sort': {
    'sort_field': -1
  }}
]);
If you're using an older version of MongoDB, then the solution is similar, but instead of $addFields you will be using $project. Additionally, you will need to explicitly include all of the other fields you want included, otherwise they will be excluded from the results.
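The effect of the two stages can be sketched on an in-memory array in plain JavaScript. The prioritize helper and the sample songs below are hypothetical, and string ids stand in for ObjectIds:

```javascript
// Hypothetical helper: flags documents whose _id is in priorityIds,
// then sorts flagged documents to the front (descending on sort_field).
function prioritize(docs, priorityIds) {
  const wanted = new Set(priorityIds);
  return docs
    .map(doc => ({ ...doc, sort_field: wanted.has(doc._id) ? 1 : 0 }))
    // Array.prototype.sort is stable in ES2019+, so ties keep their input order here.
    .sort((a, b) => b.sort_field - a.sort_field);
}

// Made-up songs; "s3" and "s4" play the role of the album's song ids.
const songs = [
  { _id: "s1", title: "One" },
  { _id: "s2", title: "Two" },
  { _id: "s3", title: "Three" },
  { _id: "s4", title: "Four" }
];

const prioritized = prioritize(songs, ["s3", "s4"]);
console.log(prioritized.map(doc => doc._id)); // [ 's3', 's4', 's1', 's2' ]
```

In a real pipeline it is safer to add a second sort key (for example _id) if the order among documents with the same sort_field matters, since this sketch relies on JavaScript's stable sort for that.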

Reorder array in MongoDB

I'm new to MongoDB, and trying to reorder an array in a db.
Here's the schema:
headline: String,
Galleryslides: [{type: ObjectId, ref: 'Galleryslide'}],
Here's the logic I'm using. By the way, correctOrder is an array with the new order of ids for the DB.
Gallery.findById(req.params.galleryId, function(err, gallery) {
var newArr = [];
req.body.ids.forEach(function(id, index) {
newArr[index] = Galleryslides.find({"_id" : id});
});
gallery.Galleryslides = newArr;
gallery.save(function() {
res.json({status: 'ok'});
});
});
When this runs, nothing happens - the order of the array in the DB does not change. D'you know a better way to do this?
In MongoDB, records are returned in natural order. You will usually get them back in the order you inserted them, but that is not guaranteed.
As the docs say:
This ordering is an internal implementation feature, and you should
not rely on any particular structure within it.
If you want to sort by the _id field, you can do that (it will use the _id index):
Gallery.find().sort({ "_id": 1 })

Get count of siblings in subdocument with mongodb aggregate query

I have a document collection with a subdocument of tags.
{
title:"my title",
slug:"my-title",
tags:[
{tagname:'tag1', id:1},
{tagname:'tag2', id:2},
{tagname:'tag3', id:3}]
}
{
title:"my title2",
slug:"my-title2",
tags:[
{tagname:'tag1', id:1},
{tagname:'tag2', id:2}]
}
{
title:"my title3",
slug:"my-title3",
tags:[
{tagname:'tag1', id:1},
{tagname:'tag3', id:3}]
}
{
title:"my title4",
slug:"my-title4",
tags:[
{tagname:'tag1', id:1},
{tagname:'tag2', id:2},
{tagname:'tag3', id:3}]
}
[...]
Getting a count of every tag is quite simple with an $unwind + $group count aggregate.
However, I would like to find a count of which tags are found together, or more precisely, which sibling shows up most often beside one another, ordered by count. I have not found an example nor can I figure out how to do this without multiple queries.
Ideally the end result would be:
{'tag1':{
'tag2':3, // tag1 and tag2 were found in a document together 3 times
'tag3':3, // tag1 and tag3 were found in a document together 3 times
[...]}}
{'tag2':{
'tag1':3, // tag2 and tag1 were found in a document together 3 times
'tag3':2, // tag2 and tag3 were found in a document together 2 times
[...]}}
{'tag3':{
'tag1':3, // tag3 and tag1 were found in a document together 3 times
'tag2':2, // tag3 and tag2 were found in a document together 2 times
[...]}}
[...]
It simply is not possible to have the aggregation framework generate arbitrary key names from data. It's also not possible to do this kind of analysis in a single query.
But there is a general approach to doing this over your whole collection for an undetermined number of tag names. Essentially you are going to need to get a distinct list of the "tags" and process another query for each distinct value to get the "siblings" to that tag and the counts.
In general:
// Get the unique tags
db.collection.aggregate([
  { "$unwind": "$tags" },
  { "$group": {
    "_id": "$tags.tagname"
  }}
]).forEach(function(tag) {
  var tagDoc = { };
  tagDoc[tag._id] = {};
  // Get the siblings count for that tag
  db.collection.aggregate([
    { "$match": { "tags.tagname": tag._id } },
    { "$unwind": "$tags" },
    { "$match": { "tags.tagname": { "$ne": tag._id } } },
    { "$group": {
      "_id": "$tags.tagname",
      "count": { "$sum": 1 }
    }}
  ]).forEach(function(sibling) {
    // Set the value in the master document
    tagDoc[tag._id][sibling._id] = sibling.count;
  });
  // Just emitting for example purposes in some way
  printjson(tagDoc);
});
The aggregation framework can return a cursor in releases since MongoDB 2.6, so even with a large number of tags this can work in an efficient way.
So that's the way you would handle this, but there really is no way to have this happen in a single query. For a shorter run time you might look at frameworks that allow many queries to be run in parallel either combining the results or emitting to a stream.
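For a collection small enough to load into application memory, the same sibling counts can also be built in one pass in plain JavaScript, outside the database. A sketch under that assumption; the countSiblings helper is hypothetical and the sample documents are the four from the question:

```javascript
// Hypothetical helper: for every tag, counts how often each other tag
// appears in the same document.
function countSiblings(docs) {
  // Null-prototype objects avoid clashes with names like "constructor".
  const counts = Object.create(null);
  for (const doc of docs) {
    // De-duplicate tag names within one document so it is counted once.
    const names = [...new Set(doc.tags.map(tag => tag.tagname))];
    for (const name of names) {
      if (!(name in counts)) counts[name] = Object.create(null);
      for (const sibling of names) {
        if (sibling === name) continue;
        counts[name][sibling] = (counts[name][sibling] || 0) + 1;
      }
    }
  }
  return counts;
}

const tag1 = { tagname: "tag1", id: 1 };
const tag2 = { tagname: "tag2", id: 2 };
const tag3 = { tagname: "tag3", id: 3 };

// The four sample documents from the question.
const taggedDocs = [
  { title: "my title", slug: "my-title", tags: [tag1, tag2, tag3] },
  { title: "my title2", slug: "my-title2", tags: [tag1, tag2] },
  { title: "my title3", slug: "my-title3", tags: [tag1, tag3] },
  { title: "my title4", slug: "my-title4", tags: [tag1, tag2, tag3] }
];

const siblingCounts = countSiblings(taggedDocs);
console.log(JSON.stringify(siblingCounts, null, 2));
```

For these four documents the counts match the result asked for in the question: tag1 appears with tag2 and with tag3 three times each, and tag2 appears with tag3 twice.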
