I have opened a related issue on GitHub, but maybe someone here will be able to help quicker.
Summary:
ValidationException: Query key condition not supported
I need to find records from the last (amount) seconds for a given location.
Pretty simple, but already related to other issues:
One and another one
WORKS:
Activity.query('locationId').eq(locationId).exec();
DOES NOT WORK:
Activity.query('locationId').eq(locationId).where('createdAt').ge(date).exec();
Code sample:
Schema
const Activity = (dynamoose: typeof dynamooseType) => dynamoose.model<ActivityType, {}>('Activity',
new Schema({
id: {
type: String,
default: () => {
return uuid();
},
hashKey: true,
},
userId: {
type: String,
},
locationId: {
type: String,
rangeKey: true,
index: {
global: true,
},
},
createdAt: { type: Number, rangeKey: true, required: true, default: Date.now },
action: {
type: Number,
},
}, {
expires: 60 * 60 * 24 * 30 * 3, // activity logs to expire after 3 months
}));
Code which executes the function
The funny part is that I found this proposed as a workaround to use until the PR adding the ability to specify timestamps as keys gets merged, but unfortunately it does not work.
async getActivitiesInLastSeconds(locationId: string, timeoutInSeconds: number) {
const Activity = schema.Activity(this.dynamoose);
const date = moment().subtract(timeoutInSeconds, 'seconds').valueOf();
return await Activity.query('locationId').eq(locationId)
.where('createdAt').ge(date).exec();
}
I suspect createdAt is not a range key of your table / index. You need to either do .filter('createdAt').ge(date) or modify your table / index schema.
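For example, keeping the schema above unchanged, the filter-based version would look roughly like this (a sketch; .filter() applies a filter expression after the key condition rather than acting as a key condition itself):

Activity.query('locationId').eq(locationId)
    .filter('createdAt').ge(date)
    .exec();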
I'm pretty sure the problem is that when you specify rangeKey: true on the createdAt property, you are not telling it to be used on the global index (I don't think that is the correct term); that range key will be linked to the id property instead.
I believe the easiest solution would be to change your locationId index to be something like the following:
index: {
global: true,
rangeKey: 'createdAt',
},
That way you are being very explicit about which index you want to set createdAt as the rangeKey for.
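To make it concrete, a minimal sketch of how the locationId attribute might look with that index change (only the index block changes here; the other properties of the attribute are assumed to stay as in your schema):

locationId: {
    type: String,
    index: {
        global: true,
        rangeKey: 'createdAt', // createdAt is explicitly the range key of this global index
    },
},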
After making that change please remember to sync your changes with either your local DynamoDB server or the actual DynamoDB service, so that the changes get populated both in your code and on the database system.
Hopefully this helps! If it doesn't fix your problem please feel free to comment and I'll help you further.
Related
I'm new to MongoDB/Mongoose and work with a very large database (more than 25,000 docs). I need to configure different queries: by fields, the first 10 docs, and one by id. The problem is performance: the server response is too slow (about 10-15 seconds).
Please tell me how to configure this so that the server response is fast?
Does it depend only on the schema settings, or can it also depend on other things, such as database connection parameters or query parameters?
P.S. Queries should be by 'district' and 'locality'.
Thanks for any help!
Here is the schema:
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
const houseSchema = new Schema({
code: {
type: String,
required: false
},
name: {
type: String,
required: true
},
district: {
type: String,
required: true
},
locality: {
type: String,
required: false
},
recountDate: {
type: Date,
default: Date.now
},
eventDate: {
type: Date,
default: Date.now
},
events: {
type: Array,
default: []
}
});
module.exports = mongoose.model('House', houseSchema);
Connection parameters:
mongoose.connect(
`mongodb+srv://${process.env.MONGO_USER}:${process.env.MONGO_PASSWORD}@cluster0-vuauc.mongodb.net/${process.env.MONGO_DB}?retryWrites=true&w=majority`,
{
useNewUrlParser: true,
useUnifiedTopology: true
}
).then(() => {
console.log('Connection to database established...')
app.listen(5555);
}).catch(err => {
console.log(err);
});
Queries are performed using Relay:
query {
viewer {
allPosts (first: 10) {
edges {
node {
id
code
district
locality
recountDate
eventDate
events
}
}
}
}
}
MongoDB is very fast at executing queries, but performance also depends on how you write them. To get the first 10 documents from a collection sorted in descending order by _id, you need to use limit and sort in your query.
db.collectionName.find({}).limit(10).sort({_id:-1})
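If you are going through Mongoose rather than the shell, the equivalent is roughly the following (a sketch assuming the House model defined by your schema above):

House.find({})
    .sort({ _id: -1 })
    .limit(10)
    .exec(function (err, docs) {
        // docs holds the 10 newest documents by _id
    });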
Make sure it's not a connection issue. Try to run your query from the MongoDB shell:
mongo mongodb+srv://${process.env.MONGO_USER}:${process.env.MONGO_PASSWORD}@cluster0-vuauc.mongodb.net/${process.env.MONGO_DB}?retryWrites=true&w=majority
db.collection.find({condition}).limit(10)
If it responds faster in the MongoDB shell than through Mongoose:
There is a known issue with the Node.js driver: it uses a pure JavaScript BSON serializer which is very slow at serializing from BSON to JSON.
Try to install bson-ext
The bson-ext module is an alternative BSON parser that is written in C++. It delivers better deserialization performance and similar or somewhat better serialization performance to the pure javascript parser.
https://mongodb.github.io/node-mongodb-native/3.5/installation-guide/installation-guide/#bson-ext-module
Use Projections to Return Only Necessary Data
When you need only a subset of fields from documents, you can achieve better performance by returning only the fields you need:
For example, if in your query to the posts collection, you need only the timestamp, title, author, and abstract fields, you would issue the following command:
db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1} ).sort( { timestamp : -1 } ).limit(10)
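The same idea in Mongoose, sketched against your House schema (district and locality are the fields you said you query by; the filter value and the projected fields are illustrative, so adjust them to your needs):

House.find(
    { district: 'SomeDistrict' },                        // filter — hypothetical value
    { name: 1, district: 1, locality: 1, eventDate: 1 }  // projection: only the fields you need
)
    .sort({ eventDate: -1 })
    .limit(10)
    .exec(function (err, docs) {
        // each doc contains only the projected fields (plus _id by default)
    });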
You can read more about query optimization here.
I have a .tsv file with some order information. After processing it with my script I got this:
[{"order":"5974842dfb458819244adbf7","name":"Сергей Климов","email":"wordkontent#gmail.com"},
{"order":"5974842dfb458819244adbf8","name":"Сушков А.В.","email":"mail#wwwcenter.ru"},
{"order":"5974842dfb458819244adbf9","name":"Виталий","email":"wawe2012#mail.ru"},
...
and so on
I have a schema in Mongoose.
var ClientSchema = mongoose.Schema({
name:{
type: String
},
email:{
type: String,
unique : true,
required: true,
index: true
},
forums:{
type: String
},
other:{
type: String
},
status:{
type: Number,
default: 3
},
subscribed:{
type: Boolean,
default: true
},
clienturl:{
type: String
},
orders:{
type: [String]
}
});
clienturl is a password 8 characters in length, generated by a function.
module.exports.arrayClientSave = function(clientsArray,callback){
let newClientsArray = clientsArray
.map(function(x) {
var randomstring = Math.random().toString(36).slice(-8);
x.clienturl = randomstring;
return x;
});
console.log(newClientsArray);
Client.update( ??? , callback );
}
But I don't understand how to write the update. If the email already exists, push to the orders array without rewriting the other fields; if the email does not exist, save a new user with clienturl and so on. Thanks!
Probably the best way to handle this is via .bulkWrite(), which is a MongoDB method for sending "multiple operations" in a "single" request with a "single" response. This removes the need to control issuing requests and handling responses asynchronously for each "looped" item.
module.exports.arrayClientSave = function(clientsArray,callback){
let newClientsArray = clientsArray
.map(x => {
var randomstring = Math.random().toString(36).slice(-8);
x.clienturl = randomstring;
return x;
});
console.log(newClientsArray);
let ops = newClientsArray.map( x => (
{ "updateOne": {
"filter": { "email": x.email },
"update": {
"$addToSet": { "orders": x.order },
"$setOnInsert": {
"name": x.name,
"clientUrl": x.clienturl
}
},
"upsert": true
}}
));
Client.bulkWrite(ops,callback);
};
The main idea is that you use the "upsert" functionality of MongoDB to drive the "create or update" behaviour. The $addToSet only appends the "order" value to the array when it is not already present, and the $setOnInsert only takes effect when the operation actually results in an "upsert"; it is not applied when the operation matches an existing document.
Also, by applying this within .bulkWrite() the whole thing becomes a "single async call" when talking to a MongoDB server that supports it, which is any version of MongoDB 2.6 or greater.
However the main point of the specific .bulkWrite() API, is that the API itself will "detect" if the server connected to actually supports "Bulk" operations. When it does not, this "downgrades" to individual "async" calls instead of one batch. But this is controlled by the "driver", and it will still interact with your code as if it were actually one request and response.
This means all the difficulty of dealing with the "async loop" is handled in the driver software itself, either removed entirely by the supported method or "emulated" in a way that keeps your code simple.
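For completeness, a usage sketch of the exported function (the require path, the parsed .tsv array, and the result fields shown are assumptions; exact result field names can vary a little between driver versions):

var clients = require('./models/client'); // hypothetical path to the module above

clients.arrayClientSave(parsedTsvArray, function (err, result) {
    if (err) return console.error(err);
    // The bulk write result reports what happened, e.g. how many documents
    // were upserted (new emails) versus modified (existing emails)
    console.log('upserted:', result.upsertedCount, 'modified:', result.modifiedCount);
});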
I am using MongoDB as the database with Mongoose as the ORM. I have a field booking_id in my schema which is unique, so I cannot have it null. Thus I have designed my code something like this.
var bookingSchema = new Schema({
booking_id_customer: {
type: Number,
default : Math.floor(Math.random()*900000000300000000000) + 1000000000000000,
index: { unique: true }
},
It works perfectly the first time, but from the 2nd time onwards I get this duplicate key error.
{ [MongoError: E11000 duplicate key error index: xx.bookings.$booking_id_customer_1 dup key: { : 4.439605615108491e+20 }]
name: 'MongoError',
message: 'E11000 duplicate key error index:
I expect it to generate random numbers, but I have no clue what is going wrong the 2nd time.
You are setting the default just once, at schema creation.
If you want it to be called for each new document, you need to turn it into a function that Mongoose will call:
default : function() {
return Math.floor(Math.random()*900000000300000000000) + 1000000000000000
}
However, there is another issue with your code: the value 900000000300000000000 exceeds Number.MAX_SAFE_INTEGER, which can lead to precision problems.
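To illustrate, using the larger of the two literals from your schema:

console.log(Number.MAX_SAFE_INTEGER);                             // 9007199254740991 (about 9.0e15)
console.log(900000000300000000000 > Number.MAX_SAFE_INTEGER);     // true
console.log(900000000300000000000 + 1 === 900000000300000000000); // true — integer precision is lost at this magnitude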
I would suggest using mongoose.Types.ObjectId as id generator, which is also what Mongoose and MongoDB use to create (unique) document id's:
booking_id_customer : {
type : mongoose.Schema.Types.ObjectId,
default : mongoose.Types.ObjectId,
index : { unique: true }
}
Or re-use the _id property of the document, which is also unique.
I'm about to make a huge schema for a form that I have just built... that being said, does my schema order have to mimic the form order, or can it just have all the inputs in any order I put them in?
Example below.
can it be like this?
// link to mongoose
var mongoose = require('mongoose');
// define the article schema
var mapSchema = new mongoose.Schema({
created: {
type: Date,
default: Date.now
},
dd1: {
type: String,
default: ''
},
dd2: {
type: String,
default: ''
},
com1: {
type: String,
default: ''
},
com2: {
type: String,
default: ''
}
});
// make it public
module.exports = mongoose.model('Map', mapSchema);
Or does it have to be like this?
// link to mongoose
var mongoose = require('mongoose');
// define the article schema
var mapSchema = new mongoose.Schema({
created: {
type: Date,
default: Date.now
},
dd1: {
type: String,
default: ''
},
com1: {
type: String,
default: ''
},
dd2: {
type: String,
default: ''
},
com2: {
type: String,
default: ''
}
});
// make it public
module.exports = mongoose.model('Map', mapSchema);
does my schema order have to mimic the form order, or can it just have all the inputs in any order I put them in?
mongoose.Schema accepts a JavaScript object as its parameter. So your question boils down to:
Are JavaScript objects aware of the order their keys were defined in?
The answer to that is: not in any way you should rely on. Historically the JS spec treated objects as unordered key/value collections, and although modern engines do preserve insertion order for string keys, that is an implementation detail rather than something to attach meaning to.
Therefore mongoose.Schema cannot rely on key order even if it tried to, which means you are free to order the keys in any way you like.
We can also tackle the question from the other end:
Is it likely that a front-end change like form field order forces me to rewrite my database backend code?
And the answer to that is: No, that is pretty darn unlikely. We can dismiss that thought without even looking into any kind of spec, because it would not make any kind of sense.
First of all: I'm using Mongo 2.6 and Mongoose 3.8.8
I have the follow Schema:
var Link = new Schema({
title: { type: String, trim: true },
owner: { id: { type: Schema.ObjectId }, name: { type: String } },
url: { type: String, default: '', trim: true},
stars: { users: [ { name: { type: String }, _id: {type: Schema.ObjectId} }] },
createdAt: { type: Date, default: Date.now }
});
And my collection already have 500k documents.
What I need is to sort the documents using a custom strategy. My initial solution was to use the aggregation framework.
var today = new Date();
//fx = (TodayDay * TodayYear) - ( DocumentCreatedDay * DocumentCreatedYear)
var relevance = { $subtract: [
{ $multiply: [ { $dayOfYear: today }, { $year: today } ] },
{ $multiply: [ { $dayOfYear: '$createdAt' }, { $year: '$createdAt' } ] }
]}
var projection = {
_id: 1,
url: 1,
title: 1,
createdAt: 1,
thumbnail: 1,
stars: { $size: '$stars.users' },
ranking: { $multiply: [ relevance, { $size: '$stars.users' } ] }
}
var sort = {
$sort: { ranking: 1, stars: 1 }
}
var page = 1;
var limit = { $limit: 40 }
var skip = { $skip: ( 40 * (page - 1) ) }
var project = { $project: projection }
Link.aggregate([project, sort, limit, skip]).exec(resultCallback);
It works nicely up to about 100k documents; after that the query gets slower and slower.
How could I accomplish this?
A redesign?
Am I using the projection incorrectly?
Thanks for your time !
You can do all of this as you update, and then you can actually index on ranking and use range queries in order to implement your paging. That is much better than $skip and $limit, which in any form is bad news for large data. You should be able to find many sources confirming that skip and limit is poor practice for paging.
The only catch here is that since you cannot use an .update() type of statement to refer to the existing value of another field, you have to be careful about concurrency issues on updates. This requires "rolling in" some custom lock handling, which you can do with the .findOneAndUpdate() method:
Link.findOneAndUpdate(
    { "_id": docId, "locked": false },
    { "$set": { "locked": true } },
    function(err,doc) {
        if ( doc ) {
            // we got the "lock", so update the document
            // I would just use the epoch date difference per day
            var msPerDay = 1000 * 60 * 60 * 24;
            var relevance = (
                ( Date.now() - ( Date.now() % msPerDay ) )
                - ( doc.createdAt.valueOf() - ( doc.createdAt.valueOf() % msPerDay ) )
            );
            var update = { "$set": { "locked": false } };
            if ( actionAdd ) {
                update["$push"] = { "stars.users": star };
                update["$set"]["score"] = relevance * ( doc.stars.users.length + 1 );
            } else {
                update["$pull"] = { "stars.users": star };
                update["$set"]["score"] = relevance * ( doc.stars.users.length - 1 );
            }
            // Then update and release the lock
            Link.findOneAndUpdate(
                { "_id": doc._id, "locked": true },
                update,
                function(err,newDoc) {
                    // possibly check that the new "locked" is false, but really
                    // that should be okay
                }
            );
        } else {
            // some mechanism to retry "n" times at interval
            // or report that you cannot update
        }
    }
)
The idea there is that you can only grab a document with a "locked" status equal to false in order to actually update, and the first "update" operation just sets that value to true so that no other operation could update the document until this completes.
As per the code comments, you probably want to have a few tries at doing this rather than just failing the update as there could be another operation adding or subtracting from the array.
Then depending on the "mode" of your current update if you are either adding to the array or taking an item off of there you simply alter the update statement to be issued to do either operation and set the appropriate "score" value in your document.
The update will then of course set the "locked" status to false and it makes sense to check that the current status is not true though it really should be okay at this point. But this gives you some room on being able to raise exceptions.
That manages the general update situation, but you still have a problem with sorting out your "ranking" order here, as skip and limit are still not what you want for performance. That is probably best handled by a periodic update of yet another field which you can use for a definitive "range" query, but you probably only really want to be concerned with the most "relevant" score range in a set range of pages, rather than updating the whole collection.
The update needs to be periodic as you will have concurrency problems if you try to change the "ranking" order of multiple documents in individual updates. So you need to make sure this process does not overlap with another such update.
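As a rough sketch of what range-based paging on such a field could look like (this assumes a numeric score field maintained by the updates above and an index such as { score: -1, _id: -1 }; the names here are illustrative, not part of your current schema):

Link.find({})
    .sort({ score: -1, _id: -1 })
    .limit(40)
    .exec(function (err, docs) {
        var last = docs[docs.length - 1];
        // Next page: continue from the last seen score/_id instead of using $skip
        Link.find({
            "$or": [
                { "score": last.score, "_id": { "$lt": last._id } },
                { "score": { "$lt": last.score } }
            ]
        })
        .sort({ score: -1, _id: -1 })
        .limit(40)
        .exec(function (err, nextDocs) {
            // nextDocs is the following page of 40 results
        });
    });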
As a final note, reconsider your "score" calculation: what you really want is the newest and "most starred" content at the top. The current calculation has some flaws there, such as on the same day with 0 "stars", but I'll leave that to you to work out.
This is essentially what you need to do for your solution. Trying to do this dynamically on a large collection using the aggregation framework is not going to produce favorable performance for your application experience. So there are few pointers here to things you can do to more efficiently maintain the order of your results.