DynamoDB: Query only every 10th value - javascript

I am querying data between two specific unixtime values, for example:
all data between 1516338730 (today, 6:12) and 1516358930 (today, 11:48).
My database receives a new record every minute. Now, when I want to query the data of the last 24 hours, it is far too dense; every 10th minute would be perfect.
My question now is: how can I read only every 10th database record, using DynamoDB?
As far as I know, there is no possibility to use modulo or anything similar that covers my needs.
This is my AWS Lambda code so far:
exports.handler = function (event, context, callback) {
    var read = {
        TableName: "user",
        ProjectionExpression: "#time, #val",
        // TIME is aliased as #time because it is a DynamoDB reserved word,
        // so the key condition has to use the placeholder as well:
        KeyConditionExpression: "Id = :id and #time between :time_1 and :time_2",
        ExpressionAttributeNames: {
            "#time": "TIME",
            "#val": "user_data"
        },
        ExpressionAttributeValues: {
            ":id": event, // primary key
            ":time_1": 1516338730,
            ":time_2": 1516358930
        },
        ScanIndexForward: true
    };
    docClient.query(read, function (err, data) {
        if (err) {
            callback(err, null);
        } else {
            callback(null, data.Items);
        }
    });
};

You say that you insert 1 record every minute?
The following might be an option:
At the time of insertion, set another field on the record, let's call it MinuteBucket, which is calculated as the timestamp's minute value mod 10.
If you do this via a stream function, you can handle new records, and then write something to touch old records to force a calculation.
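A minimal sketch of such a stream handler, assuming a DynamoDB Streams trigger configured with NEW_IMAGE and that TIME holds a unix timestamp (table and attribute names are taken from the question; everything else is illustrative):
// Sketch: stamp a MinuteBucket attribute onto newly inserted records.
var AWS = require("aws-sdk");
var docClient = new AWS.DynamoDB.DocumentClient();

exports.handler = function (event, context, callback) {
    var writes = event.Records
        .filter(function (r) { return r.eventName === "INSERT"; })
        .map(function (r) {
            var img = r.dynamodb.NewImage;
            var time = Number(img.TIME.N);
            var bucket = Math.floor(time / 60) % 10; // minute value mod 10
            return docClient.update({
                TableName: "user",
                Key: { Id: Number(img.Id.N), TIME: time }, // adjust if Id is a string
                UpdateExpression: "SET MinuteBucket = :b",
                ExpressionAttributeValues: { ":b": bucket }
            }).promise();
        });
    Promise.all(writes).then(
        function () { callback(null); },
        function (err) { callback(err); }
    );
};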
Your query would then change as follows. Note that MinuteBucket is not a key attribute, so it belongs in a FilterExpression rather than the KeyConditionExpression:
/*...snip...*/
KeyConditionExpression: "Id = :id and #time between :time_1 and :time_2",
FilterExpression: "MinuteBucket = :bucket_id",
/*...snip...*/
ExpressionAttributeValues: {
    ":id": event, // primary key
    ":time_1": 1516338730,
    ":time_2": 1516358930,
    ":bucket_id": 0 // can be 0-9; if you want the first record to be closer to time_1, set this to time_1's minute value mod 10
},
/*...snip...*/
Just as a follow-up thought: if you want to speed up your queries, perhaps investigate using the MinuteBucket in an index, though that might come at a higher price.
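For instance, a global secondary index keyed on MinuteBucket could be added roughly like this (a sketch only: the index name is hypothetical, and GSI storage and throughput are billed separately):
// Sketch: add a GSI so a query can target a single MinuteBucket directly.
var params = {
    TableName: "user",
    AttributeDefinitions: [
        { AttributeName: "MinuteBucket", AttributeType: "N" },
        { AttributeName: "TIME", AttributeType: "N" }
    ],
    GlobalSecondaryIndexUpdates: [{
        Create: {
            IndexName: "MinuteBucket-TIME-index", // hypothetical name
            KeySchema: [
                { AttributeName: "MinuteBucket", KeyType: "HASH" },
                { AttributeName: "TIME", KeyType: "RANGE" }
            ],
            Projection: { ProjectionType: "ALL" },
            // Required for provisioned-capacity tables; omit for on-demand.
            ProvisionedThroughput: { ReadCapacityUnits: 1, WriteCapacityUnits: 1 }
        }
    }]
};
new AWS.DynamoDB().updateTable(params, function (err, data) {
    if (err) console.log(err, err.stack);
});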

I don't think it is possible with the DynamoDB API.
There is FilterExpression, which contains conditions that DynamoDB applies after the Query operation but before the data is returned to you.
But AFAIK it isn't possible to use a custom function there, and the built-in functions are limited.
As a workaround, you could mark every 10th item on the client side, and then query with an attribute_exists check (or an attribute-value check) to filter on that mark.
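A minimal sketch of the query side of that workaround, assuming the writer sets a marker attribute called tenth on every 10th insert (the attribute name is an assumption):
// Sketch: return only the items that were marked on insertion.
var read = {
    TableName: "user",
    KeyConditionExpression: "Id = :id and #time between :time_1 and :time_2",
    FilterExpression: "attribute_exists(tenth)", // set client-side on every 10th record
    ExpressionAttributeNames: { "#time": "TIME" },
    ExpressionAttributeValues: {
        ":id": event,
        ":time_1": 1516338730,
        ":time_2": 1516358930
    }
};
docClient.query(read, function (err, data) { /* ... */ });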
BTW, it would be nice to create an index for the Id attribute with sort key TIME to improve query performance.

Related

Sorting by partition key/sort key does not work

I'm working on an IoT project where I need to read some data from a device.
I use AWS, and I'm currently working on some lambda function code. But I can't figure out how to get the last (newest) item from my database.
My database has two keys:
- Partition key: device_id (Number)
- Sort key: sample_time (Number)
This is part of the code I wrote to retrieve the newest reading from my IoT device:
case "GET /data/newest":
body = await dynamo
.query({
TableName: "bikelock_db",
KeyConditionExpression: 'device_id = :id',
ExpressionAttributeValues: {
":id": 1,
},
Limit: 1,
ScanForwardIndex: false,
})
.promise();
break;
This code, however, only returns the first item added to the database.
Changing ScanForwardIndex: false to true doesn't change a thing.
I thought the sort key would sort it automatically, but it does not.
Any idea what I'm missing, or why it isn't working?
Try ScanIndexForward and I bet it'll work. You transposed the two words.
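With the parameter name corrected, the query becomes:
case "GET /data/newest":
    body = await dynamo
        .query({
            TableName: "bikelock_db",
            KeyConditionExpression: 'device_id = :id',
            ExpressionAttributeValues: {
                ":id": 1,
            },
            Limit: 1,
            ScanIndexForward: false, // descending sort-key order: newest item first
        })
        .promise();
    break;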

couchdb views: return all fields in doc as map

I have a doc in couchDB:
{
    "id": "avc",
    "type": "Property",
    "username": "user1",
    "password": "password1",
    "server": "localhost"
}
I want to write a view that returns a map of all these fields.
The map should look like this: [{"username","user1"},{"password","password1"},{"server","localhost"}]
Here's pseudocode of what I want -
HashMap<String,String> getProperties()
{
    HashMap<String,String> propMap;
    if (doc.type == 'Property')
    {
        // read all fields in doc one by one
        // get value and add field/value to the map
    }
    return propMap;
}
I am not sure how to do the portion that I have commented above. Please help.
Note: right now, I want to add username, password and server fields and their values in the map. However, I might keep adding more later on. I want to make sure what I do is extensible.
I considered writing a separate view function for each field, e.g. emit("username", doc.username).
But this may not be the best approach, and it would need updating every time I add a new field.
First of all, you have to know:
In CouchDB, you'll index documents inside a view with a key-value pair. So if you index the properties username and server, you'll have the following view:
[
    {"key": "user1", "value": null},
    {"key": "localhost", "value": null}
]
Whenever you edit a view, it invalidates the index, so CouchDB has to rebuild it. If you add new fields to that view later, that's something to take into account.
If you want to query multiple fields in the same query, all those fields must be in the same view. If that's not a requirement, you could simply build an index for every field you want.
If you want to index multiple fields in the same view, you could do something like this:
// We define a map function as a function which takes a single parameter: the document to index.
(doc) => {
    // We iterate over a list of fields to index.
    ["username", "password", "server"].forEach((key) => {
        // If the document has the field to index, we index it.
        if (doc.hasOwnProperty(key)) {
            // emit(key, value) is the function you call to index your document.
            // You don't need to pass a value, as you'll be able to get the
            // matching document by using include_docs=true.
            emit(doc[key], null);
        }
    });
};
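Assuming that map function lives in a design document (the names props and by_value below are hypothetical), querying the view and pulling back the matching documents could look like:
// Sketch: query the view through CouchDB's HTTP API.
const url = 'http://localhost:5984/mydb/_design/props/_view/by_value' +
    '?key="user1"&include_docs=true';

fetch(url)
    .then((res) => res.json())
    .then((body) => {
        // Each row carries the emitted key and, via include_docs, the whole doc.
        body.rows.forEach((row) => console.log(row.key, row.doc));
    });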
Also note that Apache Lucene provides full-text search and might fit your needs better.

In sails/waterline get maximum value of a column in a database agnostic way

While using Sails as an ORM (version 1.0), I notice that there is a function called Model.avg (as well as .sum). However, there is no maximum or minimum function to get the maximum or minimum from a column in a model; does this mean it isn't necessary because it is already covered by other functions?
Now in my database I need to get the "maximum id" in a list, and I have it working for PostgreSQL by using a native query:
const maxnum = await Order.getDatastore().sendNativeQuery('SELECT MAX(\"orderNr\") FROM \"order\"')
While this isn't the most difficult thing, it is not what I truly want: it is limited to SQL-based datastores (so we wouldn't be able to move easily to MongoDB), and the syntax might differ for another SQL database type.
So I wonder: can this be transformed in such a way that it doesn't rely on sendNativeQuery?
You can try .query() to execute a raw SQL query using the specified model's datastore, and if you want you can try pg, an NPM package used for communicating with PostgreSQL databases:
Pet.query('SELECT pet.name FROM pet WHERE pet.name = $1', [ 'dog' ],
    function (err, rawResult) {
        if (err) { return res.serverError(err); }
        sails.log(rawResult);
        // (The result format depends on the SQL query that was passed in,
        // and the adapter you're using.)
        // Then parse the raw result and do whatever you like with it.
        return res.ok();
    });
You can use the limit and sort options Waterline provides to get a single record holding the maximal value (then just extract that value).
const orderModels = await Order.find({
    where: {},
    select: ['orderNr'],
    limit: 1,
    sort: 'orderNr DESC'
});
// .find() always resolves to an array; with limit: 1 it holds at most one record.
console.log(orderModels[0].orderNr);
Like most things in Waterline, it's probably not as efficient as an SQL SELECT MAX query (or some equivalent in Mongo, etc.), but it should allow swapping out the database with no maintenance. One last note: don't forget to handle the case where no records are found, as in the sketch below.
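Wrapped up as a small helper (a sketch; the fallback value for an empty table is an arbitrary choice here):
// Sketch: database-agnostic "max" via sort + limit, with an empty-table fallback.
async function getMaxOrderNr() {
    const records = await Order.find({
        select: ['orderNr'],
        limit: 1,
        sort: 'orderNr DESC'
    });
    return records.length ? records[0].orderNr : 0; // 0 is an arbitrary fallback
}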

Update Array from Document (MongoDB) in Javascript not Working

I've been looking for an answer for like five hours straight; hope somebody can help. I have a MongoDB collection results (I'm using mLab) which looks like this:
{
    "user": "5818be9c74aaec1824c28626",
    "results": [
        {
            "game_id": 14578,
            "level1": -1,
            "level2": -1,
            "level3": -1
        },
        {
            ....
        }
    ]
},
{
    "user": ....
}
"user" is a MongoID I save in a previous part of the code, "results" is a record of scores. When an user does a new score, I have to update the score of the corresponding level (I'm using NodeJS).
This is one of the things I've tried so far.
app.get('/levelCompleted/:id/:time', function (request, response) {
    var id = request.params.id;
    var time = parseInt(request.params.time);
    var u = game.getUserById(id);
    var k = "results.$.level" + (u.level);
    // I build the key to update dynamically
    dbM.collection("results").update(
        {
            user: id,
            "results.game_id": u.game_id // u has its own game_id
        },
        { $set: { k: time } }
    );
    ...
    response.send(...);
});
I've checked the content of every variable and parameter, and I've also tried using $elemMatch and dot notation, and setting upsert and multi, with no results. I've used an identical command in the mongo shell and it worked on the first try.
If someone could tell me what I'm doing wrong or point me in the right direction, it would be great.
Thanks
When you use a Mongo id as a field in MongoDB, you can't just pass a string with the id in the query; you have to identify that string as an ObjectId (Mongo's id type). Just add a new require in your Node.js file:
var ObjectID = require("mongodb").ObjectID;
And use the imported constructor in your update request:
dbM.collection("results").update(
{user:ObjectID(id),...
...
}
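Putting it together inside the route from the question, the update might look like the sketch below. Note a second JavaScript pitfall in the original snippet: { $set: { k: time } } sets a field literally named k, so the dynamically built key needs ES6 computed-property syntax (the callback handling is illustrative):
// Sketch: the update with both fixes, the ObjectID wrapper and the computed key.
var ObjectID = require("mongodb").ObjectID;

var k = "results.$.level" + u.level; // e.g. "results.$.level2"

dbM.collection("results").update(
    {
        user: ObjectID(id),           // match the stored ObjectId, not the raw string
        "results.game_id": u.game_id  // positional match ($) inside the results array
    },
    { $set: { [k]: time } },          // computed property name, not the literal key "k"
    function (err, result) {
        if (err) return response.status(500).send(err);
        response.send(result);
    }
);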

Mongo check if a document already exists

In the MEAN app I'm currently building, the client-side makes a $http POST request to my API with a JSON array of soundcloud track data specific to that user. What I now want to achieve is for those tracks to be saved to my app database under a 'tracks' table. That way I'm then able to load tracks for that user from the database and also have the ability to create unique client URLs (/tracks/:track)
Some example data:
{
    artist: "Nicole Moudaber",
    artwork: "https://i1.sndcdn.com/artworks-000087731284-gevxfm-large.jpg?e76cf77",
    source: "soundcloud",
    stream: "https://api.soundcloud.com/tracks/162626499/stream.mp3?client_id=7d7e31b7e9ae5dc73586fcd143574550",
    title: "In The MOOD - Episode 14"
}
This data is then passed to the API like so:
app.post('/tracks/add/new', function (req, res) {
    var newTrack;
    for (var i = 0; i < req.body.length; i++) {
        newTrack = new tracksTable({
            for_user: req.user._id,
            title: req.body[i].title,
            artist: req.body[i].artist,
            artwork: req.body[i].artwork,
            source: req.body[i].source,
            stream: req.body[i].stream
        });
        tracksTable.find({ 'for_user': req.user._id, stream: req.body[i].stream }, function (err, trackTableData) {
            if (err)
                console.log('MongoDB Error: ' + err);
            // stuck here - read below
        });
    }
});
The point at which I'm stuck, as marked above, is this: I need to check whether that track already exists in the database for that user, and save it only if it doesn't. Then, once the loop has finished and all tracks have either been saved or ignored, a 200 response needs to be sent back to my client.
I've tried several methods so far and nothing seems to work; I've really hit a wall, so help/advice on this would be greatly appreciated.
Create a compound index and make it unique.
Using the index below will ensure that there are no two documents with the same for_user and stream.
trackSchema.index({ for_user: 1, stream: 1 }, { unique: true });
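The equivalent, if you prefer to create the index directly in the mongo shell (assuming the collection is named tracks):
// Same compound unique index, created from the shell.
db.tracks.createIndex({ for_user: 1, stream: 1 }, { unique: true })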
Now use MongoDB's batch insert to write multiple documents in one operation.
// docs is the array of tracks you are going to insert.
// { ordered: false } keeps going past duplicate-key errors,
// so one existing track doesn't abort the remaining inserts.
trackTable.collection.insert(docs, { ordered: false }, function (err, savedDocs) {
    // savedDocs is the array of docs saved.
    // By checking savedDocs you can see how many tracks were actually inserted.
});
Make sure to validate your objects, as by using .collection we are bypassing Mongoose.
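A sketch of the whole route following this approach (the error handling and status codes are illustrative):
// Sketch: build the docs, bulk-insert with the unique index as the duplicate guard,
// then answer the client once, after all tracks are either saved or ignored.
app.post('/tracks/add/new', function (req, res) {
    var docs = req.body.map(function (t) {
        return {
            for_user: req.user._id,
            title: t.title,
            artist: t.artist,
            artwork: t.artwork,
            source: t.source,
            stream: t.stream
        };
    });
    tracksTable.collection.insert(docs, { ordered: false }, function (err) {
        // Duplicate-key errors (code 11000) just mean some tracks already existed.
        if (err && err.code !== 11000) {
            return res.status(500).send(err);
        }
        res.sendStatus(200);
    });
});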
Make a unique _id based on user and track. In mongo you can pass in the _id that you want to use.
Example:
{
    _id: "NicoleMoudaber InTheMOODEpisode14",
    artist: "Nicole Moudaber",
    artwork: "https://i1.sndcdn.com/artworks-000087731284-gevxfm-large.jpg?e76cf77",
    source: "soundcloud",
    stream: "https://api.soundcloud.com/tracks/162626499/stream.mp3?client_id=7d7e31b7e9ae5dc73586fcd143574550",
    title: "In The MOOD - Episode 14"
}
_id must be unique, and Mongo won't let you insert another document with the same _id. You could also use this to find the record later: db.collection.find({ _id: "NicoleMoudaber InTheMOODEpisode14" }),
or you could find all tracks for a user with db.collection.find({ _id: /^NicoleMoudaber/ }), and it will still use the index.
There is another method for this that I can explain if you don't like this one.
Both query patterns will work in a sharded environment as well as a single replica set; "unique" secondary indexes, by contrast, do not work in a sharded environment unless the index is prefixed by the shard key.
The SoundCloud API provides a track id; just use it.
Then, before inserting the data, you do:
tracks.find({ id_soundcloud: 25645456 }).exec(function (err, track) {
    if (track.length) { /* do nothing */ } else { /* insert */ }
});
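Note that find-then-insert leaves a small race window between the two calls; here is a sketch of an alternative that lets Mongo do the existence check atomically, using an upsert (the field names follow the snippet above, the rest is illustrative):
// Sketch: atomic "insert if absent", keyed on the SoundCloud track id.
tracks.update(
    { id_soundcloud: 25645456 },
    { $setOnInsert: { title: "In The MOOD - Episode 14" /* ...other fields... */ } },
    { upsert: true },
    function (err, result) {
        if (err) console.log(err);
        // The write result reports whether a new document was created.
    }
);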
