Async operations inside a telemetry processor with Application Insights Node.js - javascript

What I would like to do is add custom properties to telemetry data as it leaves my application. Currently I am achieving this using a Telemetry Processor, however ideally I would like to read the value to be sent with the event from a database.
Is it possible to perform async operations inside a telemetry processor?
var TraceProcessor = function (app) {
  return function (envelope) {
    var i;
    var objTelemetryController = app.telemetryController;
    objTelemetryController.__proto__.getActiveTraces('GLOBAL', function (err, objTraces) {
      if (err) {
        // Error controller log error
        return;
      }
      if (objTraces) {
        for (i = 0; i < objTraces.length; i++) {
          envelope.data.baseData.properties['TraceProperty'] = objTraces[i];
        }
        return true;
      }
    });
  };
};
module.exports = TraceProcessor;
module.exports = TraceProcessor;
Using this code, the telemetry data is not sent, because Application Insights requires true to be returned from any telemetry processors that are in use. The asynchronous callback does eventually return true, but the processor itself has already returned by then, so the item is dropped before the properties can be added.

I think it's better to use a Telemetry Initializer to enrich the telemetry data with extra information; the purpose of a Telemetry Processor is skewed more towards filtering than data enrichment.
However, I think that if you try to call a SQL or HTTP dependency from within the Telemetry Initializer, it might go into an endless cycle:
A telemetry item is processed in the initializer
The initializer starts a SQL query
Application Insights detects the SQL query and starts processing a telemetry item about it
The Telemetry Initializer calls into SQL again....
I doubt that async is really supported here at the moment. It could have helped (e.g. returning a task and waiting for the value to be filled in), but it would require an extensive investigation to cover all the cases.
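One workaround that keeps the processor synchronous is to hold the trace values in an in-memory cache and refresh that cache asynchronously, outside the processor. A minimal sketch, assuming the same getActiveTraces callback API as in the question; the refresh interval is an arbitrary placeholder:
var TraceProcessor = function (app) {
  var cachedTraces = [];

  // Refresh the cache asynchronously, outside the processor
  // (assumes the same getActiveTraces callback API as above).
  var refresh = function () {
    app.telemetryController.getActiveTraces('GLOBAL', function (err, objTraces) {
      if (!err && objTraces) {
        cachedTraces = objTraces;
      }
    });
  };
  refresh();
  setInterval(refresh, 30000); // hypothetical refresh period

  // The processor itself stays synchronous and can return true immediately.
  return function (envelope) {
    for (var i = 0; i < cachedTraces.length; i++) {
      envelope.data.baseData.properties['TraceProperty'] = cachedTraces[i];
    }
    return true;
  };
};
module.exports = TraceProcessor;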

Related

Can't read data from collection in MongoDB Atlas Trigger

New to MongoDB, very new to Atlas. I'm trying to set up a trigger such that it reads all the data from a collection named Config. This is my attempt:
exports = function(changeEvent) {
  const mongodb = context.services.get("Cluster0");
  const db = mongodb.db("TestDB");
  var collection = db.collection("Config");
  config_docs = collection.find().toArray();
  console.log(JSON.stringify(config_docs));
}
The function is part of an automatically created Realm application called Triggers_RealmApp, which has Cluster0 as a named linked data source. When I go into Collections in Cluster0, TestDB.Config is one of the collections.
Some notes:
It's not throwing an error, but simply returning {}.
When I change context.services.get("Cluster0"); to something else, it throws an error.
When I change "TestDB" to a db that doesn't exist, or "Config" to a collection that doesn't exist, I get the same output: {}.
I've tried creating new Realm apps, manually creating services, creating new databases and new collections, etc. I keep bumping into the same issue.
The mongo docs reference promises and awaits, which I haven't seen in any examples (link). I tried experimenting with that a bit and got nowhere. From what I can tell, what I've already done is the typical way of doing it.
(Screenshots of the collection and the linked data source omitted.)
I ended up taking it up with MongoDB directly: .find() is asynchronous and I was handling it incorrectly. Here is the reply straight from the horse's mouth:
As I understand it, you are not getting your expected results from the query you posted above. I know it can be confusing when you are just starting out with a new technology and can't get something to work!
The issue is that the collection.find() function is an asynchronous function. That means it sends out the request but does not wait for the reply before continuing. Instead, it returns a Promise, which is an object that describes the current status of the operation. Since a Promise really isn't an array, your statement collection.find().toArray() is returning an empty object. You write this empty object to the console.log and end your function, probably before the asynchronous call even returns with your data.
There are a couple of ways to deal with this. The first is to make your function an async function and use the await operator to tell your function to wait for the collection.find() function to return before continuing.
exports = async function(changeEvent) {
  const mongodb = context.services.get("Cluster0");
  const db = mongodb.db("TestDB");
  var collection = db.collection("Config");
  config_docs = await collection.find().toArray();
  console.log(JSON.stringify(config_docs));
};
Notice the async keyword on the first line, and the await keyword on the second to last line.
The second method is to use the .then function to process the results when they return:
exports = function(changeEvent) {
  const mongodb = context.services.get("Cluster0");
  const db = mongodb.db("TestDB");
  var collection = db.collection("Config");
  collection.find().toArray().then(config_docs => {
    console.log(JSON.stringify(config_docs));
  });
};
The connection has to be a connection to the primary replica set, and the login credentials must be those of an admin-level user (one with the cluster admin permission).

In Meteor, can pub/sub be used for arbitrary in-memory objects (not a Mongo collection)?

I want to establish a two-way (bidirectional) communication within my meteor app. But I need to do it without using mongo collections.
So can pub/sub be used for arbitrary in-memory objects?
Is there a better, faster, or lower-level way? Performance is my top concern.
Thanks.
Yes, pub/sub can be used for arbitrary objects. Meteor’s docs even provide an example:
// server: publish the current size of a collection
Meteor.publish("counts-by-room", function (roomId) {
  var self = this;
  check(roomId, String);
  var count = 0;
  var initializing = true;

  // observeChanges only returns after the initial `added` callbacks
  // have run. Until then, we don't want to send a lot of
  // `self.changed()` messages - hence tracking the
  // `initializing` state.
  var handle = Messages.find({roomId: roomId}).observeChanges({
    added: function (id) {
      count++;
      if (!initializing)
        self.changed("counts", roomId, {count: count});
    },
    removed: function (id) {
      count--;
      self.changed("counts", roomId, {count: count});
    }
    // don't care about changed
  });

  // Instead, we'll send one `self.added()` message right after
  // observeChanges has returned, and mark the subscription as
  // ready.
  initializing = false;
  self.added("counts", roomId, {count: count});
  self.ready();

  // Stop observing the cursor when client unsubs.
  // Stopping a subscription automatically takes
  // care of sending the client any removed messages.
  self.onStop(function () {
    handle.stop();
  });
});

// client: declare collection to hold count object
Counts = new Mongo.Collection("counts");

// client: subscribe to the count for the current room
Tracker.autorun(function () {
  Meteor.subscribe("counts-by-room", Session.get("roomId"));
});

// client: use the new collection
console.log("Current room has " +
  Counts.findOne(Session.get("roomId")).count +
  " messages.");
In this example, counts-by-room publishes an arbitrary object created from data returned by Messages.find(), but you could just as easily get your source data elsewhere and publish it in the same way. You just need to provide the same added and removed callbacks as in the example here.
You’ll notice that on the client there’s a collection called counts, but this is purely in-memory on the client; it’s not saved in MongoDB. I think this is necessary to use pub/sub.
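To make that concrete, here is a minimal sketch of publishing a plain in-memory object with no server-side collection at all; the stats object and publication name are hypothetical:
// server: publish a plain in-memory object (no server-side collection)
var stats = { hits: 0 };

Meteor.publish("server-stats", function () {
  var self = this;

  // send the initial document down the wire
  self.added("stats", "global", stats);
  self.ready();

  // push a change to subscribers once a second
  var interval = Meteor.setInterval(function () {
    stats.hits++;
    self.changed("stats", "global", { hits: stats.hits });
  }, 1000);

  self.onStop(function () {
    Meteor.clearInterval(interval);
  });
});

// client: an unmanaged local collection receives the published documents
Stats = new Mongo.Collection("stats");
Meteor.subscribe("server-stats");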
If you want to avoid even an in-memory-only collection, you should look at Meteor.call. You could create a Meteor method like getCountsByRoom(roomId), call it from the client with Meteor.call('getCountsByRoom', 123), and the method will execute on the server and return its response. This is closer to the traditional Ajax way of doing things, and you lose all of Meteor's reactivity.
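And a minimal sketch of that method-based alternative, assuming the same Messages collection as the example above:
// server: a plain method instead of a publication
Meteor.methods({
  getCountsByRoom: function (roomId) {
    check(roomId, String);
    return Messages.find({roomId: roomId}).count();
  }
});

// client: one-off request/response, no reactivity
Meteor.call("getCountsByRoom", Session.get("roomId"), function (err, count) {
  if (!err) console.log("Current room has " + count + " messages.");
});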
Just to add another easy solution: you can pass connection: null to your Collection instantiation on the server. Even though this is not well documented, I heard from the Meteor folks that this makes the collection in-memory.
Here's an example code posted by Emily Stark a year ago:
if (Meteor.isClient) {
  Test = new Meteor.Collection("test");
  Meteor.subscribe("testsub");
}

if (Meteor.isServer) {
  Test = new Meteor.Collection("test", { connection: null });
  Meteor.publish("testsub", function () {
    return Test.find();
  });

  Test.insert({ foo: "bar" });
  Test.insert({ foo: "baz" });
}
Edit
This should go under a comment, but I found it could be too long, so I'm posting it as an answer. Or perhaps I misunderstood your question?
I wonder why you are against Mongo; I find it a good match with Meteor.
Anyway, everyone's use case is different, and your idea is doable, but not without some serious hacks.
If you look at the Meteor source code, you can find tools/run-mongo.js, which is where Meteor talks to Mongo; you could tweak it or implement your own adaptor to work with your in-memory objects.
Another approach I can think of would be to wrap your in-memory objects and write a database layer to intercept the existing Mongo database communications (default port 27017); you would have to take care of all the system environment variables like MONGO_URL etc. to make it work properly.
The final approach is to wait until Meteor officially supports other databases like Redis.
Hope this helps.

Deps autorun in Meteor JS

I decided to test out Meteor JS today to see if I would be interested in building my next project with it, and started out with the Deps library.
To get something up extremely quickly to test this feature out, I am using the 500px API to simulate changes. After reading through the docs quickly, I thought I would have a working example of it on my local box.
The function seems to only autorun once, which is not how it is supposed to work based on my initial understanding of this feature in Meteor.
Any advice would be greatly appreciated. Thanks in advance.
if (Meteor.isClient) {
  var Api500px = {
    dep: new Deps.Dependency,
    get: function () {
      this.dep.depend();
      return Session.get('photos');
    },
    set: function (res) {
      Session.set('photos', res.data.photos);
      this.dep.changed();
    }
  };

  Deps.autorun(function () {
    Api500px.get();
    Meteor.call('fetchPhotos', function (err, res) {
      if (!err) Api500px.set(res);
      else console.log(err);
    });
  });

  Template.photos.photos = function () {
    return Api500px.get();
  };
}

if (Meteor.isServer) {
  Meteor.methods({
    fetchPhotos: function () {
      var url = 'https://api.500px.com/v1/photos';
      return HTTP.call('GET', url, {
        params: {
          consumer_key: 'my_consumer_key_here',
          feature: 'fresh_today',
          image_size: 2,
          rpp: 24
        }
      });
    }
  });
}
Welcome to Meteor! A couple of things to point out before the actual answer...
Session variables have reactivity built in, so you don't need to use the Deps package to add Deps.Dependency properties when you're using them. This isn't to suggest you shouldn't roll your own reactive objects like this, but if you do, the get and set functions should return and update a normal JavaScript property of the object (like value, for example) rather than a Session variable, with the reactivity provided by the depend and changed methods of the dep property (sketched below). The alternative would be to just use the Session variables directly and not bother with the Api500px object at all.
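A minimal sketch of that roll-your-own version, holding the photos in a plain value property instead of a Session variable:
var Api500px = {
  dep: new Deps.Dependency,
  value: null,
  get: function () {
    // register the calling computation as a dependent
    this.dep.depend();
    return this.value;
  },
  set: function (res) {
    this.value = res.data.photos;
    // rerun any computations that called get()
    this.dep.changed();
  }
};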
It's not clear to me what you're trying to achieve reactively here - apologies if it should be. Are you intending to run fetchPhotos repeatedly in an infinite loop, such that every time a result is returned the function gets called again? If so, it's really not the best way to do things - it would be much better to subscribe to a server publication (using Meteor.subscribe and Meteor.publish), have the publication function run the API call at whatever regularity is required, and then publish the results to the client (see the sketch below). That would dramatically reduce client-server communication with the same net result.
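Here is a rough sketch of that publication-based approach; the polling interval, publication name, and document shape are all assumptions:
// server: poll the API on an interval and publish the latest result
Meteor.publish('photos', function () {
  var self = this;
  var initialized = false;

  var poll = function () {
    var res = HTTP.call('GET', 'https://api.500px.com/v1/photos', {
      params: {consumer_key: 'my_consumer_key_here', feature: 'fresh_today'}
    });
    if (!initialized) {
      self.added('photos', 'latest', {photos: res.data.photos});
      initialized = true;
    } else {
      self.changed('photos', 'latest', {photos: res.data.photos});
    }
  };

  poll();
  var interval = Meteor.setInterval(poll, 60 * 1000); // hypothetical regularity

  self.onStop(function () {
    Meteor.clearInterval(interval);
  });
  self.ready();
});

// client: subscribe once; template helpers on the collection stay reactive
Photos = new Mongo.Collection('photos');
Meteor.subscribe('photos');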
Having said all that, why would it only run once? The two possible explanations that spring to mind are that an error is being returned (so Api500px.set is never called), or the fact that a Session.set call doesn't actually fire a dependency-changed event if the new value is the same as the existing value. In the latter case, though, I would still expect your function to run repeatedly: you have your own depend and changed structure surrounding the Session variable, which does not implement that self-limiting logic, so having Api500px.get in the autorun should make it rerun whenever Api500px.set returns, even if the Session.set inside it isn't actually doing anything. If it's not the former diagnosis, I'd just log everything in sight and the answer should present itself.

What is the recommended way to drop indexes using Mongoose?

I need to create several deployment scripts, like data migrations and fixtures, for a MongoDB database, and I couldn't find enough information about how to drop indexes using the Mongoose API. This is pretty straightforward when using the official MongoDB API:
To delete all indexes on the specified collection:
db.collection.dropIndexes();
However, I would like to use Mongoose for this and I tried to use executeDbCommand adapted from this post, but with no success:
mongoose.connection.db.executeDbCommand({ dropIndexes: collectionName, index: '*' },
function(err, result) { /* ... */ });
Should I use the official MongoDB API for Node.js, or did I just miss something in this approach?
To do this via the Mongoose model for the collection, you can call dropAllIndexes of the native collection:
MyModel.collection.dropAllIndexes(function (err, results) {
// Handle errors
});
Update
dropAllIndexes is deprecated in the 2.x version of the native driver, so dropIndexes should be used instead:
MyModel.collection.dropIndexes(function (err, results) {
// Handle errors
});
If you want to maintain your indexes in your schema definitions with Mongoose (you probably do if you're using Mongoose), you can easily drop the ones no longer in use and create indexes that don't exist yet. You can just run a one-off await YourModel.syncIndexes() on any models you need to sync. It will create missing indexes in the background (with .ensureIndexes) and drop any that no longer exist in your schema definition. You can look at the full docs here:
https://mongoosejs.com/docs/api.html#model_Model.syncIndexes
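A minimal sketch of that workflow, with a hypothetical model and index definitions (the connection string and field names are placeholders):
const mongoose = require('mongoose');

// Indexes live in the schema definition; anything not declared here
// will be dropped by syncIndexes().
const userSchema = new mongoose.Schema({
  email: { type: String, unique: true },
  createdAt: Date
});
userSchema.index({ createdAt: -1 });

const User = mongoose.model('User', userSchema);

async function syncAll() {
  await mongoose.connect('mongodb://localhost/test'); // placeholder URI
  // Drops indexes missing from the schema, creates the declared ones.
  await User.syncIndexes();
}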
It looks like you're attempting to drop all of the indexes on a given collection.
According to the MongoDB Docs, this is the correct command.
... I tried to use executeDbCommand adapted from this post, but with no success:
To really help here, we need more details:
What failed? How did you measure "no success"?
Can you confirm 100% that the command ran? Did you output to the logs in the callback? Did you check the err variable?
Where are you creating indexes? Can you confirm that you're not re-creating them after dropping?
Have you tried the command while listing specific index names? Honestly, you should not be using "*". You should be deleting and creating very specific indexes.
This might not be the best place to post this, but I think it's worth posting anyway.
I call model.syncIndexes() every time a model is defined/created against the db connection. This ensures the indexes are current and up to date with the schema. However, as has been highlighted online (example), it can create issues in distributed architectures, where multiple servers attempt the same operation at the same time. This is particularly relevant when using something like the cluster library to spawn master/slave instances on multiple cores of the same machine, since they often boot up in close proximity to each other when the whole server is started.
In reference to the above 'codebarbarian' article, the issue is highlighted clearly when they state:
Mongoose does not call syncIndexes() for you, you're responsible for calling syncIndexes() on your own. There are several reasons for this, most notably that syncIndexes() doesn't do any sort of distributed locking. If you have multiple servers that call syncIndexes() when they start, you might get errors due to trying to drop an index that no longer exists.
So what I do is create a function which uses Redis and Redis redlock to gain a lease for some nominal period of time, to prevent multiple workers (and indeed workers on multiple servers) from attempting the same sync operation at the same time.
It also bypasses the whole thing unless it is the 'master' that is trying to perform the operation; I don't see any real point in delegating this job to any of the workers.
const cluster = require('cluster');
const {logger} = require("$/src/logger");
const {
  redlock,
  LockError
} = require("$/src/services/redis");
const mongoose = require('mongoose');

// Check is mongoose model,
// ref: https://stackoverflow.com/a/56815793/1834057
const isMongoModel = (obj) => {
  return obj.hasOwnProperty('schema') && obj.schema instanceof mongoose.Schema;
}

const syncIndexesWithRedlock = (model, duration = 60000) => new Promise(resolve => {
  // Ensure the cluster is master
  if (!cluster.isMaster)
    return resolve(false);

  // Now attempt to gain redlock and sync indexes
  try {
    // Typecheck
    if (!model || !isMongoModel(model))
      throw new Error('model argument is required and must be a mongoose model');
    if (isNaN(duration) || duration <= 0)
      throw new Error('duration argument is required, and must be positive numeric');

    // Extract name
    let name = model.collection.collectionName;

    // Define the redlock resource
    let resource = `syncIndexes/${name}`;

    // Coerce Duration to Integer
    // Not sure if this is strictly required, but wtf.
    // Will ensure the duration is at least 1ms, given that duration <= 0 throws error above
    let redlockLeaseDuration = Math.ceil(duration);

    // Attempt to gain lock and sync indexes
    redlock.lock(resource, redlockLeaseDuration)
      .then(() => {
        // Sync Indexes
        model.syncIndexes();
        // Success
        resolve(true);
      })
      .catch(err => {
        // Report Lock Error
        if (err instanceof LockError) {
          logger.error(`Redlock LockError -- ${err.message}`);
        // Report Other Errors
        } else {
          logger.error(err.message);
        }
        // Fail, either LockError or some other error
        return resolve(false);
      });

  // General fail for whatever reason
  } catch (err) {
    logger.error(err.message);
    return resolve(false);
  }
});
I won't go into setting up the Redis connection; that is the subject of some other thread. But the point of the above code is to show how you can use syncIndexes() reliably and prevent one thread dropping an index while another tries to drop the same one, or other distributed issues arising from attempting to modify indexes concurrently.
To drop a particular index you could use:
db.users.dropIndex("your_index_name_here")
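The Mongoose equivalent goes through the model's underlying native collection; here the User model and the index name are hypothetical:
// drops one named index via the model's native collection
User.collection.dropIndex("your_index_name_here", function (err, result) {
  if (err) console.error(err);
});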

Node.js global variable property is purged

My problem is not about "memory leakage" but about the "memory purging" of a node.js (expressjs) app.
My app should maintain some objects in memory for fast look-ups during the service. For a while (one or two days) after starting the app, everything seemed fine, until suddenly my web client failed to look up an object because it had been purged (undefined). I suspect the JavaScript GC (garbage collection). However, as you can see in the pseudo-code, I assigned the objects to node.js "global" variable properties to prevent the GC from purging them. Please give me some clue as to what caused this problem.
Thanks much in advance for your kind advice~
My node.js environments are node.js 0.6.12, expressjs 2.5.8, and VMWare cloudfoundry node hosting.
Here is my app.js pseudo-code :
var express = require("express");
var app = module.exports = express.createServer();

// myMethods holds a set of methods to be used for handling raw data.
var myMethods = require("myMethods");

// creates node.js global properties referencing objects to prevent GC from purging them
global.myMethods = myMethods();
global.myObjects = {};

// omitted the express configurations

// creates objects (data1, data2) inside global.myObjects for the user by id.
app.post("/createData/:id", function (req, res) {
  // creates an empty object for the user.
  var myObject = global.myObjects[req.params.id] = {};

  // gets json data.
  var data1 = JSON.parse(req.body.data1);
  var data2 = JSON.parse(req.body.data2);

  // buildData1 & buildData2 functions transform data1 & data2 into usable objects.
  // these functions return references to the transformed objects.
  myObject.data1 = global.myMethods.buildData1(data1);
  myObject.data2 = global.myMethods.buildData2(data2);

  res.send("Created new data", 200);
});

// returns the data1 of the user.
// Problem occurs here : myObject becomes "undefined" after one or two days running the service.
app.get("/getData1/:id", function (req, res) {
  var myObject = global.myObjects[req.params.id];
  if (myObject !== undefined) {
    res.json(myObject.data1);
  } else {
    res.send(500);
  }
});

// omitted other service callback functions.

// VMWare cloudfoundry node.js hosting.
app.listen(process.env.VCAP_APP_PORT || 3000);
Any kind of cache system (whether it is roll-your-own or a third-party product) should account for this scenario. You should not rely on the data always being available in an in-memory cache. There are way too many things that can cause in-memory data to be gone (machine restart, process restart, et cetera).
In your case, you might need to update your code to check whether the data is in the cache. If it is not, fetch it from persistent storage (a database, a file), cache it, and continue.
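A minimal sketch of that read-through pattern; fetchFromDatabase is a placeholder for whatever persistent storage you use:
var cache = {};

// placeholder for your persistent storage lookup (database, file, ...)
function fetchFromDatabase(id, callback) {
  // e.g. db.collection("objects").findOne({ _id: id }, callback);
  callback(null, null);
}

function getObject(id, callback) {
  // serve from the in-memory cache when possible
  if (cache[id] !== undefined) {
    return callback(null, cache[id]);
  }
  // otherwise fall back to persistent storage and repopulate the cache
  fetchFromDatabase(id, function (err, obj) {
    if (err) return callback(err);
    if (obj) cache[id] = obj;
    callback(null, obj);
  });
}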
Exactly like Haesung, I wanted to keep my program simple, without a database. And like Haesung, my first experience with Node.js (and Express) was observing this weird purging. Although I was confused, I really didn't accept that I needed a storage solution to manage a JSON file with a couple of hundred lines. The light-bulb moment for me was when I read this:
If you want to have a module execute code multiple times, then export a function, and call that function.
which is taken from http://nodejs.org/api/modules.html#modules_caching. So my code inside the required file changed from this
var foo = [{"some":"stuff"}];
exports.foo = foo;
to that
exports.foo = function (bar) {
  var foo = [{"some":"stuff"}];
  return foo.bar;
};
And then it worked fine :-)
Then I suggest using the file system; I think a 4KB overhead is not a big deal for your goals and hardware. If you're familiar with front-end JavaScript, this could be helpful: https://github.com/coolaj86/node-localStorage
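For completeness, a minimal sketch of plain file-system persistence with Node's built-in fs module; the data.json path is arbitrary:
var fs = require("fs");
var FILE = __dirname + "/data.json"; // arbitrary location

// persist the objects to disk whenever they change
function save(objects) {
  fs.writeFileSync(FILE, JSON.stringify(objects));
}

// reload them on startup, so the data survives restarts and purges
function load() {
  try {
    return JSON.parse(fs.readFileSync(FILE, "utf8"));
  } catch (e) {
    return {}; // no file yet
  }
}

global.myObjects = load();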
