node-mysql timing - javascript

I have a recursive query like this (note: this is just an example):
var user = function(data)
{
this.minions = [];
this.loadMinions = function()
{
_user = this;
database.query('select * from users where owner='+data.id,function(err,result,fields)
{
for(var m in result)
{
_user.minions[result[m].id] = new user(result[m]);
_user.minions[result[m].id].loadMinions();
}
});
console.log("loaded all minions");
}
}
currentUser = new user(ID);
for (var m in currentUser.minions)
{
console.log("minion found!");
}
This doesn't work because the timing is all wrong: the code doesn't wait for the query.
I've tried to do this:
var MyQuery = function(QueryString){
var Data;
var Done = false;
database.query(QueryString, function(err, result, fields) {
Data = result;
Done = true;
});
while(Done != true){};
return Data;
}
var user = function(data)
{
this.minions = [];
this.loadMinions = function()
{
_user = this;
result= MyQuery('select * from users where owner='+data.id);
for(var m in result)
{
_user.minions[result[m].id] = new user(result[m]);
_user.minions[result[m].id].loadMinions();
}
console.log("loaded all minions");
}
}
currentUser = new user(ID);
for (var m in currentUser.minions)
{
console.log("minion found!");
}
But it just freezes on the while loop. Am I missing something?

The first hurdle to solving your problem is understanding that I/O in Node.js is asynchronous. Once you know how this applies to your problem the recursive part will be much easier (especially if you use a flow control library like Async or Step).
Here is an example that does some of what you're trying to do (minus the recursion). Personally, I would avoid recursively loading a possibly unknown number/depth of records like that; instead, load them on demand, like in this example:
var User = function(data) {
this.data = data
this.minions;
};
User.prototype.getMinions = function(primaryCallback) {
var that = this; // scope handle
if(this.minions) { // bypass the db query if results cached
return primaryCallback(null, this.minions);
}
// Callback invoked by database.query when it has the records
var aCallback = function(error, results, fields) {
if(error) {
return primaryCallback(error);
}
// This is where you would put your recursive minion initialization
// The problem you are going to have is callback counting, using a library
// like async or step would make this part much, much easier
that.minions = results; // bypass the db query after this
primaryCallback(null, results);
}
database.query('SELECT * FROM users WHERE owner = ' + that.data.id, aCallback);
};
var user = new User(someData);
user.getMinions(function(error, minions) {
if(error) {
throw error;
}
// Inside the function invoked by primaryCallback(...)
minions.forEach(function(minion) {
console.log('found this minion:', minion);
});
});
The biggest thing to note in this example are the callbacks. The database.query(...) is asynchronous and you don't want to tie up the event loop waiting for it to finish. This is solved by providing a callback, aCallback, to the query, which is executed when the results are ready. Once that callback fires and after you perform whatever processing you want to do on the records you can fire the primaryCallback with the final results.
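The "callback counting" mentioned in the code comments can be sketched without a library. This is only an illustration under stated assumptions: `database` is a hypothetical in-memory stand-in for the real connection, `loadMinions` is a name borrowed from the question, and the `?` placeholder form of `query(sql, values, callback)` is assumed instead of string concatenation.

```javascript
// Hypothetical in-memory stand-in for `database`; rows are keyed by owner id.
var rowsByOwner = { 1: [{ id: 2 }, { id: 3 }], 2: [{ id: 4 }], 3: [], 4: [] };
var database = {
  query: function (sql, values, callback) {
    // Simulate I/O: answer on a later turn of the event loop.
    setTimeout(function () {
      callback(null, rowsByOwner[values[0]] || []);
    }, 0);
  }
};

var User = function (data) {
  this.data = data;
  this.minions = [];
};

// Loads the whole subtree, then calls done(error) exactly once.
User.prototype.loadMinions = function (done) {
  var that = this;
  database.query('SELECT * FROM users WHERE owner = ?', [this.data.id], function (error, results) {
    if (error) {
      return done(error);
    }
    var pending = results.length; // child subtrees still loading
    var failed = false;
    if (pending === 0) {
      return done(null);
    }
    results.forEach(function (row) {
      var minion = new User(row);
      that.minions.push(minion);
      minion.loadMinions(function (err) {
        if (failed) {
          return;
        }
        if (err) {
          failed = true;
          return done(err);
        }
        pending -= 1;
        if (pending === 0) {
          done(null); // every child subtree has finished
        }
      });
    });
  });
};

var root = new User({ id: 1 });
root.loadMinions(function (error) {
  if (error) {
    throw error;
  }
  console.log('loaded all minions:', root.minions.length);
});
```

The counter is what a flow control library would otherwise manage for you; the on-demand approach above avoids needing it at all.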

Each Node.js process is single-threaded, so the line
while(Done != true){};
takes over the thread, and the callback that would have set Done to true never gets run because the thread is blocked on an infinite loop.
You need to refactor your program so that code that depends on the results of the query is included within the callback itself. For example, make MyQuery take a callback argument:
MyQuery = function(QueryString, callback){
Then call the callback at the end of your database.query callback -- or even supply it as the database.query callback.
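A minimal sketch of that refactoring, assuming a hypothetical stub in place of the real `database` object so that it runs on its own (the stub and its rows are invented for illustration):

```javascript
// Hypothetical stand-in for the node-mysql connection.
var database = {
  query: function (sql, callback) {
    // Simulate I/O: the callback fires on a later turn of the event loop.
    setTimeout(function () {
      callback(null, [{ id: 1 }, { id: 2 }], []);
    }, 0);
  }
};

// MyQuery no longer returns the data; it hands it to the caller's callback.
var MyQuery = function (QueryString, callback) {
  database.query(QueryString, function (err, result, fields) {
    if (err) {
      return callback(err);
    }
    callback(null, result);
  });
};

// Code that depends on the rows lives inside the callback.
MyQuery('select * from users where owner=1', function (err, rows) {
  if (err) {
    return console.error(err);
  }
  console.log('rows loaded:', rows.length);
});
```

Nothing after the `MyQuery(...)` call can assume the rows exist yet; only the callback body can.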

The freezing is unfortunately correct behaviour, as Node is single-threaded.
You need a scheduler package to fix this. Personally, I have been using Fibers-promise for this kind of issue. You might want to look at this or another promise library, or at async.

Related

Call hierarchy of async functions inside a loop?

There's a async call I'm making that queries a database on a service, but this service has a limit of how many it can output at once, so I need to check if it hit its limit through the result it sends and repeat the query until it doesn't.
Synchronous mockup:
var query_results = [];
var limit_hit = true; // While this is true, the query hit the record limit
var start_from = 0; // Pagination parameter
while (limit_hit) {
Server.Query({start_from : start_from}, function(result){
limit_hit = result.limit_hit;
start_from = result.results.length;
query_results.push(result.results);
});
}
Obviously the above does not work. I've seen some other questions here about the issue, but they don't mention what to do when each iteration has to wait for the previous one to finish and you don't know the number of iterations beforehand.
How can I make the above asynchronous? I'm open to answers using promise/deferred-like logic, but preferably something clean.
I can probably think of a monstrous and horrible way of doing this using waits/timeouts, but there has to be a clean, clever and modern way to solve it.
Another way is to make a "pre-query" to learn the number of features beforehand so you know the number of loops, but I'm not sure if this is the correct way.
Here we use Dojo sometimes, but the examples I found do not explain what to do when you have an unknown number of loops: https://www.sitepen.com/blog/2015/06/10/dojo-faq-how-can-i-sequence-asynchronous-operations/
Although there are many answers already, I still believe async/await is the cleanest way.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/async_function
You might need Babel:
https://babeljs.io/
JavaScript's async syntax moved from callbacks to promises and then to async/await; they all do the same thing. When callbacks nest deeply we need something like a chain, so promises arrived. When promises are used in a loop we need something that makes the chain plainer and simpler, so async/await arrived. Not all browsers support the new syntax, so Babel compiles the new syntax to the old syntax, and you can always write code in the new syntax.
getData().then((data) => {
//do something with final data
})
async function getData() {
var query_result = [];
var limit_hit = true;
var start_from = 0;
//when you use await, handle error with try/catch
try {
while (limit_hit) {
const result = await loadPage(start_from)
limit_hit = result.limit_hit;
start_from = result.results.length;
query_result.push(result.results);
}
} catch (e) {
//when loadPage rejects
console.log(e)
return null
}
return query_result
}
async function loadPage(start_from) {
//when you use promise, handle error with reject
return new Promise((resolve, reject) => Server.Query({
start_from
}, (result, err) => {
//error reject
if (err) {
reject(err)
return
}
resolve(result)
}))
}
If you want to use a loop then I think there is no (clean) way to do it without Promises.
A different approach would be the following:
var query_result = [];
var start_from = 0;
function myCallback(result) {
if(!result) {
//first call
Server.Query({ start_from: start_from}, myCallback);
} else {
//repeated call
start_from = result.results.length
query_result.push(result.results);
if(result.limit_hit) {
//the record limit was hit, so more results remain
//repeat the query with new start value
Server.Query({ start_from: start_from}, myCallback);
} else {
//call some callback function here
}
}
}
myCallback(null);
You could call this recursive, but since the Query is asynchronous you shouldn't have problems with call stack limits etc.
Using promises in an ES6 environment you could make use of async/await. I'm not sure if this is possible with Dojo.
You don't understand callbacks until you have written a rate limiter or queue ;) The trick is to use a counter: Increment the counter before the async request, and decrement it when you get the response, then you will know how many requests are "in flight".
If the server is choked you want to put the item back in the queue.
There are many things you need to take into account:
What will happen to the queue if the process is killed?
How long to wait before sending another request?
Make sure the callback is not called many times!
How many times should you retry?
How long to wait before giving up?
Make sure there are no loose ends! (callback is never called)
When all edge cases are taken into account you will have a rather long and not so elegant solution. But you can abstract it into one function! (that returns a Promise or whatever you fancy).
If you have a user interface you also want to show a loading bar and some statistics!
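The in-flight counter described above can be sketched as a small queue runner. This is only an illustration: `runQueue`, `once`, and the timer-based worker are hypothetical names invented here, and retries, timeouts, and persistence are deliberately left out.

```javascript
// Wraps a callback so that a misbehaving worker cannot fire it twice.
function once(fn) {
  var called = false;
  return function () {
    if (called) return;
    called = true;
    return fn.apply(null, arguments);
  };
}

// Runs worker(item, cb) for every item with at most `limit` requests in flight.
function runQueue(items, limit, worker, done) {
  var queue = items.slice();
  var inFlight = 0; // requests sent but not yet answered
  var finished = false;

  function next() {
    if (finished) return;
    if (queue.length === 0 && inFlight === 0) {
      finished = true;
      return done(null);
    }
    while (inFlight < limit && queue.length > 0) {
      var item = queue.shift();
      inFlight += 1; // increment before the async request
      worker(item, once(function (err) {
        inFlight -= 1; // decrement when the response arrives
        if (err && !finished) {
          finished = true;
          return done(err);
        }
        next();
      }));
    }
  }

  next();
}

// Usage with a hypothetical worker that answers after a short delay.
runQueue([1, 2, 3, 4, 5], 2, function (item, cb) {
  setTimeout(function () { cb(null); }, 5);
}, function (err) {
  console.log(err ? 'failed' : 'all requests answered');
});
```

Wrapping this in a function that returns a Promise, as suggested above, would only change how `done` is delivered.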
You must wait for the server response every time. Here is an encapsulated method:
var query = (function(){
var results = [];
var count = 0;
return function check(fun){
Server.Query({ start_from: count}, function(d){
count = d.results.length;
results.push(d.results);
if (d.limit_hit) check(fun); // limit hit: more records remain, so query again
else if (fun) fun(results);
});
};
})();
// Call here
var my_query = query(function(d){
// --> receives all the data once limit_hit is false
});
You can use a generator function to achieve this.
As a proof of concept, some basics:
- You define a generator with an asterisk (*)
- The generator object exposes a next() function, which returns the next value
- Generators can pause internally with a yield statement and can be resumed externally by calling next()
- while (true) ensures that the generator is not done until the limit has been reached
function *limitQueries() {
let limit_hit = false;
let start_from = 0;
const query_result = [];
while (true) {
if (limit_hit) {break;}
yield Server.Query({start_from : start_from},
function* (result) {
limit_hit = result.limit_hit;
start_from = result.results.length;
yield query_result.push(result.results);
});
}
}
The generator function maintains its own state. Each call to next() on the generator object returns an object with two properties, { value, done }, and you can call it like this:
const gen = limitQueries();
let results = [];
let next = gen.next();
while(!next.done) {
next = gen.next();
}
results = next.value;
You might have to touch your Server.Query method to handle generator callback. Hope this helps! Cheers!
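One way to make the generator idea run end to end is to yield promises and let a small runner resume the generator when each one settles. The sketch below is an assumption-heavy illustration, not the service's real API: `Server` is a hypothetical stub that serves three pages, `queryPage` and `run` are invented helpers, and `start_from` is advanced by the page size, which is a guess about how the service paginates.

```javascript
// Hypothetical stand-in for Server.Query: three pages, the last clears limit_hit.
var Server = {
  Query: function (params, callback) {
    var page = params.start_from / 2;
    setTimeout(function () {
      callback({
        limit_hit: page < 2,
        results: [params.start_from, params.start_from + 1]
      });
    }, 0);
  }
};

// Wraps the callback API in a promise so the generator can yield it.
function queryPage(start_from) {
  return new Promise(function (resolve) {
    Server.Query({ start_from: start_from }, resolve);
  });
}

function* limitQueries() {
  var query_results = [];
  var limit_hit = true;
  var start_from = 0;
  while (limit_hit) {
    var result = yield queryPage(start_from); // paused until run() resumes us
    limit_hit = result.limit_hit;
    start_from += result.results.length; // assumed pagination scheme
    query_results = query_results.concat(result.results);
  }
  return query_results;
}

// Minimal runner: resumes the generator each time a yielded promise settles.
function run(generatorFn) {
  return new Promise(function (resolve, reject) {
    var gen = generatorFn();
    function step(value) {
      var next;
      try {
        next = gen.next(value);
      } catch (e) {
        return reject(e);
      }
      if (next.done) {
        return resolve(next.value);
      }
      Promise.resolve(next.value).then(step, reject);
    }
    step(undefined);
  });
}

run(limitQueries).then(function (all) {
  console.log('collected', all.length, 'results');
});
```

This is essentially what async/await does for you, which is why the async/await answer ends up shorter.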

node.js: structure multiple API requests, work on them and combine them

Currently I am struggling a little bit with Node.js (I am new to it): I make different API requests (Usabilla API), work on the results, and then combine them in order to work on the whole set (e.g. export).
Requesting the API is not the problem, but I can't get the results out to do some other stuff with them (asynchronous code drives me crazy).
Attached please find an overview of how I thought to do this. Maybe I am totally wrong about this, or maybe you have other, more elegant suggestions.
My code works until I have to request the two different API "addresses" (they are provided) and then extract the results to do some other stuff.
My problem here is that there are nested functions with a promise, and I can't figure out how to pass the result through the parent function inside waterfall so that it gets handled by the next function.
In the code, of course, there is nothing parallel as shown in the diagram.
That's another point: how do I do that? Simply nest parallel and series/another waterfall inside waterfall?
I am a little bit confused because this gets more and more complex for a problem that would be simple with synchronous code.
Here I build up all my request queries (at the moment 4):
function buildQuery(IDs,callback){
var i = 0;
var max = Object.keys(IDs).length;
async.whilst(
function(){return i < max},
function(callback){
FeedbackQuery[i] =
{
identifier: IDs[i].identifier,
query:
{id: IDs[i].id,
params: {since:sinceDate,}
}
};
i++;
callback(null,i);
})
console.log(FeedbackQuery);
callback (null,FeedbackQuery);
};
I then have to decide which type of query it is and add it to an object which should contain all the items of this identifier type:
function FeedbackRequest(FeedbackQuery,callback)
{
var i = 0;
var max = Object.keys(FeedbackQuery).length;
async.whilst(
function(){return i < max},
function (callback){
identifier = FeedbackQuery[i].identifier;
APIquery = FeedbackQuery[i].query;
switch(identifier)
{
case 'mobilePortal':
console.log(FeedbackQuery[i].identifier, 'aktiviert!');
var result = api.websites.buttons.feedback.get(APIquery);
result.then(function(feedback)
{
var item = Object.keys(feedbackResults).length;
feedbackResultsA[item] = feedback;
callback(null, feedbackResultsA);
})
break;
case 'apps':
console.log(FeedbackQuery[i].identifier, 'aktiviert!');
var result = api.apps.forms.feedback.get(APIquery);
result.then(function(feedback)
{
var item = Object.keys(feedbackResults).length;
feedbackResultsB[item] = feedback;
callback(null, feedbackResultsB);
})
break;
}
i++;
callback(null,i);
})
};
Currently the functions are bundled in an async waterfall:
async.waterfall([
async.apply(buildQuery,IDs2request),
FeedbackRequest,
// a function to do something on the whole feedbackResults array
],function (err, result) {
// result now equals 'done'
if (err) { console.log('Something is wrong!'); }
return console.log('Done!');
})
How it actually should be:
[Diagram: Structure]
Thank you very much for any tips or hints!
I'm not proficient with async, and I believe that if you're new to this, it's harder than a simple Promise library like bluebird combined with lodash for helpers.
What I would do, based on your diagram:
var firstStepRequests = [];
firstStepRequests.push(buildQuery());// construct your first steps queries, can be a loop, goal is to have firstStepRequests to be an array of promise.
Promise.all(firstStepRequests)
.then((allResults) => {
var type1 = _.filter(allResults, 'request_type_1');
var type2 = _.filter(allResults, 'request_type_2');
return {
type1: type1,
type2: type2
};
})
.then((result) => {
// result.type1 = ... do some work
// result.type2 = ... do some work
return result;
})
.then((result) => {
//export or merge or whatever.
});
Goal is to have a simple state machine.
UPDATE
If you want to keep an identifier for each request, you can use bluebird's Promise.props:
var props = {
id_1:Promise,
id_2:Promise,
id_3:Promise
};
Promise.props(props).then((results) => {
// results is {
// id_1: result of promise,
// id_2: result of promise,
// etc...
// }
})
You could do something like :
var type1Promises = getType1Requests(); //array of type 1
var type2Promises = getType2Requests(); // array of type 2
var props = {
type_1: Promise.all(type1Promises),
type_2: Promise.all(type2Promises)
}
Promise.props(props).then((result) => {
// result is {
// type_1: array of results of promises of type 1
// type_2: array of results of promises of type 2
// }
})
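If you would rather not pull in bluebird, the same keyed grouping can be approximated with native promises. This is a sketch under assumptions: `props` here is a hand-written helper (not bluebird's implementation), and `fakeRequest` is a hypothetical stand-in for the real API calls.

```javascript
// Resolves an object of promises to an object of results with the same keys.
function props(promisesByKey) {
  var keys = Object.keys(promisesByKey);
  var pending = keys.map(function (key) { return promisesByKey[key]; });
  return Promise.all(pending).then(function (values) {
    var out = {};
    keys.forEach(function (key, i) { out[key] = values[i]; });
    return out;
  });
}

// Hypothetical stand-in for one API request.
function fakeRequest(identifier, id) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve({ identifier: identifier, id: id }); }, 0);
  });
}

// All requests start immediately and run in parallel; results stay grouped by type.
props({
  mobilePortal: Promise.all([fakeRequest('mobilePortal', 1), fakeRequest('mobilePortal', 2)]),
  apps: Promise.all([fakeRequest('apps', 3)])
}).then(function (results) {
  console.log(results.mobilePortal.length, 'mobilePortal results,', results.apps.length, 'apps results');
}).catch(function (err) {
  console.error('Something is wrong!', err);
});
```

Because every promise is created before `props` is called, this also covers the "parallel" part of the diagram without nesting anything.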

This code doesn't seem to fire in order?

My problem is that the code does not seem to be running in order, as seen below.
This code is for my discord.js bot that I am creating.
var Discord = require("discord.js");
var bot = new Discord.Client();
var yt = require("C:/Users/username/Documents/Coding/Discord/youtubetest.js");
var youtubetest = new yt();
var fs = require('fs');
var youtubedl = require('youtube-dl');
var prefix = "!";
var vidid;
var commands = {
play: {
name: "!play ",
fnc: "Gets a Youtube video matching given tags.",
process: function(msg, query) {
youtubetest.respond(query, msg);
var vidid = youtubetest.vidid;
console.log(typeof(vidid) + " + " + vidid);
console.log("3");
}
}
};
bot.on('ready', () => {
console.log('I am ready!');
});
bot.on("message", msg => {
if(!msg.content.startsWith(prefix) || msg.author.bot || (msg.author.id === bot.user.id)) return;
var cmdraw = msg.content.split(" ")[0].substring(1).toLowerCase();
var query = msg.content.split("!")[1];
var cmd = commands[cmdraw];
if (cmd) {
var res = cmd.process(msg, query, bot);
if (res) {
msg.channel.sendMessage(res);
}
} else {
let msgs = [];
msgs.push(msg.content + " is not a valid command.");
msgs.push(" ");
msgs.push("Available commands:");
msgs.push(" ");
msg.channel.sendMessage(msgs);
msg.channel.sendMessage(commands.help.process(msg));
}
});
bot.on('error', e => { console.error(e); });
bot.login("mytoken");
The youtubetest.js file:
var youtube_node = require('youtube-node');
var ConfigFile = require("C:/Users/username/Documents/Coding/Discord/json_config.json");
var mybot = require("C:/Users/username/Documents/Coding/Discord/mybot.js");
function myyt () {
this.youtube = new youtube_node();
this.youtube.setKey(ConfigFile.youtube_api_key);
this.vidid = "";
}
myyt.prototype.respond = function(query, msg) {
this.youtube.search(query, 1, function(error, result) {
if (error) {
msg.channel.sendMessage("There was an error finding requested video.");
} else {
vidid = 'http://www.youtube.com/watch?v=' + result.items[0].id.videoId;
myyt.vidid = vidid;
console.log("1");
}
});
console.log("2");
};
module.exports = myyt;
As the code shows, I have an object for the commands that the bot will be able to process, and I have a function to run said commands when a message is received.
Throughout the code you can see that I have put three console.logs with 1, 2 and 3 showing in which order I expect the parts of the code to run. When the code is run and a query is found the output is this:
I am ready!
string +
2
3
1
This shows that the code is not running in the order I expect it to.
All help is very highly appreciated :)
*Update! Thank you all very much for helping me understand why it wasn't working. The issue is that in the main file, at vidid = youtubetest.respond(query, msg), the variable is not assigned until the function is done, so the rest of my code runs without it. To fix this I simply added an if statement that checks whether the variable is undefined and waits until it is defined.*
As mentioned before, a lot of stuff in JavaScript runs asynchronously, hence the callback handlers. It runs asynchronously to avoid the rest of your code being "blocked" by remote calls. To avoid ending up in callback hell, most of us JavaScript developers are moving more and more over to Promises. So your code could then look more like this:
myyt.prototype.respond = function(query, msg) {
var self = this; // keep a handle on the instance; `this` is not bound inside the plain functions below
return new Promise(function(resolve, reject) {
self.youtube.search(query, 1, function(error, result) {
if (error) {
reject("There was an error finding requested video."); // passed down to the ".catch" statement below
} else {
var vidid = 'http://www.youtube.com/watch?v=' + result.items[0].id.videoId;
self.vidid = vidid;
console.log("1");
resolve(vidid); // Resolve marks the promise as successfully completed, and passes the value along to the ".then" method
}
});
}).then(function(vidid) {
// vidid is now the same as self.vidid as above.
console.log("2");
return vidid; // hand the id on to whoever called respond()
}).catch(function(err) {
// err contains the rejection message from above
msg.channel.sendMessage(err);
})
};
This would naturally require a change in anything that uses this process, but creating your own prototypes seems... odd.
This promise returns the vidid, so you'd then set vidid = youtubetest.respond(query, msg);, and wherever you need the value, you do:
vidid.then(function(id) {
// id is now the vidid.
});
JavaScript runs async by design, and trying to hack your way around that leads you to dark places fast. As far as I can tell, you're also targeting Node.js, which means that once you start running something synchronously, you'll kill off performance for other users, as everyone has to wait for that sync call to finish.
Some suggested reading:
http://callbackhell.com/
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise
https://stackoverflow.com/a/11233849/3646975
I'd also suggest looking up ES6 syntax, as it shortens your code and makes life a hellofalot easier (native promises were only introduced in ES6, which Node.js 4 and above supports, more or less).
In JavaScript, please remember that a callback function you pass to an asynchronous function (I/O, timers, event emitters) is called later, so the calls may not happen "in order". "In order" in this case means the order in which they appear in the source file.
The callback function is simply called on a certain event:
when there is data to be processed
on error
in your case, for example, when the YouTube search results are ready, or when the 'ready' or 'message' event is received
etc.
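The ordering from the question can be reproduced in a few lines. In this sketch `search` is a hypothetical stand-in for the YouTube search call, with a timer playing the role of the network request:

```javascript
var order = [];

// Hypothetical async function: the callback runs on a later turn of the event loop.
function search(query, callback) {
  setTimeout(function () {
    callback(null, 'result for ' + query);
  }, 0);
}

search('cats', function (error, result) {
  order.push('1: search callback ran');
  console.log(order.join('\n'));
});
order.push('2: search() returned');
order.push('3: caller carried on');
```

Lines 2 and 3 are recorded before line 1 because the callback cannot run until the currently executing code has finished.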

Asynchronously Write Large Array of Objects to Redis with Node.js

I created a Node.js script that creates a large array of randomly generated test data and I want to write it to a Redis DB. I am using the redis client library and the async library. Initially, I tried executing a redisClient.hset(...) command within the for loop that generates my test data, but after some Googling, I learned the Redis method is asynchronous while the for loop is synchronous. After seeing some questions on StackOverflow, I can't get it to work the way I want.
I can write to Redis without a problem with a small array or larger, such as one with 100,000 items. However, it does not work well when I have an array of 5,000,000 items. I end up not having enough memory because the redis commands seem to be queueing up, but aren't executed until after async.each(...) is complete and the node process does not exit. How do I get the Redis client to actually execute the commands, as I call redisClient.hset(...)?
Here is a fragment of the code I am working with.
var redis = require('redis');
var async = require('async');
var redisClient = redis.createClient(6379, '192.168.1.150');
var testData = generateTestData();
async.each(testData, function(item, callback) {
var someData = JSON.stringify(item.data);
redisClient.hset('item:'+item.key, 'hashKey', someData, function(err, reply) {
console.log("Item was persisted. Result: " +reply);
});
callback();
}, function(err) {
if (err) {
console.error(err);
} else {
console.info("Items have been persisted to Redis.");
}
});
You could call eachLimit to ensure you are not executing too many redisClient.hset calls at the same time.
To avoid overflowing the call stack you could do setTimeout(callback, 0); instead of calling the callback directly.
edit:
Forget what I said about setTimeout. All you need to do is call the callback at the right place. Like so:
redisClient.hset('item:'+item.key, 'hashKey', someData, function(err, reply) {
console.log("Item was persisted. Result: " +reply);
callback();
});
You may still want to use eachLimit and try out which limit works best.
By the way - async.each is supposed to be used only on code that schedules the invocation of the callback in the javascript event queue (e.g. timer, network, etc) . Never use it on code that calls the callback immediately as was the case in your original code.
edit:
You can implement your own eachLimit function that takes a generator instead of an array as its first argument. Then you write a generator function to create the test data. For that to work, node needs to be run with "node --harmony code.js".
function eachLimit(generator, limit, iterator, callback) {
var isError = false, j;
function startNextSetOfActions() {
var elems = [];
for(var i = 0; i < limit; i++) {
j = generator.next();
if(j.done) break;
elems.push(j.value);
}
var activeActions = elems.length;
if(activeActions === 0) {
callback(null);
}
elems.forEach(function(elem) {
iterator(elem, function(err) {
if(isError) return;
else if(err) {
callback(err);
isError = true;
return;
}
activeActions--;
if(activeActions === 0) startNextSetOfActions();
});
});
}
startNextSetOfActions();
}
function* testData() {
while(...) {
yield new Data(...);
}
}
eachLimit(testData(), 10, function(item, callback) {
var someData = JSON.stringify(item.data);
redisClient.hset('item:'+item.key, 'hashKey', someData, function(err, reply) {
if(err) callback(err);
else {
console.log("Item was persisted. Result: " +reply);
callback();
}
});
}, function(err) {
if (err) {
console.error(err);
} else {
console.info("Items have been persisted to Redis.");
}
});

Node module: Don't return until all async requests have finished

I'm new to node and am having trouble understanding node's async behavior. I know this is a very frequently addressed question on SO, but I simply can't understand how to get any of the solutions I've read to work in my context.
I'm writing this module which I want to return an object containing various data.
var myModule = (function () {
var file,
fileArray,
items = [],
getBlock = function (fileArray) {
//get the data from the file that I want, return object
return block;
},
parseBlock = function (block) {
//[find various items in the block, put them into an "myItems" object, then
//take the items and do a look up against a web api as below]...
for (var i = 0, l = myItems.length; i < l; i ++) {
(function (i) {
needle.post(MY_URL, qstring, function(err, resp, body){
if (!err && resp.statusCode === 200){
myItems[i].info = body;
if (i === (myItems.length -1)) {
return myItems;
}
}
});
})(i);
}
},
getSomeOtherData = function (fileArray) {
//parse some other data from the file
}
return {
setFile: function (file) {
fileArray = fs.readFileSync(file).toString().split('\n');
},
render: function () {
var results = [];
results.someOtherData = getsomeOtherData();
var d = getBlock();
results.items = parseBlock(d);
return results;
}
}
})();
When I call this module using:
myModule.setFile('myFile.txt');
var res = myModule.render();
the variable res has the values from the someOtherData property, but not the items property. I understand that my long-running http request has not completed and that node just zooms ahead and finishes executing, but that's not what I want. I looked at a bunch of SO questions on this, and looked at using Q or queue-async, but with no success.
How do I get this module to return no data until all requests have completed? Or is that even possible in node? Is there a better way to design this to achieve my goal?
The problem in your example is that you're calling getBlock() but you have declared your function as getBlockData(), so you will not get a result. Try giving both the same name.
Presuming that you have them both the same, your next problem is that you're processing data from a file, so I presume that you're reading the contents of the file and then parsing it.
If this is the case, then there are sync reads that you can use to force synchronous behaviour; however, I wouldn't recommend this.
You really want to structure your program based on events. You're thinking in the paradigm of 'call a function, when it returns continue'. You need to be thinking more along the lines of 'call a process and add a listener, the listener then does reply handling'.
This works very well for comms. You receive a request. You need to reply based on contents of file. So you start the read process with two possible results. It calls the completed function or the error function. Both would then call the reply function to process how to handle a reply for the request.
It's important not to block, as you will be blocking the thread via which all processes are handled.
Hope that helps; if not, add some comments and I will try to elaborate.
Have a look at this answer to another question to see a good example of processing a file using the standard listeners. All async calls have a listener concept for what can happen. All you need to do is pass a function name (or anon if you prefer) to them when you call them.
A quick example (based on the Node.js stream.Readable API):
fs.createReadStream(filename, {
'flags': 'r'
}).addListener( "data", function(chunk) {
// do your processing logic
}).addListener( "end", function() {
// do your end logic
response(...);
}).addListener( "error", function(err) {
// do your error logic
response(...);
}).addListener( "close",function() {
// do your close logic
});
function response(info) {
}
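Applied to the module in the question, the same idea means render() cannot return the items; it has to hand them to a callback once every lookup has answered. The following is a rough sketch under assumptions: `lookup` is a hypothetical stand-in for needle.post, and the item list is hard-coded where the real module would parse it from the file.

```javascript
// Hypothetical stand-in for needle.post(MY_URL, qstring, callback).
function lookup(item, callback) {
  setTimeout(function () {
    callback(null, { statusCode: 200 }, 'info for ' + item.name);
  }, 0);
}

// Starts one lookup per item and calls done(err, items) when all have answered.
function parseBlock(myItems, done) {
  var lookups = myItems.map(function (item) {
    return new Promise(function (resolve, reject) {
      lookup(item, function (err, resp, body) {
        if (err || resp.statusCode !== 200) {
          return reject(err || new Error('unexpected status ' + resp.statusCode));
        }
        item.info = body;
        resolve(item);
      });
    });
  });
  Promise.all(lookups).then(function (items) { done(null, items); }, done);
}

// render() takes a callback instead of returning the results.
function render(callback) {
  var results = { someOtherData: 'parsed synchronously' };
  parseBlock([{ name: 'a' }, { name: 'b' }], function (err, items) {
    if (err) {
      return callback(err);
    }
    results.items = items;
    callback(null, results);
  });
}

render(function (err, res) {
  if (err) {
    return console.error(err);
  }
  console.log('items ready:', res.items.length);
});
```

Checking `i === myItems.length - 1` as in the question is not reliable, because responses can arrive in any order; waiting for all of them avoids that.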
