Need help with riak-js - javascript

I'm a newbie with node.js and riak, trying to use riak-js. I wrote the following coffeescript, to create N entries with the squares of integers 1..N. The script works fine for N=10. If I put a console.log() callback in the db.get() I can print the squares of 1..10.
db = require('riak-js').getClient({debug: false})
N = 10
for i in [1..N]
  db.save('Square', String(i), String(i*i))
for i in [1..N]
  db.get('Square', String(i))
My problem is that when I put N=1000 it takes about 10 seconds for my script to complete. Is this normal? I was expecting something well under 1 sec. I have a single riak node on my local machine, an Acer Aspire 5740, i3 CPU and 4GB RAM, with Ubuntu 10.04. For a RAM-only store, I have set storage_backend in $RIAK/rel/riak/etc/app.config to riak_kv_ets_backend. The riak-admin status command confirms this setting.
Q1: Perhaps riak-js is setting some default disk-based backend for my bucket? How do I find out/override this?
Q2: I don't think it's a node.js issue, but am I doing something wrong in asynchronous usage?

A1: riak-js does not apply any hidden backend settings; configuring the storage backend is entirely up to your Riak nodes.
A2: Your script looks fine; you're not doing anything wrong.
The truth is I haven't started benchmarking or seriously considering performance issues.
That said, every request is queued internally and issued serially. It makes the API simpler and you don't run into race conditions, but it has its limitations. Ideally I want to build a wrapper around riak-js that will take care of:
Holding several instances to make requests in parallel
Automatically reconnecting to other nodes in the cluster when one goes down
Your example runs in ~5sec on my MBP (using Bitcask).
=> time coffee test.coffee
real 0m5.181s
user 0m1.245s
sys 0m0.369s
Just as a proof of concept, take a look at this:
dbs = [require('riak-js').getClient({debug: false}), require('riak-js').getClient({debug: false})]
N = 1000
for i in [1..N]
  db = dbs[i % 2]
  db.save('sq', String(i), String(i*i))
for i in [1..N]
  db = dbs[i % 2]
  db.get('sq', String(i))
Results:
=> time coffee test.coffee
real 0m3.341s
user 0m1.133s
sys 0m0.319s
This will improve by using more clients hitting the DB.
Otherwise, the answer is the Protocol Buffers interface, no doubt about it. I couldn't get it running with your example, so I'll have to dig into it, but it should be lightning fast.
Make sure you're running the latest Riak (there have been many performance improvements). Also take into account a little overhead for CoffeeScript compilation.

Here is my test file:
db = require('../lib').getClient({debug: false})
N = if process.argv[2] then process.argv[2] else 10
for i in [1..N]
  db.save('Square', String(i), String(i*i))
for i in [1..N]
  db.get('Square', String(i))
After compiling, I get the following times:
$ time node test1.js 1000
real 0m3.759s
user 0m0.823s
sys 0m0.421s
After running many iterations, my times were similar at that volume regardless of backend (I tested ets and dets). The OS caches your disk blocks on the first run at a particular volume, so subsequent runs are faster.
Following up on frank06's answer, I would also look into connection handling. This is not so much an issue with Riak as it is an issue with how riak-js sets up its connections. Also note that in Riak all nodes are equal, so if you had a three-node cluster you would create connections to all three nodes and round-robin them in some fashion; a rough sketch of that idea follows below. The Protocol Buffers API is the way to go, but it requires some extra care to set up.
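As an illustration only (this is not riak-js's actual pooling API, and the hostnames, ports, and option names are placeholders I've assumed), a minimal round-robin wrapper over several clients could look something like this in plain JavaScript:

var riak = require('riak-js');

// One client per Riak node; host/port values here are placeholders.
var clients = [
  riak.getClient({host: 'riak1.example.com', port: 8098, debug: false}),
  riak.getClient({host: 'riak2.example.com', port: 8098, debug: false}),
  riak.getClient({host: 'riak3.example.com', port: 8098, debug: false})
];

var next = 0;
// Hand back the next client in the list on every call (simple round robin).
function db() {
  var client = clients[next];
  next = (next + 1) % clients.length;
  return client;
}

// Usage: db().save('Square', '4', '16');
// Usage: db().get('Square', '4');

Reconnecting to surviving nodes when one goes down would still have to be layered on top, but spreading requests this way is the same idea as the two-client proof of concept above.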

Related

face-api.js - Why is the browser's faceapi.detectAllFaces() faster than the server's?

I want to use face detection on my server-side. Therefore, I found face-api.js for this task.
I discovered that each call to faceapi.detectAllFaces() takes ~10 seconds.
But when I start the browser example, only the first call takes 10 seconds and all subsequent calls take less than one second.
My server-side code (you can see similar code in ageAndGenderRecognition.ts):
import * as faceapi from 'face-api.js';
import { canvas, faceDetectionNet, faceDetectionOptions, saveFile } from './commons';
await faceDetectionNet.loadFromDisk('../../weights')
await faceapi.nets.faceLandmark68Net.loadFromDisk('../../weights')
await faceapi.nets.ageGenderNet.loadFromDisk('../../weights')
const img = await canvas.loadImage('../images/bbt1.jpg')
console.time();
const results = await faceapi.detectAllFaces(img, faceDetectionOptions);
// ~10 seconds.
console.timeEnd();
console.time();
const results2 = await faceapi.detectAllFaces(img, faceDetectionOptions);
// ~10 seconds again.
console.timeEnd();
Why is faceapi.detectAllFaces() (apart from the first call) faster in the browser example than in ageAndGenderRecognition.ts? And what can I do so that my faceapi.detectAllFaces() calls reach the same speed?
There might be some reasons why your nodejs sample code runs for 10s:
You are not importing @tensorflow/tfjs-node at all; in this case tfjs does not use the native TensorFlow CPU backend and operations take much longer on the CPU.
You are importing @tensorflow/tfjs-node, but there is a version mismatch between the tfjs-core version that face-api.js depends on and the version of @tensorflow/tfjs-node you have installed via npm. In this case tfjs will display a warning message.
Everything is set up correctly, but your CPU is just tremendously slow. In this case you can either try @tensorflow/tfjs-node-gpu (if you have a CUDA-compatible NVIDIA GPU) or change faceDetectionOptions to new faceapi.TinyFaceDetectorOptions(), which runs the TinyFaceDetector instead of the default SSD MobileNet v1 model and is much faster (see the sketch below).
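As a hedged sketch of that last option (the weights path and the ./commons helper are taken from the question and assumed to exist; it also assumes an ES module with top-level await, as in the question's code):

// Sketch only: native CPU backend plus the faster TinyFaceDetector.
import '@tensorflow/tfjs-node';
import * as faceapi from 'face-api.js';
import { canvas } from './commons';

// Smaller input size runs faster at some cost in accuracy.
const faceDetectionOptions = new faceapi.TinyFaceDetectorOptions({
  inputSize: 416,
  scoreThreshold: 0.5,
});

await faceapi.nets.tinyFaceDetector.loadFromDisk('../../weights');
const img = await canvas.loadImage('../images/bbt1.jpg');
const results = await faceapi.detectAllFaces(img, faceDetectionOptions);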
The reason the first call in the browser takes that long is not the actual prediction time. With the WebGL backend of tfjs, all the shader programs are compiled on the first (warm-up) run, which is what takes so long; afterwards they are cached. Prediction in the browser then takes only a few milliseconds because the WebGL backend is GPU-accelerated. The 10s warm-up time in the browser and the prediction time you are seeing in Node.js are not related at all.
Tensorflow.js will generally perform better when using a GPU (instead of a CPU).
So one thing that can explain the performance difference is that on the browser side, TensorFlow will run on the GPU (via WebGL), whereas on Node it will run on the CPU (unless you are using @tensorflow/tfjs-node-gpu).
It seems that by default, the face-api.js library uses @tensorflow/tfjs-node (https://github.com/justadudewhohacks/face-api.js#face-apijs-for-nodejs).
So maybe you can try to replace the import with @tensorflow/tfjs-node-gpu.
In order to use the GPU on Node, check the tfjs-node GitHub repository: https://github.com/tensorflow/tfjs-node
As @justadudewhohacks mentioned, my Node.js code was running very slowly, and what solved it for me was simply adding this line to the top of the file:
import '@tensorflow/tfjs-node';
It is faster by a huge margin. If you have an NVIDIA graphics card you could also try importing
import '@tensorflow/tfjs-node-gpu';

Why is my meteor app taking so long to load?

Can you give some tips of what and where to check in order to improve my Meteor app loading performance? Currently my app is taking almost 15 seconds to load completely, which is insane.
I see the major loading time is in scripting (yellow) and the XHR bar is taking almost 6 seconds.
When I click on the scripts I can't get to my script names in order to review loops and indexes. All I see are function calls with function names not related to my own code, probably related to packages. Function names are like:
s.xhr.onreadystatechange # afaec39….js?meteor_js_resource=true:60
I have checked my subs/pubs and all of them are available on client within the first second.
@Ruben, let us know what you learned and whether you were able to reduce the Meteor load time.
You are absolutely right with your assumption about the Meteor build - it performs JavaScript code minification.
You should try running the performance recorder on localhost; in the best case it should give you a more detailed insight into how things really are.

Is console.log atomic?

The print statement in Python is not thread-safe. Is it safe to use console.log in Node.js concurrently?
If so, then is it also interleave-safe? That is, if multiple (even hundreds) of callbacks write to the console, can I be sure that the output won't be clobbered or interleaved?
Looking at the source code, it seems that Node.js queues concurrent attempts to write to a stream (here). On the other hand, console.log's substitution flags come from printf(3). If console.log wraps around printf, then that can interleave output on POSIX machines (as shown here).
In your response, please show me where the async ._write(chunk, encoding, cb) is implemented inside Node.js.
EDIT: If it is fine to write to a stream concurrently, then why does this npm package exist?
Everything in node.js is basically "atomic". That's because node.js is single-threaded: no JavaScript code can ever be interrupted while it is running.
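A tiny illustration of that point (my own example, not from the answer):

// Synchronous code runs to completion before any other callback gets a turn,
// so nothing can ever print between these two lines.
setImmediate(function () { console.log('from another callback'); });
console.log('first');
console.log('second'); // always printed immediately after 'first'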
The event loop of Node.js is single-threaded, but the async calls are handled by multiple threads: Node.js uses libuv under the hood, and libuv is a library that uses a thread pool.
link:
https://medium.com/the-node-js-collection/what-you-should-know-to-really-understand-the-node-js-event-loop-and-its-metrics-c4907b19da4c
Based on what I see on my Node.js console it is NOT "interleave-safe".
I can see my console output is sometimes "clobbered or interleaved". Not always; when I run my program it is maybe every fifth run that I see interleaved output from multiple log statements.
This may of course depend on your Node.js version and the OS you are running it on. For the record, my Node.js version is v12.13.0 and my OS is Windows 10.0.19042.
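For anyone who wants to reproduce this, a small stress test along these lines (my own sketch, not from the answer) fires many writes from separate callbacks so you can look for mangled lines:

// Each callback writes one long line; inspect the output (or pipe it to a
// file) and check whether any line comes out interleaved with another.
const N = 1000;
for (let i = 0; i < N; i++) {
  setImmediate(() => {
    console.log(`line ${i}: ${'x'.repeat(200)}`);
  });
}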

Is console.time() safe in node.js?

I have a little snippet of node.js code in front of me that looks like this:
console.time("queryTime");
doAsyncIOBoundThing(function(err, results) {
    console.timeEnd("queryTime");
    // Process the results...
});
And of course when I run this on my (otherwise idle) development system, I get a nice console message like this:
queryTime: 564ms
However, if I put this into production, won't there likely be several async calls in progress simultaneously, and each of them will overwrite the previous timer? Or does node have some sort of magical execution context that gives each "thread of execution" a separate console timer namespace?
Just use unique labels and it will be safe. That's why you use a label, to uniquely identify the start time.
As long as you don't accidentally use a label twice, everything will work exactly as intended. Also note that Node usually has only one thread of execution.
Wouldn't this simple code work?
var labelWithTime = "label " + Date.now();
console.time(labelWithTime);
// Do something
console.timeEnd(labelWithTime);
Also consider newer Node.js features, as the platform has evolved. Please look into:
process.hrtime() and Node.js's other performance API hooks:
https://nodejs.org/api/perf_hooks.html#perf_hooks_performance_timing_api
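For example, a minimal sketch using the perf_hooks Performance API instead of console.time() (doAsyncIOBoundThing is the function from the question above):

const { performance } = require('perf_hooks');

const start = performance.now();
doAsyncIOBoundThing((err, results) => {
  // Each call site captures its own `start` in a closure, so concurrent
  // operations cannot clobber each other's timers.
  console.log(`queryTime: ${(performance.now() - start).toFixed(1)}ms`);
  // Process the results...
});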

How can I manage MSI session state within Javascript Custom Actions?

I have an ISAPI DLL, an add-on to IIS. I build its installer using WiX 3.0.
In the installer project, I have a number of custom actions implemented in Javascript. One of them, run at the initiation of the install, stops any IIS websites that are running. Another starts the IIS websites at the end of the install.
This stuff works: the CAs get invoked at the right times and under the right conditions, but the logic is naive. It stops all websites at the beginning (even if they are already stopped) and starts all websites at the end (even if they were previously stopped). This is obviously wrong.
What I'd like to do is keep track in the session of which websites required a stop at the beginning, and then, at the end, only try to restart those websites. Getting the state of a website is easy using the ServerState property on the CIM object. The question I have is, how should I store this information in the MSI session?
It's easy to stuff a single piece of information into a session Property, but what's the best way to store a set of N pieces of information, one for each website? In some cases there can be 1 website, in some cases, 51 websites.
I suppose I could use each distinct website name to create a distinct property name; I'm just not sure that is the best or most efficient way to do it. Also, is it legal to use slashes in the name of an MSI session property? (The website names will have slashes in them.)
Suggestions?
You might want to check out:
VBScript (and Jscript) MSI CustomActions suck
C++ or C# is a much better choice. If your application already has dependencies on the .NET Framework, then adding that dependency to your installer is a logical choice. WiX has Deployment Tools Foundation (DTF), which has a custom action pattern that feels a lot like JScript. You could then create a dictionary of websites and their run state and serialize it out to a single property. On the back side you could reconstitute that collection and act upon it.
Not to mention that the debugging story is MUCH better in DTF.
There's a simple solution. I was having a brain cramp.
All of the items I needed to store were strings: the names of websites that had been stopped during the installation. I just used the JavaScript Array join method to create a single string, and then stuffed that into the session property. Like this:
Session.Property("CA_STOPPEDSITES") = sitesThatWereStopped.join(",");
Then to retrieve that information later in another custom action, I do
var stoppedSites = Session.Property("CA_STOPPEDSITES");
if (stoppedSites != null) {
    var sitesToStart = stoppedSites.split(",");
    ....
Simple, easy.
