node hrTime as increment id - javascript

Node has process.hrtime as per:
process.hrtime()
process.hrtime.bigint()
Suppose there is an append-only event stream that should be persisted (either in a database or a file).
How good or bad an idea is it to use Node's hrtime as an incremental ID for ordering purposes?
Since Node runs a single-threaded event loop, is it safe to assume that there won't be collisions or out-of-order timestamps within a single process?
What about using it across multiple hosts that generate events for the same stream? Assuming all clocks are NTP synchronized, what is the probability of getting out-of-order events from different hosts?
What is the collision probability across multiple hosts, assuming all clocks are NTP synchronized?
Since this depends heavily on the system's throughput, at what point will problems arise? 100, 1,000, 10,000, 100,000 events/sec?
Is there any open source project that uses hrtime for this purpose that I could refer to?
For instance, the system might not be running under very high throughput, but one action may cause multiple events. With high-performance CPUs, it is very likely that calling new Date().getTime() several times while processing that action will produce identical millisecond-resolution timestamps.

Basically, on Linux, node's hrtime drills down to:
node node_process_methods.cc
libuv posix-hrtime.c
man CLOCK_GETRES (2)
libuv uses clock_gettime(CLOCK_MONOTONIC, ...) to provide hrtime. Many answers on SO can be found by looking for CLOCK_MONOTONIC.
As per manual:
CLOCK_MONOTONIC
Clock that cannot be set and represents monotonic time since — as described by POSIX—"some unspecified point in the past". On Linux, that point corresponds to the number of seconds that the system has been running since it was booted.
The CLOCK_MONOTONIC clock is not affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the clock), but is affected by the incremental adjustments performed by adjtime(3) and NTP. This clock does not count time that the system is suspended.
As a result, hrtime is not affected by discontinuous clock jumps (NTP can still adjust its rate incrementally). For the same reason it cannot be used as absolute time: hrtime is only meaningful for measuring the time elapsed between two calls.
However, assuming that Node is effectively single threaded and that CLOCK_MONOTONIC never goes backwards, subsequent calls to hrtime should return results that never go backwards, regardless of throughput.
Because Node is effectively single threaded, the following simple function could work around identical millisecond-resolution times (as an illustration):
let myLastTime = 0;
function getMyTime() {
  const newTime = new Date().getTime();
  if (newTime > myLastTime) {
    myLastTime = newTime;
    return myLastTime;
  }
  // Same millisecond as the previous call, or the clock went backwards
  // (e.g. an NTP jump): keep the returned value strictly increasing.
  myLastTime = myLastTime + 1;
  return myLastTime;
}
The solution illustrated above won't do well under high throughput. There are only 1,000 milliseconds in a second, so a sustained throughput of 1,000 or more events per second will most likely push myLastTime ahead of the real clock, causing more problems than it solves. It might be good enough for throughput well under 1,000 events per second with a single writer (i.e. not a distributed multi-host scenario).
In a distributed multi-host environment, time is a bad reference for ordering in any case, because each host's clock may behave differently and NTP adjustments will have an impact as well.
In conclusion, to guarantee persisted order, the application should either rely on underlying storage that guarantees it or manage the order itself via a logical sequence number or similar.
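As a rough illustration of the logical sequence number idea, here is a minimal sketch for a single writer. The names createSequence and nextId are made up for this example, and it assumes the starting value is restored from the last persisted ID when the process restarts.

```javascript
// Minimal sketch of a per-process logical sequence (illustrative names).
// Assumption: `start` is loaded from the last persisted ID on startup.
function createSequence(start = 0n) {
  let last = start;
  return function next() {
    last += 1n; // BigInt, so the counter cannot lose precision
    return last;
  };
}

const nextId = createSequence();

// One action producing several events still gets strictly increasing IDs,
// even if all of them are created within the same millisecond.
const events = ["created", "updated", "deleted"].map((type) => ({
  id: nextId(),
  type,
  recordedAt: Date.now(), // wall-clock time kept only as information
}));
```

The timestamp is stored next to the ID but plays no role in ordering, so clock jumps cannot reorder events.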

Related

Would calling Performance API frequently be causing a performance issue?

I want to measure the memory usage of my web SPA using performance.memory, and the purpose is to detect if there is any problem i.e. memory leak during the webapp's lifetime.
For this reason I would need to call this API for specific time interval - it could be every 3 second, every 30 second, or every 1 minute, ... Then I have a question - to detect any issue quickly and effectively I would have to make the interval as short as I could, but then I come up with the concern about performance. The measuring itself could affect the performance of the webapp if the measuring is such a expensive task (hopefully I don't think that is the case though)
With this background above, I have the following questions:
Is performance.memory such a method which would affect browser's main thread's performance so that I should care about the frequency of usage?
Would there be a right way or procedure to determine whether a (Javascript) task is affecting the performance of a device? If question 1 is uncertain, then I would have to try other way to find out the proper interval for calling of memory measurement.
(V8 developer here.)
Calling performance.memory is pretty fast. You can easily verify that in a quick test yourself: just call it a thousand times in a loop and measure how long that takes.
[EDIT: Thanks to @Kaiido for highlighting that this kind of microbenchmark can in general be very misleading; for example the first operation could be much more expensive; or the benchmark scenario could be so different from the real application's scenario that the results don't carry over. Do keep in mind that writing useful microbenchmarks always requires some understanding/inspection of what's happening under the hood!
In this particular case, knowing a bit about how performance.memory works internally, the results of such a simple test are broadly accurate; however, as I explain below, they also don't matter.
--End of edit]
However, that observation is not enough to solve your problem. The reason why performance.memory is fast is also the reason why calling it frequently is pointless: it just returns a cached value, it doesn't actually do any work to measure memory consumption. (If it did, then calling it would be super slow.) Here is a quick test to demonstrate both of these points:
function f() {
  if (!performance.memory) {
    console.error("unsupported browser");
    return;
  }
  let objects = [];
  for (let i = 0; i < 100; i++) {
    // We'd expect heap usage to increase by ~1MB per iteration.
    objects.push(new Array(256000));
    let before = performance.now();
    let memory = performance.memory.usedJSHeapSize;
    let after = performance.now();
    console.log(`Took ${after - before} ms, result: ${memory}`);
  }
}
f();
(You can also see that browsers clamp timer granularity for security reasons: it's not a coincidence that the reported time is either 0ms or 0.1ms, never anything in between.)
Secondly, however, the cached value is not as much of a problem as it may seem at first, because the premise "to detect any issue quickly and effectively I would have to make the interval as short as I could" is misguided: in garbage-collected languages, it is totally normal that memory usage goes up and down, possibly by hundreds of megabytes. That's because finding objects that can be freed is an expensive exercise, so garbage collectors are carefully tuned for a good compromise: they should free up memory as quickly as possible without wasting CPU cycles on useless busywork. As part of that balance they adapt to the given workload, so there are no general numbers to quote here.
Checking memory consumption of your app in the wild is a fine idea, you're not the first to do it, and performance.memory is the best tool for it (for now). Just keep in mind that what you're looking for is a long-term upwards trend, not short-term fluctuations. So measuring every 10 minutes or so is totally sufficient, and you'll still need lots of data points to see statistically-useful results, because any single measurement could have happened right before or right after a garbage collection cycle.
For example, if you determine that all of your users have higher memory consumption after 10 seconds than after 5 seconds, then that's just working as intended, and there's nothing to be done. Whereas if you notice that after 10 minutes, readings are in the 100-300 MB range, and after 20 minutes in the 200-400 MB range, and after an hour they're 500-1000 MB, then it's time to go looking for that leak.
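One possible way to look for such a long-term trend is to fit a straight line through the collected readings and inspect its slope. The sketch below is only an illustration: the least-squares fit, the heapTrend name and the sample format are assumptions made for this example, not something performance.memory provides.

```javascript
// Least-squares slope of heap size (MB) over time (minutes).
// `samples` is an array of { minutes, megabytes } readings collected
// elsewhere, e.g. from performance.memory.usedJSHeapSize in Chrome.
function heapTrend(samples) {
  const n = samples.length;
  if (n < 2) return 0; // not enough data to talk about a trend
  const meanX = samples.reduce((sum, p) => sum + p.minutes, 0) / n;
  const meanY = samples.reduce((sum, p) => sum + p.megabytes, 0) / n;
  let numerator = 0;
  let denominator = 0;
  for (const p of samples) {
    numerator += (p.minutes - meanX) * (p.megabytes - meanY);
    denominator += (p.minutes - meanX) ** 2;
  }
  return denominator === 0 ? 0 : numerator / denominator; // MB per minute
}

// Made-up readings taken every 10 minutes.
const readings = [
  { minutes: 0, megabytes: 100 },
  { minutes: 10, megabytes: 200 },
  { minutes: 20, megabytes: 300 },
  { minutes: 30, megabytes: 400 },
];
const slope = heapTrend(readings);
console.log(`Heap grows by about ${slope} MB per minute`);
```

A slope that stays clearly positive across many sessions is a hint to go looking for a leak; a single session proves little, because any reading may fall right before or right after a garbage collection.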

Accuracy of JavaScript time over a period of a few hours

I need to code myself a mini, locally running HTML5 + JavaScript app, which I will use as a timer to time a person performing squats.
The idea is simple: When I press A on the keyboard, it will store the current time with seconds and milliseconds into a local table as a repetition start. When I press B, it will store the current time as a repetition end.
What I'm not 100% sure about is how reliable the JavaScript timestamp really is. What is my best bet here? Here are a few ideas:
run it on the latest version of Chrome
disable the internet connection, so that the OS will not sync/change its current time
Is there anything else I should be careful about?
I don't need the time to be absolutely exact, only relatively; meaning that the last timestamp minus the first timestamp will yield the real time taken to perform the whole session. I don't care to know exactly at what time it started.
If you're retrieving the system time in Javascript with something like Date.now() in order to measure the time between two events, then that will be exactly as accurate as the system time is on the local computer. How exactly accurate that is will depend entirely upon the clock in the local system and whether there are any changes to the system time during the measurement period.
If there are no changes to the system time (such as a clock sync with an external source), then most system clocks are pretty darn accurate these days. Measuring an event that takes minutes would likely be accurate within a few milliseconds which is more accuracy than you can achieve by marking start and stop with just a keypress anyway since the precision on exactly when the key is pressed relative to the start and stop of the event is certainly not better than several hundred milliseconds.

Optimized Bulk (Chunk) Upload Of Objects Into IndexedDB

I want to add objects into some table in IndexedDB in one transaction:
_that.bulkSet = function(data, key) {
  var transaction = _db.transaction([_tblName], "readwrite"),
      store = transaction.objectStore(_tblName),
      ii = 0;
  _bulkKWVals.push(data);
  _bulkKWKeys.push(key);
  if (_bulkKWVals.length == 3000) {
    insertNext();
  }
  function insertNext() {
    if (ii < _bulkKWVals.length) {
      store.add(_bulkKWVals[ii], _bulkKWKeys[ii]).onsuccess = insertNext;
      ++ii;
    } else {
      console.log(_bulkKWVals.length);
    }
  }
};
Looks like that it works fine, but it is not very optimized way of doing that especially if the number of objects is very high (~50.000-500.000). How could I possibly optimize it? Ideally I want to add first 3000, then remove it from the array, then add another 3000, namely in chunks. Any ideas?
When inserting that many rows consecutively, it is not possible to get good performance.
I'm an IndexedDB dev and have real-world experience with IndexedDB at the scale you're talking about (writing hundreds of thousands of rows consecutively). It ain't too pretty.
In my opinion, IDB is not suitable for use when a large amount of data has to be written consecutively. If I were to architect an IndexedDB app that needed lots of data, I would figure out a way to seed it slowly over time.
The issue is writes, and the problem as I see it is that the slowness of writes, combined with their I/O-intensive nature, makes things get worse over time. (Reads are always lightning fast in IDB, for what it's worth.)
To start, you'll get savings from re-using transactions. Because of that your first instinct might be to try to cram everything into the same transaction. But what I've found in Chrome, for example, is that the browser doesn't seem to like long-running writes, perhaps because of some mechanism meant to throttle misbehaving tabs.
I'm not sure what kind of performance you're seeing, but average numbers might fool you depending on the size of your test. The limiting factor is throughput, but if you're trying to insert large amounts of data consecutively, pay attention to writes over time specifically.
I happen to be working on a demo with several hundred thousand rows at my disposal, and have stats. With my visualization disabled, running pure dash on IDB, here's what I see right now in Chrome 32 on a single object store with a single non-unique index with an auto-incrementing primary key.
With a much, much smaller 27k-row dataset, I saw 60-70 entries/second:
* ~30 seconds: 921 entries/second on average (there's always a great burst of inserts at the start), 62/second at the moment I sampled
* ~60 seconds: 389/second average (sustained decreases starting to outweigh the effect of the initial burst), 71/second at moment
* ~1:30: 258/second, 67/second at moment
* ~2:00 (~1/3 done): 188/second on average, 66/second at moment
Some examples with a much smaller dataset show far better performance, but similar characteristics. Ditto much larger datasets - the effects are greatly exaggerated and I've seen as little as <1 entries per second when leaving for multiple hours.
IndexedDB is actually designed to optimize for bulk operations. The problem is that the spec and certain docs do not advertise the way it works. If you pay attention to the parts of the IndexedDB specification that define how the mutating operations in IDBObjectStore work (add(), put(), delete()), you'll find that it allows callers to call them synchronously and omit listening to the success events of all but the last one. By doing so (while still listening to onerror), you get enormous performance gains.
This example using Dexie.js shows the possible bulk speed as it inserts 10,000 rows in 680 ms on my macbook pro (using Opera/Chromium).
Accomplished by the Table.bulkPut() method in the Dexie.js library:
db.objects.bulkPut(arrayOfObjects)
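For readers who want to try the same idea without a library, here is a minimal sketch against the plain IndexedDB API. It is not how Dexie.js is implemented; bulkAdd and its parameters are names invented for this example, and db is assumed to be an already opened IDBDatabase. The sketch waits for the transaction to finish instead of the last request's success event, and relies on the default behaviour that a failed request aborts the transaction.

```javascript
// Queue every add() in one transaction and wait only for the transaction
// itself, instead of chaining on each request's onsuccess.
// Assumptions: `db` is an open IDBDatabase, `values[i]` goes with `keys[i]`,
// and `done(error, count)` is a Node-style callback chosen for this sketch.
function bulkAdd(db, storeName, values, keys, done) {
  const tx = db.transaction([storeName], "readwrite");
  const store = tx.objectStore(storeName);

  // Fires once, after all queued requests have been written.
  tx.oncomplete = () => done(null, values.length);
  // A failed request aborts the transaction by default, so this covers errors.
  tx.onabort = () => done(tx.error || new Error("transaction aborted"));

  for (let i = 0; i < values.length; i++) {
    // No onsuccess handler: requests are queued and executed in order.
    store.add(values[i], keys[i]);
  }
}
```

A call could look like bulkAdd(db, "objects", arrayOfObjects, arrayOfKeys, (err, n) => console.log(err || n)), where "objects" is a placeholder store name.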

Audio sync, call function every 1 / 44.1 millisecond

In JavaScript, is it possible to call a function playing 10 different wav sounds at 44.1 kHz and call that same function again in (1/44100)*(128/60)*16 seconds with a 1/44.1 millisecond precision preferably with chrome/safari and in that case how?
I'm looking at making a music loop machine playing a few simultaneous loops. The precision is needed otherwise there will be unwanted hearable issues with the sounds (phasing).
Robert,
It's possible to measure time with high accuracy - via performance.now() - but you cannot get a callback with that kind of precision. In fact, in light of layout passes and JavaScript execution in the main thread, and the ever-looming threat of garbage collection happening in the main thread, you can't get anywhere NEAR even millisecond precision; you generally ought to be planning on potential interruptions in the tens of milliseconds for robustness.
The answer to this is to use scheduling, particularly in the Web Audio API - I see that you saw the article I wrote about this a year ago on HTML5Rocks (http://www.html5rocks.com/en/tutorials/audio/scheduling/), but you missed the significant piece - you shouldn't be calling
audioSource2.noteOn(0, 0.1190, 1.875);
you need the time offset to schedule it ahead appropriately:
audioSource2.noteOn(time, 0.1190, 1.875);
If you look at my original code, that's how I'm scheduling the oscillator ahead of time. The scheduler runs in a "slow" callback loop - being called only every 100ms or so - but schedules ahead a few beats. If you truly need to mute notes that may already be scheduled in the next 1/10th of a second, then you can keep a node in the middle to disconnect().
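The look-ahead idea can be sketched roughly as follows. This is not the code from the article: scheduleAhead, state and the numbers used are assumptions made for illustration, and the function is kept free of browser objects so that the logic can be read on its own.

```javascript
// Schedule every event that falls inside the look-ahead window.
// `state.nextTime` is the time (in seconds) of the next unscheduled event,
// `now` would be audioContext.currentTime in a browser, and `scheduleAt`
// is whatever actually queues the sound for a given time.
function scheduleAhead(state, now, lookahead, interval, scheduleAt) {
  while (state.nextTime < now + lookahead) {
    scheduleAt(state.nextTime);
    state.nextTime += interval;
  }
}

// Dry run with plain numbers: one event every 0.5 s, 1 s of look-ahead.
const state = { nextTime: 0 };
const scheduled = [];
scheduleAhead(state, 0, 1, 0.5, (t) => scheduled.push(t));   // queues 0 and 0.5
scheduleAhead(state, 0.6, 1, 0.5, (t) => scheduled.push(t)); // queues 1 and 1.5

// In a page this would be driven by a slow timer, for example:
//   setInterval(() => scheduleAhead(state, ctx.currentTime, 0.2, 0.5, (t) => {
//     const source = ctx.createBufferSource();
//     source.buffer = loopBuffer;      // hypothetical decoded AudioBuffer
//     source.connect(ctx.destination);
//     source.start(t);                 // start() is the current name of noteOn()
//   }), 100);
```

The timer itself can be late by tens of milliseconds without any audible effect, because the audio clock, not the callback, decides when each sound starts.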
I would take a look at either the DOM high-resolution timestamp, which can be accessed with window.performance.now(), or requestAnimationFrame, which is available as window.requestAnimationFrame.
You can use this library which I have written : https://github.com/sebpiq/WAAClock
It lets you schedule things precisely and easily and also provides useful functionalities such as : cancel event, change tempo, ... everything necessary for a loop machine. Under the hood, it implements the tricks explained in this article (already linked by other people) : http://www.html5rocks.com/en/tutorials/audio/scheduling/
If by loop machine you mean continuously looping a few samples (and not a drum machine, where you just play a sample at a point in time), you might also want to look into this : https://github.com/sebpiq/WAATableNode

Microsecond timing in JavaScript

Are there any timing functions in JavaScript with microsecond resolution?
I am aware of timer.js for Chrome, and am hoping there will be a solution for other friendly browsers, like Firefox, Safari, Opera, Epiphany, Konqueror, etc. I'm not interested in supporting any IE, but answers including IE are welcome.
(Given the poor accuracy of millisecond timing in JS, I'm not holding my breath on this one!)
Update: timer.js advertises microsecond resolution, but it simply multiplies the millisecond reading by 1,000. Verified by testing and code inspection. Disappointed. :[
As alluded to in Mark Rejhon's answer, there is an API available in modern browsers that exposes sub-millisecond resolution timing data to script: the W3C High Resolution Timer, aka window.performance.now().
now() is better than the traditional Date.getTime() in two important ways:
now() is a double with sub-millisecond resolution that represents the number of milliseconds since the start of the page's navigation. The microseconds are in the fractional part (e.g. a value of 1000.123 is 1 second and 123 microseconds).
now() is monotonically increasing. This is important as Date.getTime() can possibly jump forward or even backward on subsequent calls. Notably, if the OS's system time is updated (e.g. atomic clock synchronization), Date.getTime() is also updated. now() is guaranteed to always be monotonically increasing, so it is not affected by the OS's system time -- it will always be wall-clock time (assuming your wall clock is not atomic...).
now() can be used in almost every place that new Date().getTime(), +new Date and Date.now() are. The exception is that Date and now() times don't mix, as Date is based on the Unix epoch (the number of milliseconds since 1970), while now() is the number of milliseconds since your page navigation started (so it will be much smaller than Date).
now() is supported in Chrome stable, Firefox 15+, and IE10. There are also several polyfills available.
Note: When using Web Workers, the window variable isn't available, but you can still use performance.now().
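A minimal sketch of measuring a piece of code with performance.now() is shown below. The measure helper is an illustrative name, and the example assumes an environment where performance is available as a global (current browsers, workers, and recent Node.js versions).

```javascript
// Returns how long `fn` took, in (fractional) milliseconds.
function measure(fn) {
  const start = performance.now();
  fn();
  return performance.now() - start;
}

const elapsedMs = measure(() => {
  let sum = 0;
  for (let i = 0; i < 1e5; i++) sum += i; // some work to time
});
console.log(`Took ${elapsedMs} ms`);

// Readings never go backwards, unlike Date.now() after a clock adjustment.
const first = performance.now();
const second = performance.now();
console.log(second >= first);
```

The reported resolution depends on the browser, since timers may be clamped or coarsened for security reasons.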
There's now a new method of measuring microseconds in javascript:
http://gent.ilcore.com/2012/06/better-timer-for-javascript.html
However, in the past, I found a crude method of getting 0.1 millisecond precision in JavaScript out of a millisecond timer. Impossible? Nope. Keep reading:
I'm doing some high-precision experiments that require self-checked timer accuracies, and found I was able to reliably get 0.1 millisecond precision with certain browsers on certain systems.
I have found that in modern GPU-accelerated web browsers on fast systems (e.g. i7 quad core, where several cores are idle, only browser window) -- I can now trust the timers to be millisecond-accurate. In fact, it's become so accurate on an idle i7 system, I've been able to reliably get the exact same millisecond, over more than 1,000 attempts. Only when I'm trying to do things like load an extra web page, or other, the millisecond accuracy degrades (And I'm able to successfully catch my own degraded accuracy by doing a before-and-after time check, to see if my processing time suddenly lengthened to 1 or more milliseconds -- this helps me invalidate results that has probably been too adversely affected by CPU fluctuations).
It's become so accurate in some GPU accelerated browsers on i7 quad-core systems (when the browser window is the only window), that I've found I wished I could access a 0.1ms precision timer in JavaScript, since the accuracy is finally now there on some high-end browsing systems to make such timer precision worthwhile for certain types of niche applications that requires high-precision, and where the applications are able to self-verify for accuracy deviations.
Obviously if you are doing several passes, you can simply run multiple passes (e.g. 10 passes) then divide by 10 to get 0.1 millisecond precision. That is a common method of getting better precision -- do multiple passes and divide the total time by number of passes.
HOWEVER...If I can only do a single benchmark pass of a specific test due to an unusually unique situation, I found out that I can get 0.1 (And sometimes 0.01ms) precision by doing this:
Initialization/Calibration:
Run a busy loop to wait until timer increments to the next millisecond (align timer to beginning of next millisecond interval) This busy loop lasts less than a millisecond.
Run another busy loop to increment a counter while waiting for timer to increment. The counter tells you how many counter increments occurred in one millisecond. This busy loop lasts one full millisecond.
Repeat the above, until the numbers become ultra-stable (loading time, JIT compiler, etc).
NOTE: The stability of the number gives you your attainable precision on an idle system. You can calculate the variance, if you need to self-check the precision. The variances are bigger on some browsers, and smaller on other browsers. Bigger on faster systems and slower on slower systems. Consistency varies too. You can tell which browsers are more consistent/accurate than others. Slower systems and busy systems will lead to bigger variances between initialization passes. This can give you an opportunity to display a warning message if the browser is not giving you enough precision to allow 0.1ms or 0.01ms measurements. Timer skew can be a problem, but some integer millisecond timers on some systems increment quite accurately (quite right on the dot), which will result in very consistent calibration values that you can trust.
Save the final counter value (or average of the last few calibration passes)
Benchmarking one pass to sub-millisecond precision:
Run a busy loop to wait until timer increments to the next millisecond (align timer to beginning of next millisecond interval). This busy loop lasts less than a millisecond.
Execute the task you want to precisely benchmark the time.
Check the timer. This gives you the integer milliseconds.
Run a final busy loop to increment a counter while waiting for timer to increment. This busy loop lasts less than a millisecond.
Divide this counter value, by the original counter value from initialization.
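Now you have the decimal part of the milliseconds! (The ratio is the fraction of the last millisecond that was still remaining, so subtract it from 1 to get the fraction that had elapsed.)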
WARNING: Busy loops are NOT recommended in web browsers, but fortunately, these busy loops run for less than 1 millisecond each, and are only run a very few times.
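The calibration and benchmarking steps above could be sketched like this. It is an interpretation rather than the original code: the function names are invented, Date.now() stands in for the millisecond timer, and the final ratio is treated as the part of the last millisecond that was still remaining, so it is subtracted from 1.

```javascript
// Spin until Date.now() ticks over; return the new time and the spin count.
function spinUntilTick() {
  const start = Date.now();
  let count = 0;
  let now = start;
  while (now === start) {
    count++;
    now = Date.now();
  }
  return { now, count };
}

// Calibration: align to a tick, then count spins during one full millisecond.
// Repeated so that the JIT can settle; the last few passes are averaged.
function calibrate(passes = 20) {
  const counts = [];
  for (let i = 0; i < passes; i++) {
    spinUntilTick();                    // align to the start of a millisecond
    counts.push(spinUntilTick().count); // spins in one full millisecond
  }
  const tail = counts.slice(-5);
  return tail.reduce((a, b) => a + b, 0) / tail.length;
}

// Time a single pass of `task`, in fractional milliseconds.
function timeOnce(task, spinsPerMs) {
  const alignedStart = spinUntilTick().now; // start right after a tick
  task();
  const wholeMs = Date.now() - alignedStart;  // integer milliseconds
  const remaining = spinUntilTick().count;    // spins left in this millisecond
  const fraction = Math.min(1, Math.max(0, 1 - remaining / spinsPerMs));
  return wholeMs + fraction;
}

const spinsPerMs = calibrate();
const measuredMs = timeOnce(() => JSON.stringify({ some: "work" }), spinsPerMs);
console.log(`About ${measuredMs.toFixed(2)} ms`);
```

How far the extra digit can be trusted depends on how stable the calibration value is, so a real test would also compare the spread between calibration passes before and after the measurement.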
Variables such as JIT compilation and CPU fluctuations add massive inaccuracies, but if you run several initialization passes, you'll have full dynamic recompilation, and eventually the counter settles to something very accurate. Make sure that all busy loops use exactly the same function in all cases, so that differences in busy loops do not lead to differences. Make sure all lines of code are executed several times before you begin to trust the results, to allow JIT compilers to have already stabilized to a full dynamic recompilation (dynarec).
In fact, I witnessed precision approaching microseconds on certain systems, but I wouldn't trust it yet. But the 0.1 millisecond precision appears to work quite reliably, on an idle quad-core system where I'm the only browser page. I came to a scientific test case where I could only do one-off passes (due to unique variables occurring), and needed to precisely time each pass, rather than averaging multiple repeat passes, so that's why I did this.
I did several pre-passes and dummy passes (also to settle the dynarec), to verify reliability of 0.1ms precision (stayed solid for several seconds), then kept my hands off the keyboard/mouse while the benchmark occurred, then did several post-passes to verify reliability of 0.1ms precision (stayed solid again). This also verifies that things such as power state changes, or other stuff, didn't occur between the before-and-after, interfering with results. Repeat the pre-test and post-test between every single benchmark pass. After this, I was virtually certain the results in between were accurate. There is no guarantee, of course, but it goes to show that accurate <0.1ms precision is possible in some cases in a web browser.
This method is only useful in very, very niche cases. Even though it can't be 100% guaranteed, you can gain quite trustworthy accuracy, and even scientific accuracy when combined with several layers of internal and external verifications.
Here is an example showing my high-resolution timer for node.js:
function startTimer() {
  const time = process.hrtime();
  return time;
}
function endTimer(time) {
  function roundTo(decimalPlaces, numberToRound) {
    return +(Math.round(numberToRound + `e+${decimalPlaces}`) + `e-${decimalPlaces}`);
  }
  const diff = process.hrtime(time);
  const NS_PER_SEC = 1e9;
  const result = (diff[0] * NS_PER_SEC + diff[1]); // Result in nanoseconds
  const elapsed = result * 0.0000010; // Nanoseconds to milliseconds
  return roundTo(6, elapsed); // Result in milliseconds
}
Usage:
const start = startTimer();
console.log('test');
console.log(`Time since start: ${endTimer(start)} ms`);
Normally, you might be able to use:
console.time('Time since start');
console.log('test');
console.timeEnd('Time since start');
If you are timing sections of code that involve looping, you cannot easily get at the value measured by console.timeEnd() in order to add your timer results together. You can still use it, but it gets nasty because you have to inject the value of your iterating variable, such as i, and set a condition to detect if the loop is done.
Here is an example because it can be useful:
const num = 10;
console.time(`Time til ${num}`);
for (let i = 0; i < num; i++) {
console.log('test');
if ((i+1) === num) { console.timeEnd(`Time til ${num}`); }
console.log('...additional steps');
}
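An alternative for the looping case, sketched here under the assumption of a Node.js version that provides process.hrtime.bigint(), is to add up the durations yourself. timeLoop is an illustrative name, not part of any API.

```javascript
// Runs `step` `iterations` times and returns the summed duration of the
// timed part of each iteration, in milliseconds.
function timeLoop(iterations, step) {
  let totalNs = 0n;
  for (let i = 0; i < iterations; i++) {
    const start = process.hrtime.bigint(); // nanoseconds as a BigInt
    step(i);                               // only this call is timed
    totalNs += process.hrtime.bigint() - start;
    // ...additional untimed steps could go here
  }
  return Number(totalNs) / 1e6; // nanoseconds to milliseconds
}

let stepsRun = 0;
const totalMs = timeLoop(10, () => { stepsRun++; });
console.log(`Time for ${stepsRun} steps: ${totalMs} ms`);
```

Because the differences are accumulated as BigInt values, nothing is rounded until the final conversion to a number.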
Cite: https://nodejs.org/api/process.html#process_process_hrtime_time
The answer is "no", in general. If you're using JavaScript in some server-side environment (that is, not in a browser), then all bets are off and you can try to do anything you want.
edit — this answer is old; the standards have progressed and newer facilities are available as solutions to the problem of accurate time. Even so, it should be remembered that outside the domain of a true real-time operating system, ordinary non-privileged code has limited control over its access to compute resources. Measuring performance is not the same (necessarily) as predicting performance.
editing again — For a while we had performance.now(), but at present (2022 now) browsers have degraded the accuracy of that API for security reasons.
