Leaderboard ranking with Firebase - javascript

I have a project in which I need to display a leaderboard of the top 20, and if the user is not in the top 20, they should appear in 21st place with their current ranking.
Is there an efficient way to do this?
I am using Cloud Firestore as a database. I believe it was a mistake to choose it instead of MongoDB, but I am in the middle of the project, so I must do it with Cloud Firestore.
The app will be used by 30K users. Is there any way to do it without fetching all 30K users?
this.authProvider.afs.collection('profiles', ref => ref.where('status', '==', 1)
.where('point', '>', 0)
.orderBy('point', 'desc').limit(20))
This is the code I wrote to get the top 20, but what is the best practice for getting the currently logged-in user's rank if they are not in the top 20?

Finding an arbitrary player's rank in a leaderboard, in a manner that scales, is a common hard problem with databases.
There are a few factors that will drive the solution you'll need to pick, such as:
Total number of players
Rate that individual players add scores
Rate that new scores are added (concurrent players * above)
Score range: bounded or unbounded
Score distribution (uniform, or are there 'hot scores')
Simplistic approach
The typical simplistic approach is to count all players with a higher score, eg SELECT count(id) FROM players WHERE score > {playerScore}.
This method works at low scale, but as your player base grows, it quickly becomes both slow and resource expensive (both in MongoDB and Cloud Firestore).
Cloud Firestore doesn't natively support count as it's a non-scalable operation. You'll need to implement it on the client-side by simply counting the returned documents. Alternatively, you could use Cloud Functions for Firebase to do the aggregation on the server-side to avoid the extra bandwidth of returning documents.
Periodic Update
Rather than giving them a live ranking, change it to only updating every so often, such as every hour. For example, if you look at Stack Overflow's rankings, they are only updated daily.
For this approach, you could schedule a function, or schedule App Engine if it takes longer than 540 seconds to run. The function would write out the player list into a ladder collection with a new rank field populated with the player's rank. When a player views the ladder now, you can easily get the top X plus the player's own rank in O(X) time.
Better yet, you could further optimize and explicitly write out the top X as a single document as well, so to retrieve the ladder you only need to read 2 documents, top-X & player, saving on money and making it faster.
This approach would really work for any number of players and any write rate since it's done out of band. You might need to adjust the frequency as you grow, though, depending on your willingness to pay. 30K players each hour would be $0.072 per hour ($1.73 per day) unless you did optimizations (e.g., ignore all 0-score players since you know they are tied last).
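As a rough illustration of what the scheduled job could compute, here is a minimal sketch in plain JavaScript. It assumes the player profiles have already been loaded into an array; `buildLadder`, `topX` and the field names `id` and `score` are hypothetical names chosen for this example, and writing the result back to Firestore is left out.

```javascript
// Hypothetical helper: turns a list of players into ladder entries with a
// `rank` field, plus the entries for a single "top X" document.
function buildLadder(players, topX) {
  // Sort a copy by score, highest first.
  const sorted = [...players].sort((a, b) => b.score - a.score);

  let rank = 0;
  let previousScore = null;
  const ladder = sorted.map((player, index) => {
    // Players with equal scores share a rank.
    if (player.score !== previousScore) {
      rank = index + 1;
      previousScore = player.score;
    }
    return { id: player.id, score: player.score, rank };
  });

  return { ladder, top: ladder.slice(0, topX) };
}

const { ladder, top } = buildLadder(
  [
    { id: 'a', score: 10 },
    { id: 'b', score: 30 },
    { id: 'c', score: 30 },
    { id: 'd', score: 5 },
  ],
  2
);
console.log(top);    // the two highest entries
console.log(ladder); // every player, each with a rank field
```

Each ladder entry would become one document keyed by player, and `top` would go into the single top-X document, so viewing the ladder stays at two reads.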
Inverted Index
In this method, we'll create somewhat of an inverted index. This method works if there is a bounded score range that is significantly smaller than the number of players (e.g., 0-999 scores vs 30K players). It could also work for an unbounded score range where the number of unique scores is still significantly smaller than the number of players.
Using a separate collection called 'scores', you have a document for each individual score (non-existent if no-one has that score) with a field called player_count.
When a player gets a new total score, you'll do 1-2 writes in the scores collection. One write adds 1 to player_count for their new score and, if it isn't their first score, a second write subtracts 1 from player_count for their old score. This approach works for both "Your latest score is your current score" and "Your highest score is your current score" style ladders.
Finding out a player's exact rank is as easy as something like SELECT sum(player_count)+1 FROM scores WHERE score > {playerScore}.
Since Cloud Firestore doesn't support sum(), you'd do the above but sum on the client side. The +1 is because the sum is the number of players above you, so adding 1 gives you that player's rank.
Using this approach, you'll need to read a maximum of 999 documents, averaging 500ish, to get a player's rank, although in practice this will be less if you delete scores that have zero players.
Write rate of new scores is important to understand as you'll only be able to update an individual score once every 2 seconds* on average, which for a perfectly distributed score range from 0-999 would mean 500 new scores/second**. You can increase this by using distributed counters for each score.
* Only 1 new score per 2 seconds since each score generates 2 writes
** Assuming an average game time of 2 minutes, 500 new scores/second could support 60000 concurrent players without distributed counters. If you're using a "Highest score is your current score" ladder, this will be much higher in practice.
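To make the bookkeeping concrete, here is a minimal in-memory sketch of the idea. A `Map` stands in for the scores collection, and `recordScore` and `rankFor` are hypothetical helper names; with Firestore, each `set` would be a document write and the loop in `rankFor` would run over the documents returned by a `score > playerScore` query.

```javascript
// `scores` stands in for the 'scores' collection: score -> player_count.
const scores = new Map();

// Record a player's new total score; oldScore is null for their first score.
function recordScore(scores, newScore, oldScore = null) {
  if (oldScore !== null) {
    const remaining = scores.get(oldScore) - 1;
    if (remaining > 0) scores.set(oldScore, remaining);
    else scores.delete(oldScore); // drop scores that nobody holds any more
  }
  scores.set(newScore, (scores.get(newScore) || 0) + 1);
}

// Rank = (number of players with a strictly higher score) + 1.
function rankFor(scores, playerScore) {
  let playersAbove = 0;
  for (const [score, playerCount] of scores) {
    if (score > playerScore) playersAbove += playerCount;
  }
  return playersAbove + 1;
}

recordScore(scores, 500);
recordScore(scores, 700);
recordScore(scores, 700);
recordScore(scores, 300);
recordScore(scores, 900, 300); // the player with 300 improves to 900
console.log(rankFor(scores, 500)); // 4 (three players are above 500)
```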
Sharded N-ary Tree
This is by far the hardest approach, but could allow you to have both faster and real-time ranking positions for all players. It can be thought of as a read-optimized version of the Inverted Index approach above, whereas the Inverted Index approach above is a write-optimized version of this.
You can follow this related article for 'Fast and Reliable Ranking in Datastore' on a general approach that is applicable. For this approach, you'll want to have a bounded score (it's possible with unbounded, but will require changes from the below).
I wouldn't recommend this approach as you'll need to do distributed counters for the top level nodes for any ladder with semi-frequent updates, which would likely negate the read-time benefits.
Final thoughts
Depending on how often you display the leaderboard for players, you could combine approaches to optimize this a lot more.
Combining 'Inverted Index' with 'Periodic Update' at a shorter time frame can give you O(1) ranking access for all players.
As long as over all players the leaderboard is viewed > 4 times over the duration of the 'Periodic Update' you'll save money and have a faster leaderboard.
Essentially, each period (say every 5-15 minutes), you read all documents from scores in descending order. While doing so, keep a running total of player_count. Re-write each score into a new collection called scores_ranking with a new field, players_above. This new field contains the running total excluding the current score's player_count.
To get a player's rank, all you need to do now is read the document for the player's score from scores_ranking: their rank is players_above + 1.
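The running-total step could look something like the sketch below. It works on plain objects rather than Firestore documents, and `buildScoresRanking` is a hypothetical name for the periodic job.

```javascript
// Input: the 'scores' documents. Output: the 'scores_ranking' documents,
// each carrying a players_above field.
function buildScoresRanking(scoreDocs) {
  const descending = [...scoreDocs].sort((a, b) => b.score - a.score);
  let runningTotal = 0;
  return descending.map((doc) => {
    // players_above excludes the players holding this very score.
    const ranked = { score: doc.score, players_above: runningTotal };
    runningTotal += doc.player_count;
    return ranked;
  });
}

const scoresRanking = buildScoresRanking([
  { score: 500, player_count: 1 },
  { score: 900, player_count: 1 },
  { score: 700, player_count: 2 },
]);
const mine = scoresRanking.find((doc) => doc.score === 500);
console.log(mine.players_above + 1); // 4
```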

One solution not mentioned here which I'm about to implement in my online game and may be usable in your use case, is to estimate the user's rank if they're not in any visible leaderboard because frankly the user isn't going to know (or care?) whether they're ranked 22,882nd or 22,838th.
If 20th place has a score of 250 points and there are 32,000 players total, then each point below 250 is worth on average 127 places, though you may want to use some sort of curve so as they move up a point toward bottom of the visible leaderboard they don't jump exactly 127 places each time - most of the jumps in rank should be closer to zero points.
It's up to you whether you want to identify this estimated ranking as an estimate or not, and you could add a random salt to the number so it looks authentic:
// Real rank: 22,838
// Display to user:
player rank: ~22.8k // rounded
player rank: 22,882nd // rounded with random salt of 44
I'll be doing the latter.
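A minimal sketch of the plain linear version of this estimate is below. It assumes scores start at 0 and that players below the visible board are spread evenly across the remaining score range, which is exactly the simplification the curve mentioned above would refine; `estimateRank` and `formatApproximateRank` are hypothetical names.

```javascript
// Linear estimate: the ranks below the visible board are spread evenly over
// the scores between 0 and the lowest visible score.
function estimateRank(score, lowestVisibleScore, visibleCount, totalPlayers) {
  const placesPerPoint = (totalPlayers - visibleCount) / lowestVisibleScore;
  const pointsBelow = Math.max(0, lowestVisibleScore - score);
  const estimate = visibleCount + Math.round(pointsBelow * placesPerPoint);
  // Clamp between the first hidden place and last place.
  return Math.min(totalPlayers, Math.max(visibleCount + 1, estimate));
}

// Rounded display form, e.g. 22838 -> "~22.8k".
function formatApproximateRank(rank) {
  return rank >= 1000 ? `~${(rank / 1000).toFixed(1)}k` : `~${rank}`;
}

console.log(estimateRank(249, 250, 20, 32000)); // 148
console.log(formatApproximateRank(22838));      // ~22.8k
```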

Alternative perspective: NoSQL and document stores make this type of task overly complex. If you used Postgres, this would be pretty simple using a count function. Firebase is tempting because it's easy to get going with, but use cases like this are where relational databases shine. Supabase (https://supabase.io/) is worth a look: it is similar to Firebase, so you can get going quickly with a backend, but it's open source and built on Postgres, so you get a relational database.

A solution that hasn't been mentioned by Dan is the use of security rules combined with Google Cloud Functions.
Create the highscore's map. Example:
highScores (top20)
Give the users write/read access to highScores.
Give the document/map highScores the smallest score in a property.
Let users write to highScores only if their score > smallest score.
Create a write trigger in Google Cloud Functions that will activate when a new highScore is written. In that function, delete the smallest score.
This looks to me the easiest option. It is realtime as well.
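The combined effect of the rule and the trigger can be sketched as one pure function. This is only a model of the logic, not security-rule or Cloud Functions code; `submitScore` and the entry shape `{ userId, score }` are assumptions made for this example.

```javascript
const MAX_ENTRIES = 20;

// Returns the updated highScores list, or the same list if the score does not qualify.
function submitScore(highScores, entry, maxEntries = MAX_ENTRIES) {
  const isFull = highScores.length >= maxEntries;
  const smallest = isFull
    ? Math.min(...highScores.map((e) => e.score))
    : -Infinity;

  // What the security rule would reject: a score that does not beat the smallest one.
  if (isFull && entry.score <= smallest) return highScores;

  // What the write trigger would do: keep the list sorted and drop the smallest.
  return [...highScores, entry]
    .sort((a, b) => b.score - a.score)
    .slice(0, maxEntries);
}

const board = [
  { userId: 'u1', score: 50 },
  { userId: 'u2', score: 40 },
  { userId: 'u3', score: 30 },
];
console.log(submitScore(board, { userId: 'u4', score: 35 }, 3)); // u3 is dropped
```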

You could do something with cloud storage. So manually have a file that has all the users' scores (in order), and then you just read that file and find the position of the score in that file.
Then to write to the file, you could set up a CRON job to periodically add all documents with a flag isWrittenToFile false, add them all to the file (and mark them as true). That way you won't eat up your writes. And reading a file every time the user wants to view their position is probably not that intensive. It could be done from a cloud function.

2022 Updated and Working Answer
To solve the problem of having a leaderboard with users and points, and to know your position in that leaderboard in a less problematic way, I have this solution:
1) You should create your Firestore document like this
In my case, I have a document perMission that has a field for each user, with the userId as the property and the respective leaderboard points as the value.
This makes it easier to update the values from my JavaScript code.
For example, whenever a user completes a mission (updating their points):
import { doc, setDoc, increment } from "firebase/firestore";
const docRef = doc(db, 'leaderboards', 'perMission');
setDoc(docRef, { [userId]: increment(1) }, { merge: true });
The increment value can be as you want. In my case I run this code every time the user completes a mission, increasing the value.
2) To get the position inside the leaderboards
So here in your client side, to get your position, you have to order the values and then loop through them to get your position inside the leaderboards.
Here you can also use the object to get all the users and its respective points, ordered. But here I am not doing this, I am only interested in my position.
The code is commented explaining each block.
// Values coming from the database.
const leaderboards = {
  userId1: 1,
  userId2: 20,
  userId3: 30,
  userId4: 12,
  userId5: 40,
  userId6: 2
};
// Values coming from your user.
const myUser = "userId4";
const myPoints = leaderboards[myUser];
// Sort the values in descending order.
const sortedLeaderboardsPoints = Object.values(leaderboards).sort(
  (a, b) => b - a
);
// To get your specific position.
// Note: this assumes no two users have the same points; with ties,
// the matching positions would be added together.
const myPosition = sortedLeaderboardsPoints.reduce(
  (previous, current, index) => {
    if (myPoints === current) {
      return index + 1 + previous;
    }
    return previous;
  },
  0
);
// Will print: [40, 30, 20, 12, 2, 1]
console.log(sortedLeaderboardsPoints);
// Will print: 4
console.log(myPosition);
You can now use your position. Even if the array is very big, the logic runs on the client side, so be careful with that. You can also improve the client-side code to reduce the array, limit it, etc.
But be aware that you should do the rest of the work on the client side, not on the Firebase side.
This answer is mainly to show you how to store and use the database in a "good way".


Random IDs in JavaScript

I'm generating random IDs in javascript which serve as unique message identifiers for an analytics suite.
When checking the data (more than 10MM records), there are some minor collisions for some IDs for various reasons (network retries, robots faking data etc), but there is one in particular which has an intriguing number of collisions: akizow-dsrmr3-wicjw1-3jseuy.
The collision rate for the above id is at around 0.0037% while the rate for the other id collisions is under 0.00035% (10 times less) out of a sample of 111MM records from the same day. While the other ids are varying from day to day, this one remains the same, so for a longer period the difference is likely larger than 10x.
This is what the distribution of the top ID collisions looks like.
This is the algorithm used to generate the random IDs:
function generateUUID() {
  return [
    generateUUID4(), generateUUID4(), generateUUID4(), generateUUID4()
  ].join('-');
}
function generateUUID4() {
  return Math.abs(Math.random() * 0xFFFFFFFF | 0).toString(36);
}
I reversed the algorithm and it seems like for akizow-dsrmr3-wicjw1-3jseuy the browser's Math.random() is returning the following four numbers in this order: 0.1488114111471948, 0.19426893796638328, 0.45768366415465334, 0.0499740378116197, but I don't see anything special about them. Also, from the other data I collected it seems to appear especially after a redirect/preload (e.g. google results, ad clicks etc).
So I have 3 hypotheses:
There's a statistical problem with the algorithm that causes this specific collision
Redirects/preloads are somehow messing with the seed of the pseudo-random generator
A robot is smart enough that it fakes all the other data but for some reason is keeping the random id the same. The data comes from different user agents, IPs, countries etc.
Any idea what could cause this collision?
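For reference, reversing one segment of an ID back to the Math.random() value behind it can be sketched as follows. `candidateRandoms` is a hypothetical helper name. Because the generator truncates with `| 0` and then applies Math.abs, each segment maps to two possible inputs, and each result is only the lower end of a very small interval.

```javascript
const SCALE = 0xFFFFFFFF;

// Given one base-36 segment, return the Math.random() values that could have
// produced it (approximately, since the fractional part was truncated).
function candidateRandoms(segment) {
  const value = parseInt(segment, 36);
  return {
    // The product stayed below 2^31, so `| 0` only dropped the fraction.
    direct: value / SCALE,
    // The product was 2^31 or more, `| 0` wrapped it to a negative 32-bit
    // integer, and Math.abs() flipped the sign.
    wrapped: (2 ** 32 - value) / SCALE,
  };
}

for (const segment of 'akizow-dsrmr3-wicjw1-3jseuy'.split('-')) {
  console.log(segment, candidateRandoms(segment));
}
```

The `direct` candidate for akizow comes out at roughly 0.14881141, which agrees with the first number listed above. The `wrapped` candidate shows that a second, unrelated random value would produce the same segment, so the generator has fewer distinct outputs per segment than the 32-bit scale suggests.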

Upsert performance decreases with a growing collection (number of documents)

Use Case:
I'm consuming a REST API which provides battle results of a video game. It is a team vs team online game and each team consists of 3 players, who each pick one of 100 different characters. I want to count the number of wins, losses and draws for each team combination. I get roughly 1000 battle results per second. I concatenate the character ids (ascending) of each team and then I save the wins, losses and draws for each combination.
My current implementation:
const combinationStatsSchema: Schema = new Schema({
  combination: { type: String, required: true, index: true },
  gameType: { type: String, required: true, index: true },
  wins: { type: Number, default: 0 },
  draws: { type: Number, default: 0 },
  losses: { type: Number, default: 0 },
  totalGames: { type: Number, default: 0, index: true },
  battleDate: { type: Date, index: true, required: true }
});
For each returned log I perform an upsert and send these queries in bulk (5-30 rows) to MongoDB:
const filter: any = { combination: log.teamDeck, gameType, battleDate };
if (battleType === BattleType.PvP) {
  filter.arenaId = log.arena.id;
}
const update: {} = { $inc: { draws, losses, wins, totalGames: 1 } };
My problem:
As long as I just have a few thousand entries in my collection combinationStats mongodb takes just 0-2% cpu. Once the collection has a couple million documents (which happens pretty quickly due to the amount of possible combinations) MongoDB constantly takes 50-100% cpu. Apparently my approach is not scalable at all.
My question:
Either of these options could be a solution to my above defined problem:
Can I optimize the performance of my MongoDB solution described above so that it doesn't take that much CPU? (I already indexed the fields I filter on and I perform upserts in bulk). Would it help to create a hash (based on all my filter fields) which I could use for filtering the data then to improve performance?
Is there a better database / technology suited to aggregate such data? I could imagine a couple more use cases where I want/need to increment a counter for a given identifier.
Edit: After khang commented that it might be related to the upsert performance I replaced my $inc with a $set and indeed the performance was equally "poor". Hence I tried the suggested find() and then manually update() approach but the results didn't become any better.
Create a hash on your filter conditions:
I was able to reduce the CPU usage from 80-90% down to 1-5% and saw a higher throughput.
Apparently the filter was the problem. Instead of filtering on these three conditions: { combination: log.teamDeck, gameType, battleDate } I created a 128-bit hash in my Node application. I used this hash for upserting and set the combination, gameType and battleDate as additional fields in my update document.
For creating the hash I used the metrohash library, which can be found here: https://github.com/jandrewrogers/MetroHash . Unfortunately I cannot explain why the performance is so much better, especially since I indexed all my previous conditions.
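A minimal sketch of the same idea is below. It swaps in Node's built-in crypto module with MD5 (also a 128-bit digest) instead of metrohash so that it runs without extra dependencies. `combinationKey` and `buildUpsert` are hypothetical names, and storing the hash in `_id` is an assumption about where the key could live.

```javascript
const crypto = require('crypto');

// Collapse the three filter fields into one 128-bit key (hex encoded).
// MD5 is used only as a lookup key here, not for security.
function combinationKey(combination, gameType, battleDate) {
  return crypto
    .createHash('md5')
    .update([combination, gameType, battleDate.toISOString()].join('|'))
    .digest('hex');
}

// Build one bulkWrite operation that filters on the single key and keeps the
// original fields as plain data on insert.
function buildUpsert(log, gameType, battleDate, { wins, draws, losses }) {
  return {
    updateOne: {
      filter: { _id: combinationKey(log.teamDeck, gameType, battleDate) },
      update: {
        $inc: { wins, draws, losses, totalGames: 1 },
        $setOnInsert: { combination: log.teamDeck, gameType, battleDate },
      },
      upsert: true,
    },
  };
}

const op = buildUpsert(
  { teamDeck: '3-17-42' },
  'ranked',
  new Date('2020-01-01T00:00:00Z'),
  { wins: 1, draws: 0, losses: 0 }
);
console.log(op.updateOne.filter._id.length); // 32 hex characters = 128 bits
```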
In (1.) you assert that you perform upserts in bulk. But based on how this seems to scale, you're probably sending too few rows into each batch. Consider doubling the batch size each time there's a doubling of stored rows. Please do post mongo's explain() query plan for your setup.
In (2.) you consider switching to, say, mysql or postgres. Yes, that would absolutely be a valid experiment. Again, be sure to post EXPLAIN output alongside your timing data.
There's only a million possible team compositions, and there's a distribution over those, with some being much more popular than others. You only need to maintain a million counters, which is not such a large number. However, doing 1e6 disk I/O's can take a while, especially if they are random reads. Consider moving away from a disk resident data structure, to which you might be doing frequent COMMITs, and switching to a memory resident hash or b-tree. It doesn't sound like ACID type of persistence guarantees are important to your application.
Also, once you have assembled "large" input batches, certainly more than a thousand and perhaps on the order of a million, do take care to sort the batch before processing. Then your counter maintenance problem just looks like merge-sort, either on internal memory or on external storage.
One principled approach to scaling your batches is to accumulate observations in some conveniently sized sorted memory buffer, and only release aggregate (counted) observations from that pipeline stage when number of distinct team compositions in the buffer is above some threshold K. Mongo or whatever would be the next stage in your pipeline. If K is much more than 1% of 1e6, then even a sequential scan of counters stored on disk would have a decent chance of finding useful update work to do on each disk block that is read.
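The buffering stage described above could be sketched roughly like this. `CounterBuffer` is a hypothetical class name, the threshold plays the role of K, and the flush callback stands in for whatever the next pipeline stage is (a MongoDB bulk write, for instance).

```javascript
class CounterBuffer {
  constructor(threshold, flush) {
    this.threshold = threshold; // K: distinct keys to exceed before flushing
    this.flush = flush;         // receives the sorted, aggregated batch
    this.counters = new Map();
  }

  // Fold one battle result into the in-memory counters.
  add(key, { wins = 0, draws = 0, losses = 0 }) {
    const entry =
      this.counters.get(key) || { wins: 0, draws: 0, losses: 0, totalGames: 0 };
    entry.wins += wins;
    entry.draws += draws;
    entry.losses += losses;
    entry.totalGames += 1;
    this.counters.set(key, entry);
    if (this.counters.size > this.threshold) this.drain();
  }

  // Release everything collected so far, sorted by key.
  drain() {
    const batch = [...this.counters.entries()]
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([key, counts]) => ({ key, ...counts }));
    this.counters.clear();
    if (batch.length > 0) this.flush(batch);
  }
}

const flushed = [];
const buffer = new CounterBuffer(2, (batch) => flushed.push(batch));
buffer.add('b', { wins: 1 });
buffer.add('a', { losses: 1 });
buffer.add('a', { wins: 1 });
buffer.add('c', { draws: 1 }); // third distinct key exceeds K = 2, so it flushes
console.log(flushed[0]);
```

With a realistic K the flush would carry one aggregated increment per team composition instead of one upsert per battle, which is where the savings would come from.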

Next steps to do with the mfccs, in voice recognition web based

I am working on Urdu (a language spoken in Pakistan, India and Bangladesh) voice recognition to translate Urdu speech into Urdu words. So far I have only found the Meyda JavaScript library for extracting MFCCs from data frames. Some documents say that ASR needs the first 12 or 13 MFCCs out of 26. For testing, I have 46 separate phonemes (/b/, /g/, /d/ ...) in a folder as WAV files. After running Meyda on one of the phonemes, it creates 4 to 5 frames per phoneme, where each frame contains the first 12 MFCC values. Because I have less than 10 reputation, posting images is disabled, but you can see the image at the following link. The image contains 7 frames of the phoneme /b/. Each frame includes 13 MFCCs. The value of the long red vertical line is 438; the others are 48, 38, etc.
My question is whether I need to save these frames (MFCCs) in the database as the predefined phoneme for /b/, do the same for all the other phonemes, and then hook up the microphone: Meyda will extract the MFCCs per frame, and I will write JavaScript that matches each extracted frame's MFCCs against the predefined frames' MFCCs using Dynamic Time Warping. In the end, the smallest distance will identify the specific phoneme.
The professional approaches after MFCCs are HMM and GMM, but I don't know how to work with them. I studied many documents about HMM and GMM, but to no avail.
co-author of Meyda here.
That seems like a pretty difficult use case. If you already know how to split the buffers up into phonemes, you can run the MFCC extraction on those buffers, and use k Nearest Neighbour (or some better classification algorithm) for what I would imagine would be reasonable success rate.
A rough sketch:
const Meyda = require('meyda');
// I can't find a real KNN library because npm is down.
// I'm just using this as a placeholder for a real one.
const knn = require('knn');
// dataset should be a collection of labelled mfcc sets
const nearestPhoneme = knn(dataset);
const buffer = [...]; // a buffer containing a phoneme
let nearestPhonemes = []; // an array to store your phoneme matches
for (let i = 0; i + Meyda.bufferSize <= buffer.length; i += Meyda.bufferSize) {
  // extract the MFCCs from one frame of the buffer at a time
  const frame = buffer.slice(i, i + Meyda.bufferSize);
  nearestPhonemes.push(nearestPhoneme(Meyda.extract('mfcc', frame)));
}
After this for loop, nearestPhonemes contains an array of the best guesses for phonemes for each frame of the audio. You could then pick the most commonly occurring phoneme in that array (the mode). I would also imagine that averaging the mfccs across the whole frame may yield a more robust result. It's certainly something you'll have to play around with and experiment with to find the most optimal solution.
Hope that helps! If you open source your code, I would love to see it.
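Since the knn module above is only a placeholder, here is a small dependency-free sketch of what the classification step could look like. The function names and the dataset shape `{ label, mfcc }` are assumptions for this example, and the two-value vectors in the demo merely stand in for real 13-value MFCC frames.

```javascript
// Euclidean distance between two equally long MFCC vectors.
function distance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] - b[i]) ** 2;
  }
  return Math.sqrt(sum);
}

// Most frequent value in an array (the mode).
function mode(values) {
  const counts = new Map();
  let best = null;
  for (const value of values) {
    const count = (counts.get(value) || 0) + 1;
    counts.set(value, count);
    if (best === null || count > counts.get(best)) best = value;
  }
  return best;
}

// Label one frame by a majority vote among its k nearest labelled samples.
function classifyFrame(dataset, mfcc, k = 3) {
  const nearest = dataset
    .map((sample) => ({ label: sample.label, d: distance(sample.mfcc, mfcc) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k);
  return mode(nearest.map((n) => n.label));
}

// Classify every frame, then pick the most common phoneme.
function classifyPhoneme(dataset, frames, k = 3) {
  return mode(frames.map((frame) => classifyFrame(dataset, frame, k)));
}

const dataset = [
  { label: '/b/', mfcc: [0, 0] },
  { label: '/b/', mfcc: [0, 1] },
  { label: '/b/', mfcc: [1, 0] },
  { label: '/g/', mfcc: [10, 10] },
  { label: '/g/', mfcc: [10, 11] },
  { label: '/g/', mfcc: [11, 10] },
];
console.log(classifyPhoneme(dataset, [[0, 0.2], [9, 9], [0.1, 0]])); // /b/
```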

Web Audio synthesis: how to handle changing the filter cutoff during the attack or release phase?

I'm building an emulation of the Roland Juno-106 synthesizer using WebAudio. The live WIP version is here.
I'm hung up on how to deal with updating the filter if the cutoff frequency or envelope modulation amount are changed during the attack or release while the filter is simultaneously being modulated by the envelope. That code is located around here. The current implementation doesn't respond the way an analog synth would, but I can't quite figure out how to calculate it.
On a real synth the filter changes immediately as determined by the frequency cutoff, envelope modulation amount, and current stage in the envelope, but the ramp up or down also continues smoothly.
How would I model this behavior?
Brilliant project!
You don't need to sum these yourself - Web Audio AudioParams sum their inputs, so if you have a potentially audio-rate modulation source like an LFO (an OscillatorNode connected to a GainNode), you simply connect() it to the AudioParam.
This is the key here - that AudioParams are able to be connect()ed to - and multiple input connections to a node or AudioParam are summed. So you generally want a model of
filter cutoff = (cutoff from envelope) + (cutoff from mod/LFO) + (cutoff from cutoff knob)
Since cutoff is a frequency, and thus on a log scale not a linear one, you want to do this addition logarithmically (otherwise, an envelope that boosts the cutoff up an octave at 440Hz will only boost it half an octave at 880Hz, etc.) - which, luckily, is easy to do via the "detune" parameter on a BiquadFilter.
Detune is in cents (1200/octave), so you have to use gain nodes to adjust values (e.g. if you want your modulation to have a +1/-1 octave range, make sure the oscillator output is going between -1200 and +1200). You can see how I do this bit in my Web Audio synthesizer (https://github.com/cwilso/midi-synth): in particular, check out synth.js starting around line 500: https://github.com/cwilso/midi-synth/blob/master/js/synth.js#L497-L519. Note the modFilterGain.connect(this.filter1.detune); in particular.
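The arithmetic behind this can be checked with a few lines of plain JavaScript. This sketch only models the numbers and does not touch the Web Audio API; the helper names are made up for the example. It relies on the filter's effective frequency being frequency * 2^(detune / 1200).

```javascript
const CENTS_PER_OCTAVE = 1200;

// Convert a modulation depth in octaves to cents for the detune param.
function octavesToCents(octaves) {
  return octaves * CENTS_PER_OCTAVE;
}

// All modulation sources are simply added, because they are all in cents.
function totalDetune(envelopeCents, lfoCents, knobCents) {
  return envelopeCents + lfoCents + knobCents;
}

// Effective cutoff for a given base frequency and summed detune.
function effectiveCutoff(baseFrequency, detuneCents) {
  return baseFrequency * Math.pow(2, detuneCents / CENTS_PER_OCTAVE);
}

// The same +1 octave envelope doubles the cutoff wherever the knob sits.
console.log(effectiveCutoff(440, octavesToCents(1))); // 880
console.log(effectiveCutoff(880, octavesToCents(1))); // 1760
```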
You don't want to be setting ANY values directly for modulation, since the actual value will change at a potentially fast rate - you want to use the parameter scheduler and input summing from an LFO. You can set the knob value as needed in terms of time, but it turns out that setting .value will interact poorly with setting scheduled values on the same AudioParam - so you'll need to have a separate (summed) input into the AudioParam. This is the tricky bit, and to be honest, my synth does NOT do this well today (I should change it to the approach described below).
The right way to handle the knob setting is to create an audio channel that varies based on your knob setting - that is, it's an AudioNode that you can connect() to the filter.detune, although the sample values produced by that AudioNode are only positive, and only change values when the knob is changed. To do this, you need a DC offset source - that is, an AudioNode that produces a stream of constant sample values. The simplest way I can think of to do this is to use an AudioBufferSourceNode with a generated buffer of 1:
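function createDCOffset() {
    var buffer=audioContext.createBuffer(1,1,audioContext.sampleRate);
    var data = buffer.getChannelData(0);
    data[0]=1; // a single sample with the constant value 1
    var bufferSource=audioContext.createBufferSource();
    bufferSource.buffer=buffer;
    bufferSource.loop=true; // repeat the sample forever to get a constant stream
    bufferSource.start(0);
    return bufferSource;
}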
Then, just connect that DCOffset into a gain node, and connect your "knob" to that gain's .value to use the gain node to scale the values (remember, there are 1200 cents in an octave, so if you want your knob to represent a six-octave cutoff range, the .value should go between zero and 7200). Then connect() the DCOffsetGain node into the filter's .detune (it sums with, rather than replacing, the connection from the LFO, and also sums with the scheduled values on the AudioParam (remember you'll need to scale the scheduled values in cents, too)). This approach, BTW, makes it easy to flip the envelope polarity too (that VCF ENV switch on the Juno 106) - just invert the values you set in the scheduler.
Hope this helps. I'm a bit jetlagged at the moment, so hopefully this was lucid. :)

Calculating bytes per second (the smooth way)

I am looking for a solution to calculate the transmitted bytes per second of a repeatedly invoked function (below). Due to its inaccuracy, I do not want to simply divide the transmitted bytes by the elapsed overall time: it resulted in the inability to display rapid speed changes after running for a few minutes.
The preset (invoked approximately every 50ms):
function uploadProgress(loaded, total){
    var bps = ?;
    $('#elem').html(bps+' bytes per second');
}
How to obtain the average bytes per second for (only) the last n seconds and is it a good idea?
What other practices for calculating a non-flickering but precise bps value are available?
Your first idea is not bad, it's called a moving average, and providing you call your update function in regular intervals you only need to keep a queue (a FIFO buffer) of a constant length:
var WINDOW_SIZE = 10;
var queue = [];
function updateQueue(newValue) {
    // fifo with a fixed length
    queue.push(newValue);
    if (queue.length > WINDOW_SIZE)
        queue.shift();
}
function getAverageValue() {
    // if the queue has less than 10 items, decide if you want to calculate
    // the average anyway, or return an invalid value to indicate "insufficient data"
    if (queue.length < WINDOW_SIZE) {
        // you probably don't want to throw if the queue is empty,
        // but at least consider returning an 'invalid' value in order to
        // display something like "calculating..."
        return null;
    }
    // calculate the average value
    var sum = 0;
    for (var i = 0; i < queue.length; i++) {
        sum += queue[i];
    }
    return sum / queue.length;
}
// calculate the speed and call `updateQueue` every second or so
var updateTimer = setInterval(..., 1000);
An even simpler way to avoid sudden changes in calculated speed would be to use a low-pass filter. A simple discrete approximation of the PT1 filter would be:
y[k] = y[k-1] + (u[k] - y[k-1]) / T
Where u[k] is the input (or actual value) at sample k, y[k] is the output (or filtered value) at sample k, and T is the time constant, measured in samples (larger T means that y will follow u more slowly).
That would be translated to something like:
var TIME_CONSTANT = 5; // T, in samples (example value); larger reacts more slowly
var speed = null;
function updateSpeed(newValue) {
    if (speed === null) {
        speed = newValue;
    } else {
        speed += (newValue - speed) / TIME_CONSTANT;
    }
}
function getFilteredValue() {
    return speed;
}
Both solutions will give similar results (for your purpose at least), and the latter one seems a bit simpler (and needs less memory).
Also, I wouldn't update the value that fast. Filtering will only turn "flickering" into "swinging" at a refresh rate of 50ms. I don't think anybody expects to have an upload speed shown at a refresh rate of more than once per second (or even a couple of seconds).
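For the "last n seconds" variant asked about in the question, one option is to keep timestamped samples of the cumulative byte count and divide over the window. This is a minimal sketch; `createRateMeter` is a hypothetical name, and the `now` parameter exists mainly so the function can be tested without real timers.

```javascript
// Returns an update function that reports bytes per second over roughly the
// last `windowMs` milliseconds, based on cumulative `loaded` byte counts.
function createRateMeter(windowMs) {
  const samples = []; // { time, loaded } pairs, oldest first

  return function update(loaded, now = Date.now()) {
    samples.push({ time: now, loaded });
    // Drop old samples, but keep the newest one that is at or before the
    // window start so the measured span still covers the whole window.
    while (samples.length > 2 && samples[1].time <= now - windowMs) {
      samples.shift();
    }
    const oldest = samples[0];
    const elapsedMs = now - oldest.time;
    if (elapsedMs <= 0) return null; // not enough data yet
    return ((loaded - oldest.loaded) * 1000) / elapsedMs;
  };
}

const meter = createRateMeter(5000);
meter(0, 0);
meter(1000, 1000);
meter(2000, 2000);
console.log(meter(12000, 7000)); // 2000 (10000 bytes over the last 5 seconds)
```

Inside uploadProgress this would be called with the `loaded` argument on every invocation, while the displayed text could still be refreshed only about once per second.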
A simple low-pass filter is ok for just making sure that inaccuracies don't build up. But if you think a little deeper about measuring transfer rates, you get into maintaining separate integer counters to do it right.
If you want it to be an exact count, note that there is a simplification available. First, when dealing with rates, arithmetic mean of them is the wrong thing to apply to bytes/sec (sec/byte is more correct - which leads to harmonic mean). The other problem is that they should be weighted. Because of this, simply keeping int64 running totals of bytes versus observation time actually does the right thing - as stupid as it sounds. Normally, you are weighting by 1/n for each w. Look at a neat simplification that happens when you weigh by time:
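(w0*b0/t0 + w1*b1/t1 + w2*b2/t2 + ...)/(w0+w1+w2+...)
= (b0+b1+b2+...)/(t0+t1+t2+...)   (with each weight wi set to the observation time ti)
= totalBytes/totalTime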
So just keep separate (int64!) totals of bytes and milliseconds. And only divide them as a rendering step to visualize the rate. Note that if you instead used the harmonic mean (which you should do for rates - because you are really averaging sec/byte), then that's the same as the time it takes to send a byte, weighted by how many bytes there were.
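1 / (( w0*t0/b0 + w1*t1/b1 + ... )/(w0+w1+w2+...))
= (b0+b1+b2+...)/(t0+t1+t2+...)   (with each weight wi set to the byte count bi)
= totalBytes/totalTime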
So the arithmetic mean weighted by time is the same as the harmonic mean weighted by bytes. Just keep a running total of bytes in one var, and time in another. There is a deeper reason that this simplistic count is actually the right one. Think of integrals. Assuming no concurrency, this is literally just total bytes transferred divided by total observation time. Assume that the computer actually takes 1 step per millisecond, and only sends whole bytes - and that you observe the entire time interval without gaps. There are no approximations.
Notice that if you think about an integral with (msec, byte/msec) as the units for (x,y), the area under the curve is the bytes sent during the observation period (exactly). You will get the same answer no matter how the observations got cut up. (ie: reported 2x as often).
So by simply reporting (size_byte, start_ms,stop_ms), you just accumulate (stop_ms-start_ms) into time and accumulate size_byte per observation. If you want to partition these rates to graph in minute buckets, then just maintain the (byte,ms) pair per minute (of observation).
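A minimal sketch of this accumulation is shown below. `createTransferTotals` is a hypothetical name, and bucketing by the minute in which an observation ended is an assumption. JavaScript numbers stay exact integers up to 2^53, which is used here in place of int64.

```javascript
function createTransferTotals() {
  const totals = { bytes: 0, ms: 0 };
  const perMinute = new Map(); // minute index -> { bytes, ms }

  return {
    // Accumulate one observation (size_byte, start_ms, stop_ms).
    record(sizeByte, startMs, stopMs) {
      const elapsedMs = stopMs - startMs;
      totals.bytes += sizeByte;
      totals.ms += elapsedMs;
      const minute = Math.floor(stopMs / 60000);
      const bucket = perMinute.get(minute) || { bytes: 0, ms: 0 };
      bucket.bytes += sizeByte;
      bucket.ms += elapsedMs;
      perMinute.set(minute, bucket);
    },
    // Divide only when rendering the rate.
    bytesPerSecond() {
      return totals.ms === 0 ? null : (totals.bytes * 1000) / totals.ms;
    },
    perMinute,
  };
}

// Cutting the same transfer into more observations gives the same rate.
const coarse = createTransferTotals();
coarse.record(4000, 0, 2000);

const fine = createTransferTotals();
fine.record(1000, 0, 500);
fine.record(3000, 500, 2000);

console.log(coarse.bytesPerSecond(), fine.bytesPerSecond()); // 2000 2000
```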
Note that these are rates experienced for individual transfers. The individual transfers may experience 1MB/s (user point of view). These are the rates that you guarantee to end users.
You can leave it here for simple cases. But doing this counting right, allows for more interesting things.
From the server point of view, load matters. Presume that there were two users experiencing 1MB/s simultaneously. For that statistic, you need to subtract out the double-counted time. If 2 users do 1MB/s simultaneously for 1s, then that's 2MB/s for 1s. You need to effectively reconstruct time overlaps, and subtract out the double-counting of time periods. Explicitly logging at the end of a transfer (size_byte,start_ms,stop_ms) allows you to measure interesting things:
The number of outstanding transfers at any given time (queue length distribution - ie: "am I going to run out of memory?")
The throughput as a function of the number of transfers (throughput for a queue length - ie: "does the website collapse when our ad shows on TV?")
Utilization - ie: "are we overpaying our cloud provider?"
In this situation, all of the accumulated counters are exact integer arithmetic. Subtracting out the double-counted time suddenly gets you into more complicated algorithms (when computed efficiently and in real-time).
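An offline (not real-time) version of that overlap reconstruction can be sketched with a sweep over start and stop events. `serverLoad` is a hypothetical name, and the input is assumed to be the logged (size_byte, start_ms, stop_ms) records.

```javascript
// transfers: [{ sizeByte, startMs, stopMs }]
function serverLoad(transfers) {
  const events = [];
  for (const t of transfers) {
    events.push({ time: t.startMs, delta: +1 });
    events.push({ time: t.stopMs, delta: -1 });
  }
  // At equal times, handle stops before starts so back-to-back transfers
  // are not counted as overlapping.
  events.sort((a, b) => a.time - b.time || a.delta - b.delta);

  let active = 0;    // transfers currently in flight
  let maxActive = 0; // longest queue seen
  let busyMs = 0;    // time with at least one transfer, counted once
  let lastTime = 0;
  for (const event of events) {
    if (active > 0) busyMs += event.time - lastTime;
    active += event.delta;
    maxActive = Math.max(maxActive, active);
    lastTime = event.time;
  }

  const totalBytes = transfers.reduce((sum, t) => sum + t.sizeByte, 0);
  return {
    busyMs,
    maxActive,
    bytesPerSecond: busyMs === 0 ? null : (totalBytes * 1000) / busyMs,
  };
}

// Two users at 1 MB/s during the same second: 2 MB/s from the server's view.
console.log(
  serverLoad([
    { sizeByte: 1000000, startMs: 0, stopMs: 1000 },
    { sizeByte: 1000000, startMs: 0, stopMs: 1000 },
  ])
);
```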
Use a decaying average, then you won't have to keep the old values around.
UPDATE: Basically it's a formula like this:
average = new_value * factor + average_old * (1 - factor);
You don't have to keep any old values around; they're all in there at smaller and smaller proportions. You have to choose a value for factor (between 0 and 1) that is appropriate to the mix of new and old values you want, and to how often the average gets updated.
This is how the Unix "load average" is calculated I believe.

