Finding index of a firestore document for a given query - javascript

I'm using Firestore to build a game and I'd like to show a list of high scores.
I'm currently getting the 10 highest scores with the following query:
const q = query(doc(db, "scores", title), orderBy("score", "desc"), limit(10));
In addition to this, I'd like to let the player know how they fared compared to global high scores. For example, if they got the 956th-highest score, I'd like them to know their relative position is 956.
I've seen that, for cursors, one can provide an offset with a given document, ie:
const q = query(doc(db, "scores", title), orderBy("score", "desc"), limit(10), startAt(myScoreDocRef));
Is there any way from this to get the score's logical index in the sorted result set?

Firestore recently added the getCountFromServer() function, which is well suited to this use case. You can create a query that matches documents with a score greater than the current user's score and fetch the count, as shown below:
const currentUserScore = 24;
const q = query(
collection(db, 'users'),
orderBy('score', 'desc'),
where('score', '>', currentUserScore)
)
// number of users with higher score
const snapshot = await getCountFromServer(q)
console.log(`Your rank: ${snapshot.data().count + 1}`)
However, if multiple users have the same score, they will all see the same rank with this query. As a workaround, you can add a second ranking criterion besides score, for example by also querying users with the same score and comparing another field such as user age.
The count query costs only 1 read for each batch of up to 1000 index entries matched by the query (as per the documentation), so it's much more efficient than querying users with some offset and manually calculating the rank.
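To make the tie-breaking idea concrete, here is a minimal local sketch of the ranking rule. It is plain JavaScript over an in-memory array, not a Firestore call; the rankOf helper and the achievedAt field are hypothetical names chosen for illustration, and the assumption is that an earlier achievedAt wins a tie.

```javascript
// Hypothetical in-memory model of the ranking rule; not a Firestore call.
// A player is ahead of `me` if their score is higher, or if the score is
// equal and they reached it earlier (smaller achievedAt).
function rankOf(players, me) {
  const ahead = players.filter(
    (p) =>
      p.score > me.score ||
      (p.score === me.score && p.achievedAt < me.achievedAt)
  );
  return ahead.length + 1;
}

const players = [
  { id: "a", score: 30, achievedAt: 5 },
  { id: "b", score: 24, achievedAt: 2 },
  { id: "c", score: 24, achievedAt: 9 },
  { id: "d", score: 10, achievedAt: 1 },
];

console.log(rankOf(players, players[1])); // 2
console.log(rankOf(players, players[2])); // 3
```

Against Firestore, the same rule would presumably need two counts added together: one for strictly higher scores and one for equal scores with an earlier tie-breaker value.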

Related

firebase query at same values

I have a query:
const q = query(collection(db, 'users'), orderBy("highScore", "desc"));
Which returns all the users with all their highscores in descending order.
The problem is that there are users who got the same high scores, and now they are ordered I think randomly. So for example, if 3 users get 3000 score it just randomly orders those 3.
How can I, for example, look at another field (a timestamp) and order the last user to achieve the 3000 points first, and so on?
You can use orderBy again in order to sort them, try checking this:
const q = query(collection(db, 'users'), orderBy("highScore", "desc"), orderBy("timestamp", "desc"));
Or, you can do it client-side by sorting them.
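A minimal sketch of the client-side option, assuming the documents have already been read into plain objects and that timestamp is a number (for example milliseconds). The sortByScoreThenRecency name is hypothetical; with Firestore Timestamp values you would likely compare their toMillis() results instead.

```javascript
// Sort by highScore descending; break ties by timestamp descending,
// so the most recent achiever of a given score comes first.
// Returns a new array and leaves the input untouched.
function sortByScoreThenRecency(users) {
  return [...users].sort(
    (a, b) => b.highScore - a.highScore || b.timestamp - a.timestamp
  );
}

const users = [
  { name: "ann", highScore: 3000, timestamp: 100 },
  { name: "bob", highScore: 3500, timestamp: 50 },
  { name: "cat", highScore: 3000, timestamp: 300 },
];

console.log(sortByScoreThenRecency(users).map((u) => u.name));
// [ 'bob', 'cat', 'ann' ]
```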

Query firestore DB with array-contains and in (firebase 9)

In Firebase (v. 9), I have this firestore DB collection named "users". Each user has these fields: "gender" (string: 'male' or 'female'), "age" (number) and "language" (string: 'en-EN' or 'fr-FR' or 'es-ES' or 'de-DE' ).
From a filter checkbox menu, a user can select the languages, execute a query, get the result, and then apply another filter to get another result.
For example:
I check, from the language menu "English" and "French" --> get the result, for example 3 users (2 female and 1 male). Then, from the gender menu, I check "male" --> get the result: just that one male user from the previous query result.
But a user can also do the first query for the language and then, in the second one, check both 'male' and 'female'.
I'm trying to do the query combining 'array-contains' and 'in' operator but I have no luck.
The query
const q = query(
collection(db, 'users'),
where('gender', 'array-contains', ['male', 'female']),
where('language', 'in', ['en-EN', 'es-ES']),
where('age', '>', 14),
where('age', '<', 40)
);
EDIT: For that query I changed my DB structure: gender has become an array but with 'array-contains' I can't do:
where('gender', 'array-contains', ['male', 'female'])
It must be something like that:
where('gender', 'array-contains', 'male')
but I want to check for both gender.
What could solve my problem is doing two queries with 'in' operator but I can't do that. (Firebase allows me to have only 1 'in' operator in the query).
My goal is, for example, to get every user in the DB who speaks English or French, is male or female, and is between 14 and 40 years old. Is that the correct way to do this? How can I run the first query for the language and then run another query starting from that result, so that I don't redo the language query when I filter by gender and then by age? I also created the indexes that Firebase suggested, but I still get an empty array.
I was reading the firebase 'Query limitation' from the doc:
Cloud Firestore provides limited support for logical OR queries. The in, and array-contains-any operators support a logical OR of up to 10 equality (==) or array-contains conditions on a single field. For other cases, create a separate query for each OR condition and merge the query results in your app.
In a compound query, range (<, <=, >, >=) and not equals (!=, not-in) comparisons must all filter on the same field.
You can use at most one array-contains clause per query. You can't combine array-contains with array-contains-any.
You can use at most one in, not-in, or array-contains-any clause per query. You can't combine in , not-in, and array-contains-any in the same query.
You can't order your query by a field included in an equality (==) or in clause.
The sum of filters, sort orders, and parent document path (1 for a subcollection, 0 for a root collection) in a query cannot exceed 100.
That says I can combine the 'in' operator only with 'array-contains'. It also says, that "for other cases, create a separate query for each OR condition and merge the query results in your app" but I can't find any example on how to do that.
I read an answer here > Firebase Firestore - Multiple array-contains in a compound query where someone suggest to change the structure of the data to query, from an array to a map and then query with the equal operator:
where('field.name1', '==', true),
where('field.name2', '==', true)
I still have no luck with this.
Edit2: I guess the only thing I could do is execute 2 different queries, get the results in two different arrays, and then implement whatever logic I need in JS, i.e. not with Firebase query operators. Can someone guide me through the process?
Any help is appreciated.
Thank you
Google Cloud Firestore only allows one in condition per query, so the second condition has to be applied in JavaScript. Probably the best approach is to run the original query and then filter its result in JavaScript. See sample code below:
const q = query(
collection(db, 'users'),
where('language', 'in', ['en-EN', 'es-ES']),
where('age', '>', 14),
where('age', '<', 40)
);
// Pass the data from the checkboxes.
// Can be 'male', 'female', or ('male' and 'female')
const gender = ['male', 'female'];
let array = [];
const snapshot = await getDocs(q);
snapshot.forEach((doc) => {
if (gender.includes(doc.data().gender)) {
array.push(doc.data());
}
});
console.log(array);
The above code will return the processed result whatever you pass on the gender variable. You could do it vice-versa, if you want to query the gender first then just interchange the query and variables.
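If you instead run a separate query per OR condition and merge the results in your app, as the quoted documentation suggests, the merge step could look like the sketch below. It assumes each snapshot has already been mapped to plain objects carrying the document id (for example { id: doc.id, ...doc.data() }); mergeById is a hypothetical helper name, not a Firestore API.

```javascript
// Merge several result lists, keeping one entry per document id.
// The first occurrence of an id wins; insertion order is preserved.
function mergeById(...resultLists) {
  const byId = new Map();
  for (const list of resultLists) {
    for (const item of list) {
      if (!byId.has(item.id)) byId.set(item.id, item);
    }
  }
  return [...byId.values()];
}

// Stand-ins for the results of three separate queries.
const males = [
  { id: "u1", gender: "male" },
  { id: "u2", gender: "male" },
];
const females = [{ id: "u3", gender: "female" }];
const overlap = [{ id: "u2", gender: "male" }];

console.log(mergeById(males, females, overlap).map((u) => u.id));
// [ 'u1', 'u2', 'u3' ]
```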
Another option is to store a compound string on each document. For example, if the following boxes are checked:
Male
Female
en_EN
es_ES
then the compound strings will be ['en_male', 'en_female', 'es_male', 'es_female'], and you can query them with a single in clause. See sample code below:
// Combined data passed from the checkboxes.
// Must contain at least 1 and at most 10 comparison values.
const compound = ['en_male', 'en_female', 'es_male', 'es_female'];
const q = query(
collection(db, 'users'),
where('compound', 'in', compound),
where('age', '>', 14),
where('age', '<', 40)
);
const snapshot = await getDocs(q);
snapshot.forEach((doc) => {
console.log(doc.id, doc.data());
});
The downside of this approach is you can only have up to 10 comparison values for the in operator.
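A small sketch of how the compound values could be built from the checkbox selections while guarding the 10-value limit mentioned above. buildCompoundValues and MAX_IN_VALUES are hypothetical names, and the short language prefixes ("en", "es") are assumed to match whatever format is stored in the compound field.

```javascript
const MAX_IN_VALUES = 10; // limit for the `in` operator as stated above

// Build every language/gender combination, e.g. "en_male".
function buildCompoundValues(languages, genders) {
  const values = [];
  for (const lang of languages) {
    for (const gender of genders) {
      values.push(`${lang}_${gender}`);
    }
  }
  if (values.length === 0 || values.length > MAX_IN_VALUES) {
    throw new RangeError(
      `'in' needs 1 to ${MAX_IN_VALUES} values, got ${values.length}`
    );
  }
  return values;
}

console.log(buildCompoundValues(["en", "es"], ["male", "female"]));
// [ 'en_male', 'en_female', 'es_male', 'es_female' ]
```

With four languages and three or more values on the second axis, the product exceeds the limit, so the selection would have to be split across several queries.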
For more relevant information, you may check this documentation.

How to assign a 2 digit unique id to players

I am making a multiplayer game using HTML, Node.js and socket.io. Initially I was sending the socket id, player position (x, y) and angle in game updates. As the socket id is quite long and uses more bytes, I want to use a small id (2 or 3 characters or digits) to represent each player in place of the socket id. A game can have at most 50 players. If I generate ids from random numbers (0-100), there is a chance that an id is already taken by another player (especially if there are already 30 to 40 players). What would be a better algorithm for assigning ids?
When a player dies or quits the game, that id becomes free so that it can be assigned (not necessarily) to a new player.
One way is to simply increment:
let curId = 0;
//...
player.id = curId;
curId+=1;
If you want to make it random, follow this:
Make an array of numbers 0..100
Pick a random number for the ID from the array
Remove it from the array
Repeat
I would recommend the first approach because the second one limits the number of potential users. Though theoretically, if I had 10,000 known users, I would use the second approach and add the id back to the array when the user disconnected, so that I don't keep incrementing indefinitely.
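The second approach, including handing ids back when a player leaves, could be sketched as follows. IdPool is a hypothetical class name; the sketch assumes ids are the integers 0 to size - 1 and does not guard against releasing the same id twice.

```javascript
class IdPool {
  constructor(size) {
    // All ids start out free: 0, 1, ..., size - 1.
    this.free = Array.from({ length: size }, (_, i) => i);
  }

  // Take a random free id, or null when every id is in use.
  acquire() {
    if (this.free.length === 0) return null;
    const i = Math.floor(Math.random() * this.free.length);
    const last = this.free.length - 1;
    // Swap the chosen id to the end, then pop it: constant-time removal.
    [this.free[i], this.free[last]] = [this.free[last], this.free[i]];
    return this.free.pop();
  }

  // Return an id to the pool when its player dies or quits.
  release(id) {
    this.free.push(id);
  }
}

const pool = new IdPool(100);
const playerId = pool.acquire();
console.log(playerId >= 0 && playerId < 100); // true
pool.release(playerId);
```

Because an id is only ever drawn from the free list, no retry loop is needed even when the game is almost full.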
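Just save all used ids in an array and then check whether a new id is unique:
const arrayOfIds = [];
const getRandomBetween1And60 = () => {
  let newNumber;
  // Keep drawing until we hit a number that is not in use yet.
  // Note: this never terminates if all 60 ids are already taken.
  do {
    newNumber = Math.floor(Math.random() * 60) + 1;
  } while (arrayOfIds.includes(newNumber));
  arrayOfIds.push(newNumber); // remember the id so it is not handed out again
  return newNumber;
}
...
const newId = getRandomBetween1And60();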

UDF worker timed out during execution when using javascript udf in BigQuery for tf idf calculation

I have tried to implement a query in BigQuery that finds top keywords for a doc from a larger collection of documents using tf-idf scores.
Before calculating the tf-idf score of the keywords, I clean the documents (e.g. remove stop words and punctuation), then create 1-, 2-, 3- and 4-grams out of the documents, and then do stemming inside the n-grams.
To perform this cleaning, n-gram creation and stemming I am using javascript libraries and js udf. Here is the example query:
CREATE TEMP FUNCTION nlp_compromise_tokens(str STRING)
RETURNS ARRAY<STRUCT<ngram STRING, count INT64>> LANGUAGE js AS '''
// creating 1,2,3 and 4 grams using compromise js
// before that I remove stopwords using .removeStopWords
// function lent from remove_stop_words.js
tokens_from_compromise = nlp(str.removeStopWords()).normalize().ngrams({max:4}).data()
// The stemming function that stems
// each space separated tokens inside the n-grams
// I use snowball.babel.js here
function stems_from_space_separated_string(tokens_string) {
var stem = snowballFactory.newStemmer('english').stem;
splitted_tokens = tokens_string.split(" ");
splitted_stems = splitted_tokens.map(x => stem(x));
return splitted_stems.join(" ")
}
// Returning the n-grams from compromise which are
// stemmed internally and at least length of 2
// alongside the count of the token inside the document
var ngram_count = tokens_from_compromise.map(function(item) {
return {
ngram: stems_from_space_separated_string(item.normal),
count: item.count
};
});
return ngram_count
'''
OPTIONS (
library=["gs://fh-bigquery/js/compromise.min.11.14.0.js","gs://syed_mag/js/snowball.babel.js","gs://syed_mag/js/remove_stop_words.js"]);
with doc_table as (
SELECT 1 id, "A quick brown 20 fox fox fox jumped over the lazy-dog" doc UNION ALL
SELECT 2, "another 23rd quicker browner fox jumping over Lazier broken! dogs." UNION ALL
SELECT 3, "This dog is more than two-feet away." #UNION ALL
),
ngram_table as(
select
id,
doc,
nlp_compromise_tokens(doc) as compromise_tokens
from
doc_table),
n_docs_table as (
select count(*) as n_docs from ngram_table
),
df_table as (
SELECT
compromise_token.ngram,
count(*) as df
FROM
ngram_table, UNNEST(compromise_tokens) as compromise_token
GROUP BY
ngram
),
idf_table as(
SELECT
ngram,
df,
n_docs,
LN((1+n_docs)/(1+df)) + 1 as idf_smooth
FROM
df_table
CROSS JOIN
n_docs_table),
tf_idf_table as (
SELECT
id,
doc,
compromise_token.ngram,
compromise_token.count as tf,
idf_table.ngram as idf_ngram,
idf_table.idf_smooth,
compromise_token.count * idf_table.idf_smooth as tf_idf
FROM
ngram_table, UNNEST(compromise_tokens) as compromise_token
JOIN
idf_table
ON
compromise_token.ngram = idf_table.ngram)
SELECT
id,
ARRAY_AGG(STRUCT(ngram,tf_idf)) as top_keyword,
doc
FROM(
SELECT
id,
doc,
ngram,
tf_idf,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY tf_idf DESC) AS rn
FROM
tf_idf_table)
WHERE
rn < 5
group by
id,
doc
Here is what the example output looks like:
There were only three sample handmade rows in this example.
When I try the same code with a slightly larger table of 1000 rows, it again works fine, although it takes quite a bit longer to finish (around 6 minutes for only 1000 rows). This sample table (1MB) can be found here in json format.
Now when I try the query on a larger dataset (159K rows - 155MB), the query fails after around 30 mins with the following message:
Errors: User-defined function: UDF worker timed out during execution.;
Unexpected abort triggered for worker worker-109498: job_timeout
(error code: timeout)
Can I improve my udf functions or the overall query structure to make sure it runs smoothly on even larger datasets (124,783,298 rows - 244GB)?
N.B. I have given proper permissions to the js files in Google Storage so that these JavaScript files are accessible to anyone who wants to run the example queries.
BigQuery UDFs are very handy, but they are computationally expensive and can make your query slow or exhaust resources. See the doc reference for limitations and best practices. In general, any UDF logic you can convert to native SQL will be much faster and use fewer resources.
I would split your analysis into multiple steps saving the result into a new table for each step:
Clean the documents (e.g. removed stop words and punctuations)
Create 1-, 2-, 3- and 4-grams out of the documents and then do stemming inside the n-grams.
Calculate the score.
Side note: you might be able to run it using multiple CTEs to save the stages instead of saving each step into a native table but I do not know if that will make the query exceed the resource limit.

How can I get a random user from my Firebase user list?

I'm developing an app and need to get a random user from my Firebase user list. Whenever a user registers, the system updates the user count on a specific node. So, I draw a number from 1 to the total number of users. Now, how do I select a user based on that number?
Assuming all of your users are stored in a /users node with keys of their uid and assuming the uid's are ordered (which they always are), there are several options.
1) Load all of the users from the /users node into an array and select the one you want via its index. Suppose we want the 4th user:
let usersRef = self.ref.child("users")
usersRef.observeSingleEvent(of: .value, with: { snapshot in
let allUsersArray = snapshot.children.allObjects
let thisUserSnap = allUsersArray[3]
print(thisUserSnap)
})
While this works for a small amount of users, it could overwhelm the device if you have say, 10,000 users and lots of data stored in each node.
2) Create a separate node to just store the uid's. This is a significantly smaller dataset and would work the same way as 1)
uids
uid_0: true
uid_1: true
uid_2: true
uid_3: true
uid_4: true
uid_5: true
uid_6: true
uid_7: true
3) Reduce the size of your dataset further. Since you know how many users you have, split the dataset up into two sections and work with that.
using the same structure as 2)
let uidNode = self.ref.child("uids")
let index = 4 //the node we want
let totalNodeCount = 8 //the total amount of uid's
let mid = totalNodeCount / 2 //the middle node
if index <= mid { //if the node we want is in the first 1/2 of the list
print("search first section")
let q = uidNode.queryLimited(toFirst: UInt(index) )
q.observeSingleEvent(of: .value, with: { snapshot in
let array = snapshot.children.allObjects
print(array.last) //the object we want will be the last one loaded
})
} else {
print("search second section")
let q = uidNode.queryLimited(toLast: UInt(totalNodeCount - index + 1) ) //load from the node we want through the end
q.observeSingleEvent(of: .value, with: { snapshot in
let array = snapshot.children.allObjects
print(array.first) //the object we want will be the first one loaded
})
}
This method loads at most half of the list, so it's a much more manageable amount of data.
If you are talking about your authenticated users, the only way to retrieve a list of them is by calling the corresponding admin function and applying your logic to it afterwards.
Another way could be writing a trigger for your authentication and store the userId with an incrementing number (and maybe save a totalUser field), then you only need to generate a random number and access said user.
