Related
My Mongoose schema uses a custom _id value and the code I inherited does something like this
const sampleSchema = new mongoose.Schema({
_id: String,
key: String,
});
sampleSchema.statics.generateId = async function() {
let id;
do {
id = randomStringGenerator.generate({length: 8, charset: 'hex', capitalization: 'uppercase'});
} while (await this.exists({_id: id}));
return id;
};
let SampleModel = mongoose.model('Sample', sampleSchema);
A simple usage looks like this:
let mySample = new SampleModel({_id: await SampleModel.generateId(), key: 'a' });
await mySample.save();
There are at least three problems with this:
Every save will require at least two trips to the database, one to test for a unique id and one to save the document.
For this to work, it is necessary to manually call generateId() before each save. An ideal solution would handle that for me, like Mongoose does with ids of type ObjectId.
Most significantly, there is a potential race condition that will result in duplicate key error. Consider two clients running this code. Both coincidentally generate the same id at the same time, both look in the database and find the id absent, both try to write the record to the database. The second will fail.
An ideal solution would, on save, generate an id, save it to the database and on duplicate key error, generate a new id and retry. Do this in a loop until the document is stored successfully.
The trouble is, I don't know how to get Mongoose to let me do this.
Here's what I tried: Based on this SO Question, I found a rather old sample (using a very old mongoose version) of overriding the save function to accomplish something similar and based this attempt off it.
// First, change generateId() to force a collision
let ids = ['a', 'a', 'a', 'b'];
let index = 0;
let generateId = function() {
return ids[index++];
};
// Configure middleware to generate the id before a save
sampleSchema.pre('validate', function(next) {
if (this.isNew)
this._id = generateId();
next();
});
// Now override the save function
SampleModel.prototype.save_original = SampleModel.prototype.save;
SampleModel.prototype.save = function(options, callback) {
let self = this;
let retryOnDuplicate = function(err, savedDoc) {
if (err) {
if (err.code === 11000 && err.name === 'MongoError') {
self.save(options, retryOnDuplicate);
return;
}
}
if (callback) {
callback(err, savedDoc);
}
};
return self.save_original(options, retryOnDuplicate);
}
This gets me close but I'm leaking a promise and I'm not sure where.
let sampleA = new SampleModel({key: 'a'});
let sampleADoc = await sampleA.save();
console.log('sampleADoc', sampleADoc); // prints undefined, but should print the document
let sampleB = new SampleModel({key: 'b'});
let sampleBDoc = await sampleB.save();
console.log('sampleBDoc', sampleBDoc); // prints undefined, but should print the document
let all = await SampleModel.find();
console.log('all', all); // prints `[]`, but should be an array of two documents
Output
sampleADoc undefined
sampleBDoc undefined
all []
The documents eventually get written to the database, but not before the console.log calls are made.
Where am I leaking a promise? Is there an easier way to do this that addresses the three problems I outlined?
Edit 1:
Mongoose version: 5.11.15
I fixed the problem by changing the save override. The full solution looks like this:
const sampleSchema = new mongoose.Schema({
_id: String,
color: String,
});
let generateId = function() {
return randomStringGenerator.generate({length: 8, charset: 'hex', capitalization: 'uppercase'});
};
sampleSchema.pre('validate', function() {
if (this.isNew)
this._id = generateId();
});
let SampleModel = mongoose.model('Sample', sampleSchema);
SampleModel.prototype.save_original = SampleModel.prototype.save;
SampleModel.prototype.save = function(options, callback) {
let self = this;
let isDupKeyError = (error, field) => {
// Determine whether the error is a duplicate key error on the given field
return error?.code === 11000 && error?.name === 'MongoError' && error?.keyValue[field];
}
let saveWithRetries = (options, callback) => {
// save() returns undefined if used with callback or a Promise otherwise.
// https://mongoosejs.com/docs/api/document.html#document_Document-save
let promise = self.save_original(options, callback);
if (promise) {
return promise.catch((error) => {
if (isDupKeyError(error, '_id')) {
return saveWithRetries(options, callback);
}
throw error;
});
}
};
let retryCallback;
if (callback) {
retryCallback = (error, saved, rows) => {
if (isDupKeyError(error, '_id')) {
saveWithRetries(options, retryCallback);
} else {
callback(error, saved, rows);
}
}
}
return saveWithRetries(options, retryCallback);
}
This will generate an _id repeatedly until a successful save is called and addresses the three problems outlined in the original question:
The minimum trips to the database has been reduced from two to one. Of course, if there are collisions, more trips will occur but that's the exceptional case.
This implementation takes care of generating the id itself with no manual step to take before saving. This reduces complexity and removes the required knowledge of prerequisites for saving that are present in the original method.
The race condition has been addressed. It won't matter if two clients attempt to use the same key. One will succeed and the other will generate a new key and save again.
To improve this:
There ought to be a maximum number of save attempts for a single document followed by failure. In this case, you've perhaps used up all the available keys in whatever domain you're using.
The unique field may not be named _id or you might have multiple fields that require a unique generated value. The embedded helper function isDupKeyError() could be updated to look for multiple keys. Then on error you could add logic to regenerate just the failed key.
I am trying to do Firestore reactive pagination. I know there are posts, comments, and articles saying that it's not possible but anyways...
When I add a new message, it kicks off or "removes" the previous message
Here's the main code. I'm paginating 4 messages at a time
async getPaginatedRTLData(queryParams: TQueryParams, onChange: Function){
let collectionReference = collection(firestore, queryParams.pathToDataInCollection);
let collectionReferenceQuery = this.modifyQueryByOperations(collectionReference, queryParams);
//Turn query into snapshot to track changes
const unsubscribe = onSnapshot(collectionReferenceQuery, (snapshot: QuerySnapshot) => {
snapshot.docChanges().forEach((change: DocumentChange<DocumentData>) => {
//Now save data to format later
let formattedData = this.storeData(change, queryParams)
onChange(formattedData);
})
})
this.unsubscriptions.push(unsubscribe)
}
For completeness this is how Im building my query
let queryParams: TQueryParams = {
limitResultCount: 4,
uniqueKey: '_id',
pathToDataInCollection: messagePath,
orderBy: {
docField: orderByKey,
direction: orderBy
}
}
modifyQueryByOperations(
collectionReference: CollectionReference<DocumentData> = this.collectionReference,
queryParams: TQueryParams) {
//Extract query params
let { orderBy, where: where_param, limitResultCount = PAGINATE} = queryParams;
let queryCall: Query<DocumentData> = collectionReference;
if(where_param) {
let {searchByField, whereFilterOp, valueToMatch} = where_param;
//collectionReferenceQuery = collectionReference.where(searchByField, whereFilterOp, valueToMatch)
queryCall = query(queryCall, where(searchByField, whereFilterOp, valueToMatch) )
}
if(orderBy) {
let { docField, direction} = orderBy;
//collectionReferenceQuery = collectionReference.orderBy(docField, direction)
queryCall = query(queryCall, fs_orderBy(docField, direction) )
}
if(limitResultCount) {
//collectionReferenceQuery = collectionReference.limit(limitResultCount)
queryCall = query(queryCall, limit(limitResultCount) );
}
if(this.lastDocInSortedOrder) {
//collectionReferenceQuery = collectionReference.startAt(this.lastDocInSortedOrder)
queryCall = query(queryCall, startAt(this.lastDocInSortedOrder) )
}
return queryCall
}
See the last line removed is removed when I add a new message to the collection. Whats worse is it's not consistent. I debugged this and Firestore is removing the message.
I almost feel like this is a bug in Firestore's handling of listeners
As mentioned in the comments and confirmed by you the problem you are facing is occuring due to the fact that some values of the fields that your are searching in your query changed while the listener was still active and this makes the listener think of this document as a removed one.
This is proven by the fact that the records are not being deleted from Firestore itself, but are just being excluded from the listener.
This can be fixed by creating a better querying structure, separating the old data from new data incoming from the listener, which you mentioned you've already done in the comments as well.
DevTools Google Chrome:
On this site (https://booyah.live/users/41874362/followers), to load the complete list of followers it is necessary to keep scrolling down the page to reload more profiles, but there comes a time when the page weighs so much that the browser crashes and it ends up needing to be closed.
Is there any way to be able to collect the follow buttons without this happening?
The current script I use is:
setInterval(function(){
document.getElementById("layout-content").scrollTo(0, 50000000000000000000000000000000000000);
document.querySelectorAll('.components-button.components-button-size-mini.components-button-type-orange.desktop.components-button-inline').forEach(btn => btn.click());
}, 10)
I use setInterval to create a loop of:
1 - Scrolling the page
2 - Loading more profiles
3 - Clicking the follow buttons
My need:
For the study I'm doing for learning, the idea is that my profile follows all profiles followers of a single most famous profile in order to analyze how many people follow back on this social media.
Additional:
In this answer provided by Leftium, it is possible to follow only one profile:
https://stackoverflow.com/a/67882688/11462274
In this answer given by KCGD, it is possible to collect the entire list of followers but during this collection the profiles are not followed, it is possible to create a list and save the data, but not follow the profiles:
https://stackoverflow.com/a/67865968/11462274
I tried to contact them both, but they haven't returned yet. It was a good way but I couldn't combine the two answers so I can follow all the profiles, I thought about the possibility according to which I would collect the profiles of the KCGD response, I would follow the profiles too, but not only the first one but also the answer of the Leftium.
Would it be possible to take advantage of the loop created by the response from KCGD and from each response, already follow all profiles instead of just the first one as in Leftium's response?
I tried to create but was unsuccessful.
The browser crashes because too much memory is used. As you scroll down the page, the HTML DOM tree is extended and more avatar images are downloaded. These HTML and image resources are not necessary for your goal.
It is possible to avoid crashing by calling the (internal) Booyah API directly. This will be much faster and consume less resources since only the text is transferred. There are two API endpoints of interest:
GET /api/v3/users/[USERID]/followers?cursor=0&count=100
Gets list of followers following a certain user:
[USERID] is the ID of the user being studied (WEEDZAO's id).
cursor is where in the list of followers to start listing. When the page first loads, this is 0. As you scroll down, the following API calls increment this (101, 201, 301...)
count is how many results to return.
Since this is a GET call, you can open this URL in your browser.
POST /api/v3/users/[USERID]/followings
Follows a user (same as clicking their 'Follow' button).
Here [USERID] is ID of the user whose follower list will be updated (your own ID).
A payload must be sent that looks like this: {followee_uid: ID, source: 43}. I'm not sure what source is.
Also a CSRF header must be included.
Because this is a POST type call, it is not possible to open this type of URL directly in your browser.
DELETE /api/v3/users/[USERID]/followings
There is also an API to unfollow a user. (Just for reference).
If you call these API's from outside the browser, you probably need to send session cookies.
This script will list WEEDZAO's first 10 followers, then follow the first one from the list:
You must replace USERID and CSRF_TOKEN with your own values.
You can copy/paste this code into the browser dev console.
Alternatively, you can use this code from a web scraping framework like Puppeteer.
// Find these values in dev console "Network" tab:
var CSRF_TOKEN, USERID, USERID_TARGET, main;
USERID_TARGET = '41874362';
USERID = '12345678';
CSRF_TOKEN = 'MTYy...p8wg';
main = async function() {
var body, followers, json, options, payload, response, url;
// Get list of first 10 followers
console.log(url = `/api/v3/users/${USERID_TARGET}/followers?cursor=0&count=10`);
response = (await fetch(url));
json = (await response.json());
followers = json.follower_list;
console.table(followers);
// Follow first member from list above
console.log(url = `/api/v3/users/${USERID}/followings`);
payload = JSON.stringify({
followee_uid: followers[0].uid,
source: 43
});
response = (await fetch(url, options = {
method: 'POST',
body: payload,
headers: {
'X-CSRF-Token': CSRF_TOKEN
}
}));
body = (await response.text());
return console.log(body);
};
main();
It crashes because the interval is too fast
setInterval(function(){}, 10)
you are trying to call a scroll and click function every 10 milliseconds (that's 100 function call every 1 second). Which also interferes with the server as they fetch new users while scrolling.
Your script could work if you will adjust the interval to atleast 1000 milliseconds (1 second). Of course, it may take a while, but it will work. You should also expect that the page may become laggy specially when the page already loaded tons of users because Virtual Scrolling is not implemented in this page.
Even with slowing down the rate of the scrolling it still really bogs down the browser, the solution to this may be in the API the page contacts. To get the user's followers it contacts the site's V3 API
https://booyah.live/api/v3/users/41874362/followers?cursor=[LAST USER IN API RETURN]&count=100
to get all the users that would show up in the page. I wrote a script that can contact the api over and over again to get all the follower data, just run it in the page's console and use print() when you want to export the data
and copy/paste it into a .json file
//WARNING: THIS SCRIPT USES RECURSION, i have no clue how long the followers list goes so use at your own risk
var followers = []; //data collected from api
function getFollowers(cursor){
httpGet(`https://booyah.live/api/v3/users/41874362/followers?cursor=${cursor}&count=100`, function (data) { //returns data from API for given cursor (user at the end of last follower chunk)
console.log("got cursor: "+cursor);
var _followChunk = JSON.parse(String(data));
console.log(_followChunk)
followers.push(_followChunk.follower_list); //saves followers from chunk
var last_user = _followChunk.follower_list[_followChunk.follower_list.length - 1]; //gets last user of chunk (cursor for the next chunk)
setTimeout(function(){ //1 second timeout so that the API doesnt return "too many requests", not nessicary but you should probably leave this on
getFollowers(last_user.uid); //get next chunk
},1000)
})
}
var print = function(){console.log(JSON.stringify(followers))};
getFollowers(0); //get initial set of followers (cursor 0)
function httpGet(theUrl, callback) {
var xmlHttp = new XMLHttpRequest();
xmlHttp.open("GET", theUrl, false); // false for synchronous request
xmlHttp.setRequestHeader("Cache-Control", "no-store");
xmlHttp.send(null);
callback(xmlHttp.responseText);
};
if you really only need the button elements then the only way is to scroll all the way down for each time it loads new followers, as the page creates the elements as you scroll down
This is a fully working solution that I have tested in my own Chrome browser with a fresh account, successfully following all the follower accounts of the account you are targeting.
UPDATE (2021-06-18)
I've updated my solution to a drastically improved and faster function, rewritten with async/await. This new function reduces the estimated runtime from ~45min to ~10min. 10min is still a long while, but that's to be expected considering the large number of followers the user you are targeting has.
After a few iterations, the latest function not only improves speed, performance, and error reporting, but it also extends what is possible with the function. I provide several example below my solutions of how to use the function completely.
For the sake of de-cluttering my answer, I am removing my older function from this solution altogether, but you can still reference it in my solution's edit history if you like.
TL;DR
Here is the final, fastest, working solution. Make sure to replace PUT_YOUR_CSRF_TOKEN_HERE with your own CSRF token value. Detailed instructions on how to find your CSRF token are below.
You must run this in your console on the Booyah website in order to avoid CORS issues.
const csrf = 'PUT_YOUR_CSRF_TOKEN_HERE';
async function booyahGetAccounts(uid, type = 'followers', follow = 1) {
if (typeof uid !== 'undefined' && !isNaN(uid)) {
const loggedInUserID = window.localStorage?.loggedUID;
if (uid === 0) uid = loggedInUserID;
const unfollow = follow === -1;
if (unfollow) follow = 1;
if (loggedInUserID) {
if (csrf) {
async function getUserData(uid) {
const response = await fetch(`https://booyah.live/api/v3/users/${uid}`),
data = await response.json();
return data.user;
}
const loggedInUserData = await getUserData(loggedInUserID),
targetUserData = await getUserData(uid),
followUser = uid => fetch(`https://booyah.live/api/v3/users/${loggedInUserID}/followings`, { method: (unfollow ? 'DELETE' : 'POST'), headers: { 'X-CSRF-Token': csrf }, body: JSON.stringify({ followee_uid: uid, source: 43 }) }),
logSep = (data = '', usePad = 0) => typeof data === 'string' && usePad ? console.log((data ? data + ' ' : '').padEnd(50, '━')) : console.log('━'.repeat(50),data,'━'.repeat(50));
async function getList(uid, type, follow) {
const isLoggedInUser = uid === loggedInUserID;
if (isLoggedInUser && follow && !unfollow && type === 'followings') {
follow = 0;
console.warn('You alredy follow your followings. `follow` mode switched to `false`. Followings will be retrieved instead of followed.');
}
const userData = await getUserData(uid),
totalCount = userData[type.slice(0,-1)+'_count'] || 0,
totalCountStrLength = totalCount.toString().length;
if (totalCount) {
let userIDsLength = 0;
const userIDs = [],
nickname = userData.nickname,
nicknameStr = `${nickname ? ` of ${nickname}'s ${type}` : ''}`,
alreadyFollowedStr = uid => `User ID ${uid} already followed by ${loggedInUserData.nickname} (Account #${loggedInUserID})`;
async function followerFetch(cursor = 0) {
const fetched = [];
await fetch(`https://booyah.live/api/v3/users/${uid}/${type}?cursor=${cursor}&count=100`).then(res => res.json()).then(data => {
const list = data[type.slice(0,-1)+'_list'];
if (list?.length) fetched.push(...list.map(e => e.uid));
if (fetched.length) {
userIDs.push(...fetched);
userIDsLength += fetched.length;
if (follow) followUser(uid);
console.log(`${userIDsLength.toString().padStart(totalCountStrLength)} (${(userIDsLength / totalCount * 100).toFixed(4)}%)${nicknameStr} ${follow ? 'followed' : 'retrieved'}`);
if (fetched.length === 100) {
followerFetch(data.cursor);
} else {
console.log(`END REACHED. ${userIDsLength} accounts ${follow ? 'followed' : 'retrieved'}.`);
if (!follow) logSep(targetList);
}
}
});
}
await followerFetch();
return userIDs;
} else {
console.log(`This account has no ${type}.`);
}
}
logSep(`${follow ? 'Following' : 'Retrieving'} ${targetUserData.nickname}'s ${type}`, 1);
const targetList = await getList(uid, type, follow);
} else {
console.error('Missing CSRF token. Retrieve your CSRF token from the Network tab in your inspector by clicking into the Network tab item named "bug-report-claims" and then scrolling down in the associated details window to where you see "x-csrf-token". Copy its value and store it into a variable named "csrf" which this function will reference when you execute it.');
}
} else {
console.error('You do not appear to be logged in. Please log in and try again.');
}
} else {
console.error('UID not passed. Pass the UID of the profile you are targeting to this function.');
}
}
booyahGetAccounts(41874362);
Detailed explanation of the process
As the function runs, it logs the progress to the console, both how many users have been followed so far, and how much progress has been made percentage-wise, based on the total number of followers the profile you are targeting has.
Retrieving your CSRF token
The only manual portion of this process is retrieving your CSRF token. This is rather simple though. Once you log into Booyah, navigate to the Network tab of your Chrome console and click on the item named bug-report-claims, then scroll all the way down the details window which appears on the right. There should see x-csrf-token. Store this value as a string variable in your console as csrf, which my function will reference when it runs. This is necessary in order to use the POST method to follow users.
Here is what it will look like:
The solution
The function will loop through all users the account you are targeting follows in batches of 100 (the max amount allowed per GET request) and follow them all. When the end of each batch is met, the next batch is automatically triggered recursively.
🚀 Version 3 (Fastest and most flexible, using async/await and fetch())
My previous two solution versions (🐇 …🐢) can be referenced in this answer's edit history.
Make sure to replace PUT_YOUR_CSRF_TOKEN_HERE with your own CSRF token value. Detailed instructions on how to find your CSRF token are below.
You must run this in your console on the Booyah website in order to avoid CORS issues.
const csrf = 'PUT_YOUR_CSRF_TOKEN_HERE';
async function booyahGetAccounts(uid, type = 'followers', follow = 1) {
if (typeof uid !== 'undefined' && !isNaN(uid)) {
const loggedInUserID = window.localStorage?.loggedUID;
if (uid === 0) uid = loggedInUserID;
const unfollow = follow === -1;
if (unfollow) follow = 1;
if (loggedInUserID) {
if (csrf) {
async function getUserData(uid) {
const response = await fetch(`https://booyah.live/api/v3/users/${uid}`),
data = await response.json();
return data.user;
}
const loggedInUserData = await getUserData(loggedInUserID),
targetUserData = await getUserData(uid),
followUser = uid => fetch(`https://booyah.live/api/v3/users/${loggedInUserID}/followings`, { method: (unfollow ? 'DELETE' : 'POST'), headers: { 'X-CSRF-Token': csrf }, body: JSON.stringify({ followee_uid: uid, source: 43 }) }),
logSep = (data = '', usePad = 0) => typeof data === 'string' && usePad ? console.log((data ? data + ' ' : '').padEnd(50, '━')) : console.log('━'.repeat(50),data,'━'.repeat(50));
async function getList(uid, type, follow) {
const isLoggedInUser = uid === loggedInUserID;
if (isLoggedInUser && follow && !unfollow && type === 'followings') {
follow = 0;
console.warn('You alredy follow your followings. `follow` mode switched to `false`. Followings will be retrieved instead of followed.');
}
const userData = await getUserData(uid),
totalCount = userData[type.slice(0,-1)+'_count'] || 0,
totalCountStrLength = totalCount.toString().length;
if (totalCount) {
let userIDsLength = 0;
const userIDs = [],
nickname = userData.nickname,
nicknameStr = `${nickname ? ` of ${nickname}'s ${type}` : ''}`,
alreadyFollowedStr = uid => `User ID ${uid} already followed by ${loggedInUserData.nickname} (Account #${loggedInUserID})`;
async function followerFetch(cursor = 0) {
const fetched = [];
await fetch(`https://booyah.live/api/v3/users/${uid}/${type}?cursor=${cursor}&count=100`).then(res => res.json()).then(data => {
const list = data[type.slice(0,-1)+'_list'];
if (list?.length) fetched.push(...list.map(e => e.uid));
if (fetched.length) {
userIDs.push(...fetched);
userIDsLength += fetched.length;
if (follow) followUser(uid);
console.log(`${userIDsLength.toString().padStart(totalCountStrLength)} (${(userIDsLength / totalCount * 100).toFixed(4)}%)${nicknameStr} ${follow ? 'followed' : 'retrieved'}`);
if (fetched.length === 100) {
followerFetch(data.cursor);
} else {
console.log(`END REACHED. ${userIDsLength} accounts ${follow ? 'followed' : 'retrieved'}.`);
if (!follow) logSep(targetList);
}
}
});
}
await followerFetch();
return userIDs;
} else {
console.log(`This account has no ${type}.`);
}
}
logSep(`${follow ? 'Following' : 'Retrieving'} ${targetUserData.nickname}'s ${type}`, 1);
const targetList = await getList(uid, type, follow);
} else {
console.error('Missing CSRF token. Retrieve your CSRF token from the Network tab in your inspector by clicking into the Network tab item named "bug-report-claims" and then scrolling down in the associated details window to where you see "x-csrf-token". Copy its value and store it into a variable named "csrf" which this function will reference when you execute it.');
}
} else {
console.error('You do not appear to be logged in. Please log in and try again.');
}
} else {
console.error('UID not passed. Pass the UID of the profile you are targeting to this function.');
}
}
Usage
To run the function (for either of the above solutions), just call the function name with the desired User ID name as an argument, in your example case, 41874362. The function call would look like this:
booyahGetAccounts(41874362);
The function is quite flexible in its abilities though. booyahGetAccounts() accepts three parameters, but only the first is required.
booyahGetAccounts(
uid, // required, no default
type = 'followers', // optional, must be 'followers' or 'followings' -> default: 'followers'
follow = 1 // optional, must be 0, 1, or -1, -> default: 1 (boolean true)
)
The second parameter, type, allows you to choose whether you would like to process the targeted user's followers or followings (the users which that user follows).
The third parameter allows you to choose whether you would like to follow/unfollow the returned users or only retrieve their User IDs. This defaults to 1 (boolean true) which will follow the users returned, but if you only want to test the function and not actually follow the returned users, set this to a falsy value such as 0 or false. Using -1 will unfollow the users returned.
This function intelligently retrieves your own User ID for you from the window.localStorage object, so you don't need to retrieve that yourself. If you would like to process your own followers or followings, simply pass 0 as the main uid parameter value, and the function will default the uid to your own User ID.
Because you can't re-follow users you already follow, if you try to follow your followings, the function will produce the warning You already follow your followings. 'follow' mode switched to 'false'. Followings will be retrieved instead of followed. and instead return them as if you had set the follow parameter to false.
However, it can be very useful to process your own list. For example, if you want to follow all of your own followers back, you could do so like this:
booyahGetAccounts(0); // `type` and `follow` parameters already default to the correct values here
On the other hand, if you were strategically using a follow/unfollow technique in order to increase your number of followers and needed to unfollow all of your followers, you could do so like this:
booyahGetAccounts(0, 'followers', -1);
By setting the follow parameter value to -1, you instruct the function to run its followUser function on all returned User IDs using the DELETE method instead of the POST method, thereby unfollowing those users returned instead of following them.
Desired outcome
Function call
Follow all your own followers
booyahGetAccounts(0, 'followers');
Unfollow all your own followers
booyahGetAccounts(0, 'followers', -1);
Unfollow all your own followings
booyahGetAccounts(0, 'followings', -1);
Follow users that follow User ID #12345
booyahGetAccounts(12345, 'followers');
Follow users followed by User ID #12345
booyahGetAccounts(12345, 'followings');
Retrieve User IDs of accounts following User ID #12345
booyahGetAccounts(12345, 'followers', 0);
Retrieve User IDs of accounts followed by User ID #12345
booyahGetAccounts(12345, 'followings', 0);
Other notes
To improve the performance of this function, as it's very heavy, I've replaced all calls to userIDs.length with a dedicated userIDsLength variable which I add to using += with each iteration rather than calling length each time. Similarly, I store the length of the stringified followerCount in the variable followerCountStrLength rather than calling followerCount.toString().length with each iteration. Because this is a rather heavy function, it is possible for your browser window to crash. However, it should eventually complete.
If the page appears to crash by flickering and auto-closing the console, FIRST try to re-open the console without refreshing the page at all. In my case, the inspector occasionally closed on its own, likely due to the exhaustion from the function, but when I opened the inspector's console again, the function was still running.
I want to use firebase's onSnapshot function sequentially. A situation where I want to apply this is given below.
Scenario:
There are 2 collections in firestore. Employees and Projects. In the Employees collection, the docs are storing the details of employees. And it also stores the IDs of Projects docs on which that particular employee is working. In Projects collection, the detail of projects is stored.
Goal:
First, I have to fetch the data from Employees collection related to a specific employee. Then, from the fetched employee data, I will have the project IDs on which he/she is working on. So, from that ID I need to fetch the project details. So, when any information related to project or employee changes, the data on screen should also change in real-time.
Issue:
I tried to write a nested code. But it works realtime only for employee data. It doesn't change when the project detail is updated. Something like this...
admin.auth().onAuthStateChanged(async () => {
if (check_field(admin.auth().currentUser)) {
await db.collection('Employees').doc(admin.auth().currentUser.uid).onSnapshot(snap => {
...
let project_details = new Promise(resolve => {
let projects = [];
for (let i in snap.data().projects_list) {
db.collection('Projects').doc(snap.data().projects_list[i]).onSnapshot(prj_snap => {
let obj = prj_snap.data();
obj['doc_id'] = prj_snap.id;
projects.push(obj);
});
}
resolve(projects);
});
Promise.all([project_details]).then(items => {
...
// UI updation
});
...
});
}
});
What is the correct way for doing this?
You're actually proposing a pretty complex dataflow scenario. I would approach this as a multi-step problem. Your goal is essentially:
If there is a user, listen in realtime for the list of project ids for that user.
For each project id, listen in realtime for details about that project.
(presumably) Clean up listeners that are no longer relevant.
So I would tackle it something like this:
let uid;
let employeeUnsub;
let projectIds = [];
let projectUnsubs = {};
let projectData = {};
const employeesRef = firebase.firestore().collection('Employees');
const projectsRef = firebase.firestore().collection('Projects');
firebase.auth().onAuthStateChanged(user => {
// if there is already a listener but the user signs out or changes, unsubscribe
if (employeeUnsub && (!user || user.uid !== uid)) {
employeeUnsub();
}
if (user) {
uid = user.uid;
// subscribe to the employee data and trigger a listener update on changes
employeeUnsub = employeesRef.doc(uid).onSnapshot(snap => {
projectIds = snap.get('projects_list');
updateProjectListeners();
});
}
});
function updateProjectListeners() {
// get a list of existing projects being listened already
let existingListeners = Object.keys(projectUnsubs);
for (const pid of existingListeners) {
// unsubscribe and remove the listener/data if no longer in the list
if (!projectIds.includes(pid)) {
projectUnsubs[pid]();
delete projectUnsubs[pid];
delete projectData[pid];
render();
}
}
for (const pid of projectIds) {
// if we're already listening, nothing to do so skip ahead
if (projectUnsubs[pid]) { continue; }
// subscribe to project data and trigger a render on change
projectUnsubs[pid] = projectsRef.doc(pid).onSnapshot(snap => {
projectData[pid] = snap.data);
render();
});
}
}
function render() {
const out = "<ul>\n";
for (const pid of projectIds) {
if (!projectData[pid]) {
out += `<li class="loading">Loading...</li>\n`;
} else {
const project = projectData[pid];
out += `<li>${project.name}</li>`;
}
}
out += "</ul>\n";
}
The above code does what you're talking about (and in this case the render() function just returns a string but you could do whatever you want to actually manipulate DOM / display data there).
It's a lengthy example, but you're talking about a pretty sophisticated concept of essentially joining realtime data dynamically as it changes. Hope this gives you some guidance on a way forward!
I am using TranscriptLoggerMiddleware and CosmosDB to log my chatbot transcripts. We are trying to capture the user state information (user name, account number, account type, etc) as top level attributes in the transcript so that specific customers can easily be queried in the DB (if that information is just in the individual timestamp attributes of the document, they can't be queried).
Ideally I would just add the user state when I'm building the file, but I can't figure any way to access it since the logger is defined in index.js and TranscriptLoggerMiddleware only provides the activity to my function, not the full context. If anyone has a way to get the user state data via TranscriptLoggerMiddleware, let me know, that would solve this issue. Here is the customLogger code. Note that due to the function receiving both the user query and bot response, I couldn't get retrieving and resaving the transcript to work, so I'm overwriting the transcript from a local log object. Not trying to come up with a new approach here but if one would solve the overall issue I'd like to hear it.
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.
const { CosmosDbPartitionedStorage } = require('botbuilder-azure');
const path = require('path');
/**
* CustomLogger, takes in an activity and saves it for the duration of the conversation, writing to an emulator compatible transcript file in the transcriptsPath folder.
*/
class CustomLogger {
/**
* Log an activity to the log file.
* #param activity Activity being logged.
*/
// Set up Cosmos Storage
constructor(appInsightsClient) {
this.transcriptStorage = new CosmosDbPartitionedStorage({
cosmosDbEndpoint: process.env.COSMOS_SERVICE_ENDPOINT,
authKey: process.env.COSMOS_AUTH_KEY,
databaseId: process.env.DATABASE,
containerId: 'bot-transcripts'
});
this.conversationLogger = {};
this.appInsightsClient = appInsightsClient;
this.msDelay = 250;
}
async logActivity(activity) {
if (!activity) {
throw new Error('Activity is required.');
}
// Log only if this is type message
if (activity.type === 'message') {
if (activity.attachments) {
try {
var logTextDb = `${activity.from.name}: ${activity.attachments[0].content.text}`;
} catch (err) {
var logTextDb = `${activity.from.name}: ${activity.text}`;
}
} else {
var logTextDb = `${activity.from.name}: ${activity.text}`;
}
if (activity.conversation) {
var id = activity.conversation.id;
if (id.indexOf('|') !== -1) {
id = activity.conversation.id.replace(/\|.*/, '');
}
// Get today's date for datestamp
var currentDate = new Date();
var day = currentDate.getDate();
var month = currentDate.getMonth()+1;
var year = currentDate.getFullYear();
var datestamp = year + '-' + month + '-' + day;
var fileName = `${datestamp}_${id}`;
var timestamp = Math.floor(Date.now()/1);
// CosmosDB logging (JK)
if (!(fileName in this.conversationLogger)) {
this.conversationLogger[fileName] = {};
this.conversationLogger[fileName]['userData'] = {};
this.conversationLogger[fileName]['botName'] = process.env.BOTNAME;
}
this.conversationLogger[fileName][timestamp] = logTextDb;
let updateObj = {
[fileName]:{
...this.conversationLogger[fileName]
}
}
// Add delay to ensure messages logged sequentially
await this.wait(this.msDelay);
try {
let result = await this.transcriptStorage.write(updateObj);
} catch(err) {
console.log(err);
this.appInsightsClient.trackTrace({message: `Logger Error ${err.code} - ${path.basename(__filename)}`,severity: 3,properties: {'botName': process.env.BOTNAME, 'error':err.body}});
}
}
}
}
async wait(milliseconds) {
var start = new Date().getTime();
for (var i = 0; i < 1e7; i++) {
if ((new Date().getTime() - start) > milliseconds) {
break;
}
}
}
}
exports.CustomLogger = CustomLogger;
Not being able to get user state in this function, I decided to try a few other approaches. The most promising was creating a separate "updateTranscript" function to grab the transcript, add user state, and save it back. But I think it was catching it only on user request and getting overidden again by local object on bot response. I added a delay to try to combat this, but it still didn't work. On my very first prompt of providing customer number user state data is getting stored on transcript, but at the next activity it is gone and never comes back (even though I can see it is supposedly getting written to DB). Here is that update function.
const { CosmosDbStorage } = require('botbuilder-azure');
var updateTranscript = async (context, userData, appInsightsClient) => {
const transcriptStorage = new CosmosDbStorage({
serviceEndpoint: process.env.COSMOS_SERVICE_ENDPOINT,
authKey: process.env.COSMOS_AUTH_KEY,
databaseId: process.env.DATABASE,
collectionId: 'bot-transcripts',
partitionKey: process.env.BOTNAME
});
var id = context.activity.conversation.id;
if (id.indexOf('|') !== -1) {
id = context.activity.conversation.id.replace(/\|.*/, '');
}
// Get today's date for datestamp
var currentDate = new Date();
var day = currentDate.getDate();
var month = currentDate.getMonth()+1;
var year = currentDate.getFullYear();
var datestamp = year + '-' + month + '-' + day;
var filename = `${datestamp}_${id}`;
var msDelay = 500;
await new Promise(resolve => setTimeout(resolve, msDelay));
var transcript = await transcriptStorage.read([filename]);
transcript[filename]['userData'] = userData
try {
await transcriptStorage.write(transcript);
console.log('User data added to transcript');
} catch(err) {
console.log(err);
appInsightsClient.trackTrace({message: `Log Updater Error ${err.code} - ${path.basename(__filename)}`,severity: 3,properties: {'botName': process.env.BOTNAME, 'error':err.body}});
}
return;
}
module.exports.updateTranscript = updateTranscript
I realize this approach is a bit of a cluster but I've been unable to find anything better. I know the Microsoft COVID-19 bot has a really nice transcript retrieval function, but I haven't been able to get any input from them on how that was accomplished. That aside, I'm quite happy to continue with this implementation if someone can help me figure out how to get that user state into the transcript without being overwritten or running into concurrency issues.
As to why I can't query an account number even via substring() function, here's an example of the documents data object. I have no idea which string to check for a substring, in this case 122809. I don't know what that timestamp could be. If this is stored at the top level (e.g. userData/accountNumber) I know exactly where to look for the value. For further context, I've displayed what I see after the first prompt for account number, where userData is populated. But it gets overidden on subsequent writes and I can't seem to get it back even with a delay in my updateTranscript function.
"document": {
"userData": {},
"botName": "AveryCreek_OEM_CSC_Bot_QA",
"1594745997562": "AveryCreek_OEM_CSC_Bot_QA: Hi! I'm the OEM CSC Support Bot! Before we get started, can you please provide me with your 6-digit Vista number? If you don't have one, just type \"Skip\".",
"1594746003973": "You: 122809",
"1594746004241": "AveryCreek_OEM_CSC_Bot_QA: Thank you. What can I help you with today? \r\nYou can say **Menu** for a list of common commands, **Help** for chatbot tips, or choose one of the frequent actions below. \r\n \r\n I'm still being tested, so please use our [Feedback Form](https://forms.office.com/Pages/ResponsePage.aspx?id=lVxS1ga5GkO5Jum1G6Q8xHnUJxcBMMdAqVUeyOmrhgBUNFI3VEhMU1laV1YwMUdFTkhYVzcwWk9DMiQlQCN0PWcu) to let us know how well I'm doing and how I can be improved!",
"1594746011384": "You: what is my account number?",
"1594746011652": "AveryCreek_OEM_CSC_Bot_QA: Here is the informaiton I have stored: \n \n**Account Number:** 122809 \n\n I will forget everything except your account number after the end of this conversation.",
"1594746011920": "AveryCreek_OEM_CSC_Bot_QA: I can clear your information if you don't want me to store it or if you want to reneter it. Would you like me to clear your information now?",
"1594746016034": "You: no",
"1594746016301": "AveryCreek_OEM_CSC_Bot_QA: OK, I won't clear your information. You can ask again at any time."
},
"document": {
"userData": {
"accountNumber": "122809"
},
"botName": "AveryCreek_OEM_CSC_Bot_QA",
"1594746019952": "AveryCreek_OEM_CSC_Bot_QA: Hi! I'm the OEM CSC Support Bot! What can I help you with today? \r\nYou can say **Menu** for a list of common commands, **Help** for chatbot tips, or choose one of the frequent actions below. \r\n \r\n I'm still being tested, so please use our [Feedback Form](https://forms.office.com/Pages/ResponsePage.aspx?id=lVxS1ga5GkO5Jum1G6Q8xHnUJxcBMMdAqVUeyOmrhgBUNFI3VEhMU1laV1YwMUdFTkhYVzcwWk9DMiQlQCN0PWcu) to let us know how well I'm doing and how I can be improved!"
},
You had said you were encountering concurrency issues even though JavaScript is single-threaded. As strange as that sounds, I think you're right on some level. TranscriptLoggerMiddleware does have its own buffer that it uses to store activities throughout the turn and then it tries to log all of them all at once. It could easily have provided a way to get that whole buffer in your own logger function, but instead it just loops through the buffer so that you still only get to log them each individually. Also, it allows logActivity to return a promise but it never awaits it, so each activity will get logged "simultaneously" (it's not really simultaneous but the code will likely jump between function calls before waiting for them to complete). This is a problem for any operation that isn't atomic, because you'll be modifying state without knowing about its latest modifications.
while (transcript.length > 0) {
try {
const activity: Activity = transcript.shift();
// If the implementation of this.logger.logActivity() is asynchronous, we don't
// await it as to not block processing of activities.
// Because TranscriptLogger.logActivity() returns void or Promise<void>, we capture
// the result and see if it is a Promise.
const logActivityResult = this.logger.logActivity(activity);
// If this.logger.logActivity() returns a Promise, a catch is added in case there
// is no innate error handling in the method. This catch prevents
// UnhandledPromiseRejectionWarnings from being thrown and prints the error to the
// console.
if (logActivityResult instanceof Promise) {
logActivityResult.catch(err => {
this.transcriptLoggerErrorHandler(err);
});
}
} catch (err) {
this.transcriptLoggerErrorHandler(err);
}
}
All in all, I don't think transcript logger middleware is the way to go here. While it may purport to serve your purposes, there are just too many problems with it. I would either write my own middleware or just put the middleware code directly in my bot logic like this:
async onTurn(turnContext) {
const activity = turnContext.activity;
await this.logActivity(turnContext, activity);
turnContext.onSendActivities(async (ctx, activities, next) => {
for (const activity of activities) {
await this.logActivity(ctx, activity);
}
return await next();
});
// Bot code here
// Save state changes
await this.userState.saveChanges(turnContext);
}
async logActivity(turnContext, activity) {
var transcript = await this.transcriptProperty.get(turnContext, []);
transcript.push(activity);
await this.transcriptProperty.set(turnContext, transcript);
console.log('Activities saved: ' + transcript.length);
}
Since your transcript would be stored in your user state, that user state would also have the account number you need and hopefully you'd be able to query for it.
Kyle's answer did help me solve the issue, and I think that will be the most reusable piece for anyone experiencing similar issues. The key takeaway is that, if you're using nodejs, you should not be using TranscriptLoggerMiddleware and instead use Kyle's function in your onTurn handler (repeated here for reference):
// Function provided by Kyle Delaney
async onTurn(turnContext) {
const activity = turnContext.activity;
await this.logActivity(turnContext, activity);
turnContext.onSendActivities(async (ctx, activities, next) => {
for (const activity of activities) {
await this.logActivity(ctx, activity);
}
return await next();
});
// Bot code here
// Save state changes
await this.userState.saveChanges(turnContext);
}
You need to note, though, that his logActivity function is just storing the raw activities to the user state using a custom transcriptProperty. As of yet I haven't found a good method to give business/admin users access to this data in a way that is easily readable and searchable, nor construct some sort of file out output to send to a customer requesting a transcript of their conversation. As such, I continued using my CustomLogger instead. Here is how I accomplished that.
First, you must create the transcriptLogger in the constructor. If you create it inside your turn handler, you will lose the cache/buffer and it will only have the latest activity instead of the full history. May be common sense but this tripped me up briefly. I do this in the constructor via this.transcriptLogger = new CustomerLogger(appInsightsClient);. I also modified my logActivity function to accept the userData (my state object) as a second, optional parameter. I have successfully been able to use that userData object to add the required customer information to the bot transcript. To modify Kyle's function above you just need to replace this.logActivity with your function call, in my case this.transcriptLogger.logActivity(context, userData);.
While there are still some other issues with this approach, it does solve the title question of how to get user state data into the transcript.