DevTools Google Chrome:
On this site (https://booyah.live/users/41874362/followers), to load the complete list of followers it is necessary to keep scrolling down the page to reload more profiles, but there comes a time when the page weighs so much that the browser crashes and it ends up needing to be closed.
Is there any way to be able to collect the follow buttons without this happening?
The current script I use is:
setInterval(function(){
document.getElementById("layout-content").scrollTo(0, 50000000000000000000000000000000000000);
document.querySelectorAll('.components-button.components-button-size-mini.components-button-type-orange.desktop.components-button-inline').forEach(btn => btn.click());
}, 10)
I use setInterval to create a loop of:
1 - Scrolling the page
2 - Loading more profiles
3 - Clicking the follow buttons
My need:
For a study I'm doing as a learning exercise, the idea is that my profile follows every follower of a single, very popular profile, so that I can analyze how many people follow back on this social network.
Additional:
In this answer provided by Leftium, it is possible to follow only one profile:
https://stackoverflow.com/a/67882688/11462274
In this answer given by KCGD, it is possible to collect the entire list of followers but during this collection the profiles are not followed, it is possible to create a list and save the data, but not follow the profiles:
https://stackoverflow.com/a/67865968/11462274
I tried to contact them both, but they haven't replied yet. Both answers are a good start, but I couldn't combine them so that I can follow all the profiles. My idea was to collect the profiles as in KCGD's answer and, while collecting, follow each of them (not only the first one) using the request from Leftium's answer.
Would it be possible to take advantage of the loop in KCGD's answer and, for each response, follow all the returned profiles instead of just the first one as in Leftium's answer?
I tried to write this myself but was unsuccessful.
The browser crashes because too much memory is used. As you scroll down the page, the HTML DOM tree is extended and more avatar images are downloaded. These HTML and image resources are not necessary for your goal.
It is possible to avoid crashing by calling the (internal) Booyah API directly. This will be much faster and consume less resources since only the text is transferred. There are two API endpoints of interest:
GET /api/v3/users/[USERID]/followers?cursor=0&count=100
Gets list of followers following a certain user:
[USERID] is the ID of the user being studied (WEEDZAO's id).
cursor is where in the list of followers to start listing. When the page first loads, this is 0. As you scroll down, the following API calls increment this (101, 201, 301...)
count is how many results to return.
Since this is a GET call, you can open this URL in your browser.
POST /api/v3/users/[USERID]/followings
Follows a user (same as clicking their 'Follow' button).
Here [USERID] is ID of the user whose follower list will be updated (your own ID).
A payload must be sent that looks like this: {followee_uid: ID, source: 43}. I'm not sure what source is.
Also a CSRF header must be included.
Because this is a POST type call, it is not possible to open this type of URL directly in your browser.
DELETE /api/v3/users/[USERID]/followings
There is also an API to unfollow a user. (Just for reference).
If you call these APIs from outside the browser, you will probably need to send session cookies.
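As a rough illustration of the cursor-based paging described above, the sketch below walks a follower list page by page. It assumes each response carries a follower_list array and a cursor value for the next page; the cursor response field and the fetchPage helper are assumptions made for this example, not confirmed parts of the API.

```javascript
// Sketch only: collect follower IDs page by page.
// `fetchPage(cursor, count)` is a hypothetical helper supplied by the caller.
// In the browser console it could wrap a fetch() of
//   /api/v3/users/[USERID]/followers?cursor=...&count=...
// and return the parsed JSON.
async function collectFollowerIds(fetchPage, count = 100, maxPages = 1000) {
  const ids = [];
  let cursor = 0;
  for (let page = 0; page < maxPages; page++) {
    const data = await fetchPage(cursor, count);
    const list = data.follower_list || [];
    ids.push(...list.map((follower) => follower.uid));
    // A short (or empty) page means the end of the list was reached.
    if (list.length < count) break;
    // Assumption: the response reports where the next page starts.
    cursor = data.cursor;
  }
  return ids;
}
```

Since only JSON is requested, no DOM nodes or avatar images pile up while the list is collected. The maxPages cap is there so that a test run can stop early.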
This script will list WEEDZAO's first 10 followers, then follow the first one from the list:
You must replace USERID and CSRF_TOKEN with your own values.
You can copy/paste this code into the browser dev console.
Alternatively, you can use this code from a web scraping framework like Puppeteer.
// Find these values in dev console "Network" tab:
var CSRF_TOKEN, USERID, USERID_TARGET, main;
USERID_TARGET = '41874362';
USERID = '12345678';
CSRF_TOKEN = 'MTYy...p8wg';
main = async function() {
var body, followers, json, options, payload, response, url;
// Get list of first 10 followers
console.log(url = `/api/v3/users/${USERID_TARGET}/followers?cursor=0&count=10`);
response = (await fetch(url));
json = (await response.json());
followers = json.follower_list;
console.table(followers);
// Follow first member from list above
console.log(url = `/api/v3/users/${USERID}/followings`);
payload = JSON.stringify({
followee_uid: followers[0].uid,
source: 43
});
response = (await fetch(url, options = {
method: 'POST',
body: payload,
headers: {
'X-CSRF-Token': CSRF_TOKEN
}
}));
body = (await response.text());
return console.log(body);
};
main();
It crashes because the interval is too fast
setInterval(function(){}, 10)
You are trying to call a scroll-and-click function every 10 milliseconds (that's 100 function calls per second). This also interferes with the server, which fetches new users as you scroll.
Your script could work if you adjust the interval to at least 1000 milliseconds (1 second). Of course, it may take a while, but it will work. You should also expect the page to become laggy, especially once it has loaded a large number of users, because virtual scrolling is not implemented on this page.
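A minimal sketch of the slower loop suggested above, written with a pause between iterations instead of setInterval so that one step finishes before the next begins. The names runThrottled and step are hypothetical; in the page console, step would scroll the container and click the follow buttons.

```javascript
// Sketch only: run `step` repeatedly with a pause between runs.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runThrottled(step, delayMs, maxRuns) {
  let runs = 0;
  // Stop after `maxRuns` steps, or as soon as `step` returns false.
  while (runs < maxRuns && (await step()) !== false) {
    runs++;
    await sleep(delayMs);
  }
  return runs;
}

// Hypothetical browser usage (one scroll-and-click pass per second, 500 passes at most):
// runThrottled(() => {
//   document.getElementById('layout-content').scrollTo(0, Number.MAX_SAFE_INTEGER);
//   document.querySelectorAll('.components-button-type-orange').forEach((btn) => btn.click());
// }, 1000, 500);
```

The upper bound on runs is a design choice for the example: it keeps an unattended loop from growing the page indefinitely.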
Even if you slow down the scrolling, the page still really bogs down the browser. The solution may lie in the API the page contacts. To get the user's followers, it contacts the site's V3 API
https://booyah.live/api/v3/users/41874362/followers?cursor=[LAST USER IN API RETURN]&count=100
to get all the users that would show up in the page. I wrote a script that contacts the API over and over again to get all the follower data. Just run it in the page's console and call print() when you want to export the data, then copy/paste the output into a .json file.
//WARNING: THIS SCRIPT USES RECURSION, i have no clue how long the followers list goes so use at your own risk
var followers = []; //data collected from api
function getFollowers(cursor){
httpGet(`https://booyah.live/api/v3/users/41874362/followers?cursor=${cursor}&count=100`, function (data) { //returns data from API for given cursor (user at the end of last follower chunk)
console.log("got cursor: "+cursor);
var _followChunk = JSON.parse(String(data));
console.log(_followChunk)
if (!_followChunk.follower_list || _followChunk.follower_list.length === 0) return; //stop once the API returns no more followers
followers.push(_followChunk.follower_list); //saves followers from chunk
var last_user = _followChunk.follower_list[_followChunk.follower_list.length - 1]; //gets last user of chunk (cursor for the next chunk)
setTimeout(function(){ //1 second timeout so that the API doesn't return "too many requests"; not necessary, but you should probably leave this on
getFollowers(last_user.uid); //get next chunk
},1000)
})
}
var print = function(){console.log(JSON.stringify(followers))};
getFollowers(0); //get initial set of followers (cursor 0)
function httpGet(theUrl, callback) {
var xmlHttp = new XMLHttpRequest();
xmlHttp.open("GET", theUrl, false); // false for synchronous request
xmlHttp.setRequestHeader("Cache-Control", "no-store");
xmlHttp.send(null);
callback(xmlHttp.responseText);
};
If you really only need the button elements, then the only way is to scroll all the way down each time it loads new followers, as the page creates the elements as you scroll down.
This is a fully working solution that I have tested in my own Chrome browser with a fresh account, successfully following all the follower accounts of the account you are targeting.
UPDATE (2021-06-18)
I've updated my solution to a drastically improved and faster function, rewritten with async/await. This new function reduces the estimated runtime from ~45min to ~10min. 10min is still a long while, but that's to be expected considering the large number of followers the user you are targeting has.
After a few iterations, the latest function not only improves speed, performance, and error reporting, but also extends what is possible with the function. Below my solution, I provide several examples of how to use the function to its full extent.
For the sake of de-cluttering my answer, I am removing my older function from this solution altogether, but you can still reference it in my solution's edit history if you like.
TL;DR
Here is the final, fastest, working solution. Make sure to replace PUT_YOUR_CSRF_TOKEN_HERE with your own CSRF token value. Detailed instructions on how to find your CSRF token are below.
You must run this in your console on the Booyah website in order to avoid CORS issues.
const csrf = 'PUT_YOUR_CSRF_TOKEN_HERE';
async function booyahGetAccounts(uid, type = 'followers', follow = 1) {
if (typeof uid !== 'undefined' && !isNaN(uid)) {
const loggedInUserID = window.localStorage?.loggedUID;
if (uid === 0) uid = loggedInUserID;
const unfollow = follow === -1;
if (unfollow) follow = 1;
if (loggedInUserID) {
if (csrf) {
async function getUserData(uid) {
const response = await fetch(`https://booyah.live/api/v3/users/${uid}`),
data = await response.json();
return data.user;
}
const loggedInUserData = await getUserData(loggedInUserID),
targetUserData = await getUserData(uid),
followUser = uid => fetch(`https://booyah.live/api/v3/users/${loggedInUserID}/followings`, { method: (unfollow ? 'DELETE' : 'POST'), headers: { 'X-CSRF-Token': csrf }, body: JSON.stringify({ followee_uid: uid, source: 43 }) }),
logSep = (data = '', usePad = 0) => typeof data === 'string' && usePad ? console.log((data ? data + ' ' : '').padEnd(50, '━')) : console.log('━'.repeat(50),data,'━'.repeat(50));
async function getList(uid, type, follow) {
const isLoggedInUser = uid === loggedInUserID;
if (isLoggedInUser && follow && !unfollow && type === 'followings') {
follow = 0;
console.warn('You already follow your followings. `follow` mode switched to `false`. Followings will be retrieved instead of followed.');
}
const userData = await getUserData(uid),
totalCount = userData[type.slice(0,-1)+'_count'] || 0,
totalCountStrLength = totalCount.toString().length;
if (totalCount) {
let userIDsLength = 0;
const userIDs = [],
nickname = userData.nickname,
nicknameStr = `${nickname ? ` of ${nickname}'s ${type}` : ''}`,
alreadyFollowedStr = uid => `User ID ${uid} already followed by ${loggedInUserData.nickname} (Account #${loggedInUserID})`;
async function followerFetch(cursor = 0) {
const fetched = [];
await fetch(`https://booyah.live/api/v3/users/${uid}/${type}?cursor=${cursor}&count=100`).then(res => res.json()).then(async data => {
const list = data[type.slice(0,-1)+'_list'];
if (list?.length) fetched.push(...list.map(e => e.uid));
if (fetched.length) {
userIDs.push(...fetched);
userIDsLength += fetched.length;
// Follow (or unfollow) each account in this batch, not the targeted account itself
if (follow) fetched.forEach(fetchedUID => followUser(fetchedUID));
console.log(`${userIDsLength.toString().padStart(totalCountStrLength)} (${(userIDsLength / totalCount * 100).toFixed(4)}%)${nicknameStr} ${follow ? 'followed' : 'retrieved'}`);
if (fetched.length === 100) {
// Await the next batch so that getList() only returns once every batch has been processed
await followerFetch(data.cursor);
} else {
console.log(`END REACHED. ${userIDsLength} accounts ${follow ? 'followed' : 'retrieved'}.`);
// Log the collected IDs directly; `targetList` is not assigned until getList() has returned
if (!follow) logSep(userIDs);
}
}
});
}
await followerFetch();
return userIDs;
} else {
console.log(`This account has no ${type}.`);
}
}
logSep(`${follow ? 'Following' : 'Retrieving'} ${targetUserData.nickname}'s ${type}`, 1);
const targetList = await getList(uid, type, follow);
} else {
console.error('Missing CSRF token. Retrieve your CSRF token from the Network tab in your inspector by clicking into the Network tab item named "bug-report-claims" and then scrolling down in the associated details window to where you see "x-csrf-token". Copy its value and store it into a variable named "csrf" which this function will reference when you execute it.');
}
} else {
console.error('You do not appear to be logged in. Please log in and try again.');
}
} else {
console.error('UID not passed. Pass the UID of the profile you are targeting to this function.');
}
}
booyahGetAccounts(41874362);
Detailed explanation of the process
As the function runs, it logs the progress to the console, both how many users have been followed so far, and how much progress has been made percentage-wise, based on the total number of followers the profile you are targeting has.
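The progress line described above can be sketched in isolation. formatProgress is a hypothetical helper written for this illustration (the function itself builds the string inline); it pads the running count to the width of the total and prints the percentage with four decimals.

```javascript
// Sketch only: format a progress line such as " 50 (25.0000%) followed".
function formatProgress(done, total, verb = 'followed') {
  const width = total.toString().length;            // width of the largest count
  const percent = ((done / total) * 100).toFixed(4); // four decimal places
  return `${done.toString().padStart(width)} (${percent}%) ${verb}`;
}
```

Padding the count keeps the log lines aligned as the numbers grow, which makes a long run easier to scan.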
Retrieving your CSRF token
The only manual portion of this process is retrieving your CSRF token. This is rather simple though. Once you log into Booyah, navigate to the Network tab of your Chrome DevTools and click on the item named bug-report-claims, then scroll all the way down the details window which appears on the right. There you should see x-csrf-token. Store this value as a string variable in your console as csrf, which my function will reference when it runs. This is necessary in order to use the POST method to follow users.
The solution
The function will loop through all users who follow the account you are targeting (or, with type set to 'followings', all users that account follows) in batches of 100 (the max amount allowed per GET request) and follow them all. When the end of each batch is reached, the next batch is automatically triggered recursively.
🚀 Version 3 (Fastest and most flexible, using async/await and fetch())
My previous two solution versions (🐇 …🐢) can be referenced in this answer's edit history.
Make sure to replace PUT_YOUR_CSRF_TOKEN_HERE with your own CSRF token value. Detailed instructions on how to find your CSRF token are below.
You must run this in your console on the Booyah website in order to avoid CORS issues.
const csrf = 'PUT_YOUR_CSRF_TOKEN_HERE';
async function booyahGetAccounts(uid, type = 'followers', follow = 1) {
if (typeof uid !== 'undefined' && !isNaN(uid)) {
const loggedInUserID = window.localStorage?.loggedUID;
if (uid === 0) uid = loggedInUserID;
const unfollow = follow === -1;
if (unfollow) follow = 1;
if (loggedInUserID) {
if (csrf) {
async function getUserData(uid) {
const response = await fetch(`https://booyah.live/api/v3/users/${uid}`),
data = await response.json();
return data.user;
}
const loggedInUserData = await getUserData(loggedInUserID),
targetUserData = await getUserData(uid),
followUser = uid => fetch(`https://booyah.live/api/v3/users/${loggedInUserID}/followings`, { method: (unfollow ? 'DELETE' : 'POST'), headers: { 'X-CSRF-Token': csrf }, body: JSON.stringify({ followee_uid: uid, source: 43 }) }),
logSep = (data = '', usePad = 0) => typeof data === 'string' && usePad ? console.log((data ? data + ' ' : '').padEnd(50, '━')) : console.log('━'.repeat(50),data,'━'.repeat(50));
async function getList(uid, type, follow) {
const isLoggedInUser = uid === loggedInUserID;
if (isLoggedInUser && follow && !unfollow && type === 'followings') {
follow = 0;
console.warn('You already follow your followings. `follow` mode switched to `false`. Followings will be retrieved instead of followed.');
}
const userData = await getUserData(uid),
totalCount = userData[type.slice(0,-1)+'_count'] || 0,
totalCountStrLength = totalCount.toString().length;
if (totalCount) {
let userIDsLength = 0;
const userIDs = [],
nickname = userData.nickname,
nicknameStr = `${nickname ? ` of ${nickname}'s ${type}` : ''}`,
alreadyFollowedStr = uid => `User ID ${uid} already followed by ${loggedInUserData.nickname} (Account #${loggedInUserID})`;
async function followerFetch(cursor = 0) {
const fetched = [];
await fetch(`https://booyah.live/api/v3/users/${uid}/${type}?cursor=${cursor}&count=100`).then(res => res.json()).then(async data => {
const list = data[type.slice(0,-1)+'_list'];
if (list?.length) fetched.push(...list.map(e => e.uid));
if (fetched.length) {
userIDs.push(...fetched);
userIDsLength += fetched.length;
// Follow (or unfollow) each account in this batch, not the targeted account itself
if (follow) fetched.forEach(fetchedUID => followUser(fetchedUID));
console.log(`${userIDsLength.toString().padStart(totalCountStrLength)} (${(userIDsLength / totalCount * 100).toFixed(4)}%)${nicknameStr} ${follow ? 'followed' : 'retrieved'}`);
if (fetched.length === 100) {
// Await the next batch so that getList() only returns once every batch has been processed
await followerFetch(data.cursor);
} else {
console.log(`END REACHED. ${userIDsLength} accounts ${follow ? 'followed' : 'retrieved'}.`);
// Log the collected IDs directly; `targetList` is not assigned until getList() has returned
if (!follow) logSep(userIDs);
}
}
});
}
await followerFetch();
return userIDs;
} else {
console.log(`This account has no ${type}.`);
}
}
logSep(`${follow ? 'Following' : 'Retrieving'} ${targetUserData.nickname}'s ${type}`, 1);
const targetList = await getList(uid, type, follow);
} else {
console.error('Missing CSRF token. Retrieve your CSRF token from the Network tab in your inspector by clicking into the Network tab item named "bug-report-claims" and then scrolling down in the associated details window to where you see "x-csrf-token". Copy its value and store it into a variable named "csrf" which this function will reference when you execute it.');
}
} else {
console.error('You do not appear to be logged in. Please log in and try again.');
}
} else {
console.error('UID not passed. Pass the UID of the profile you are targeting to this function.');
}
}
Usage
To run the function (for either of the above solutions), just call the function name with the desired User ID name as an argument, in your example case, 41874362. The function call would look like this:
booyahGetAccounts(41874362);
The function is quite flexible in its abilities though. booyahGetAccounts() accepts three parameters, but only the first is required.
booyahGetAccounts(
uid, // required, no default
type = 'followers', // optional, must be 'followers' or 'followings' -> default: 'followers'
follow = 1 // optional, must be 0, 1, or -1, -> default: 1 (boolean true)
)
The second parameter, type, allows you to choose whether you would like to process the targeted user's followers or followings (the users which that user follows).
The third parameter allows you to choose whether you would like to follow/unfollow the returned users or only retrieve their User IDs. This defaults to 1 (boolean true) which will follow the users returned, but if you only want to test the function and not actually follow the returned users, set this to a falsy value such as 0 or false. Using -1 will unfollow the users returned.
This function intelligently retrieves your own User ID for you from the window.localStorage object, so you don't need to retrieve that yourself. If you would like to process your own followers or followings, simply pass 0 as the main uid parameter value, and the function will default the uid to your own User ID.
Because you can't re-follow users you already follow, if you try to follow your followings, the function will produce the warning You already follow your followings. 'follow' mode switched to 'false'. Followings will be retrieved instead of followed. and instead return them as if you had set the follow parameter to false.
However, it can be very useful to process your own list. For example, if you want to follow all of your own followers back, you could do so like this:
booyahGetAccounts(0); // `type` and `follow` parameters already default to the correct values here
On the other hand, if you were strategically using a follow/unfollow technique in order to increase your number of followers and needed to unfollow all of your followers, you could do so like this:
booyahGetAccounts(0, 'followers', -1);
By setting the follow parameter value to -1, you instruct the function to run its followUser function on all returned User IDs using the DELETE method instead of the POST method, thereby unfollowing those users returned instead of following them.
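To make the three values of the follow parameter concrete, here is a small sketch of the mapping described above. resolveFollowMode is a hypothetical helper, not part of the function; it only mirrors the rule that -1 means DELETE, any other truthy value means POST, and a falsy value means no request at all.

```javascript
// Sketch only: map the `follow` argument to the action it triggers.
function resolveFollowMode(follow = 1) {
  if (follow === -1) return { action: 'unfollow', method: 'DELETE' };
  if (follow) return { action: 'follow', method: 'POST' };
  return { action: 'retrieve', method: null }; // IDs are only collected
}
```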
| Desired outcome | Function call |
|---|---|
| Follow all your own followers | booyahGetAccounts(0, 'followers'); |
| Unfollow all your own followers | booyahGetAccounts(0, 'followers', -1); |
| Unfollow all your own followings | booyahGetAccounts(0, 'followings', -1); |
| Follow users that follow User ID #12345 | booyahGetAccounts(12345, 'followers'); |
| Follow users followed by User ID #12345 | booyahGetAccounts(12345, 'followings'); |
| Retrieve User IDs of accounts following User ID #12345 | booyahGetAccounts(12345, 'followers', 0); |
| Retrieve User IDs of accounts followed by User ID #12345 | booyahGetAccounts(12345, 'followings', 0); |
Other notes
To improve the performance of this function, as it's very heavy, I've replaced all calls to userIDs.length with a dedicated userIDsLength variable which I add to using += with each iteration rather than calling length each time. Similarly, I store the length of the stringified totalCount in the variable totalCountStrLength rather than calling totalCount.toString().length with each iteration. Because this is a rather heavy function, it is possible for your browser window to crash. However, it should eventually complete.
If the page appears to crash by flickering and auto-closing the console, FIRST try to re-open the console without refreshing the page at all. In my case, the inspector occasionally closed on its own, likely due to the exhaustion from the function, but when I opened the inspector's console again, the function was still running.
I am new at using Firebase so all advice is welcome.
What am I trying to achieve?
I want to create a player only if the user has not exceeded their team size ("numberOfplayersLimit").
So currently I am using a Firebase transaction which first checks that the team has not exceeded its limit ("numberOfplayersLimit"). If it has not, the transaction increments the "numberOfplayersLimit" counter and then adds the player to the database, as shown below.
What's my issue?
I am currently using .push() to add the players; however, it is creating the player twice. In the database, the full name is the same in both records, but they have different uids.
var myUserId = firebase.auth().currentUser.uid;
const playerData = {
fullName:this.state.fullName,
};
//This is where the players are stored
const teamplayersref = firebase.database().ref('/teams').child(myUserId).child('/players')
//Transaction - Team reference path for the TeamPlayers Limit
const getTeamPlayersLimit = firebase.database().ref('/teams').child(myUserId).child('numberOfplayersLimit');
getTeamPlayersLimit.transaction(function(numberOfplayersLimit){
if (numberOfplayersLimit == 11) {
alert('You have exceed your team size limit, Delete a player from your team or contact us to upgrade your package');
}
else
{
//increment teamplayers limit
numberOfplayersLimit = numberOfplayersLimit + 1;
teamplayersref.push(playerData);
return numberOfplayersLimit;
}
});
The teamplayersref.push(playerData) call in your transaction handler is not part of the transaction itself. So if the transaction is retried, you end up calling teamplayersref.push(playerData) multiple times, creating a child node for each try.
To generate a new child with a unique push ID, call push without an argument to get a new key, and then use that key in the return value of your transaction. This means that your transaction will have to run on the entire team node, firebase.database().ref('/teams').child(myUserId), since you're modifying both the counter and the players.
const teamRef = firebase.database().ref('/teams').child(myUserId);
teamRef.transaction(function(team){
team = team || { numberOfplayersLimit: 0, players: {} };
if (team.numberOfplayersLimit == 11) {
console.error('You have exceed your team size limit, Delete a player from your team or contact us to upgrade your package');
}
else {
team.numberOfplayersLimit = team.numberOfplayersLimit + 1;
const newPlayerKey = teamRef.push().key; // this line does not write to the database
team.players = team.players || {}; // an existing team may not have a players node yet
team.players[newPlayerKey] = playerData;
return team;
}
});
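The retry behaviour described above is easier to see with the update logic pulled out into a pure function. The sketch below is only an illustration: addPlayer and its limit parameter are hypothetical names, and the function does no database access. It returns the new team value, or undefined to signal that the transaction should be aborted.

```javascript
// Sketch only: a side-effect-free update function suitable for a transaction handler.
function addPlayer(team, newPlayerKey, playerData, limit = 11) {
  const current = team || { numberOfplayersLimit: 0, players: {} };
  // Returning undefined aborts the transaction, so nothing is written.
  if (current.numberOfplayersLimit >= limit) return undefined;
  return {
    ...current,
    numberOfplayersLimit: current.numberOfplayersLimit + 1,
    players: { ...(current.players || {}), [newPlayerKey]: playerData },
  };
}

// Hypothetical usage inside the handler shown above:
// const key = teamRef.push().key;
// teamRef.transaction((team) => addPlayer(team, key, playerData));
```

Because the function has no side effects, running it again on a retry with the same key yields the same single player entry instead of a second one.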
I am using TranscriptLoggerMiddleware and CosmosDB to log my chatbot transcripts. We are trying to capture the user state information (user name, account number, account type, etc) as top level attributes in the transcript so that specific customers can easily be queried in the DB (if that information is just in the individual timestamp attributes of the document, they can't be queried).
Ideally I would just add the user state when I'm building the file, but I can't figure any way to access it since the logger is defined in index.js and TranscriptLoggerMiddleware only provides the activity to my function, not the full context. If anyone has a way to get the user state data via TranscriptLoggerMiddleware, let me know, that would solve this issue. Here is the customLogger code. Note that due to the function receiving both the user query and bot response, I couldn't get retrieving and resaving the transcript to work, so I'm overwriting the transcript from a local log object. Not trying to come up with a new approach here but if one would solve the overall issue I'd like to hear it.
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.
const { CosmosDbPartitionedStorage } = require('botbuilder-azure');
const path = require('path');
/**
* CustomLogger, takes in an activity and saves it for the duration of the conversation, writing to an emulator compatible transcript file in the transcriptsPath folder.
*/
class CustomLogger {
/**
* Log an activity to the log file.
* @param activity Activity being logged.
*/
// Set up Cosmos Storage
constructor(appInsightsClient) {
this.transcriptStorage = new CosmosDbPartitionedStorage({
cosmosDbEndpoint: process.env.COSMOS_SERVICE_ENDPOINT,
authKey: process.env.COSMOS_AUTH_KEY,
databaseId: process.env.DATABASE,
containerId: 'bot-transcripts'
});
this.conversationLogger = {};
this.appInsightsClient = appInsightsClient;
this.msDelay = 250;
}
async logActivity(activity) {
if (!activity) {
throw new Error('Activity is required.');
}
// Log only if this is type message
if (activity.type === 'message') {
if (activity.attachments) {
try {
var logTextDb = `${activity.from.name}: ${activity.attachments[0].content.text}`;
} catch (err) {
var logTextDb = `${activity.from.name}: ${activity.text}`;
}
} else {
var logTextDb = `${activity.from.name}: ${activity.text}`;
}
if (activity.conversation) {
var id = activity.conversation.id;
if (id.indexOf('|') !== -1) {
id = activity.conversation.id.replace(/\|.*/, '');
}
// Get today's date for datestamp
var currentDate = new Date();
var day = currentDate.getDate();
var month = currentDate.getMonth()+1;
var year = currentDate.getFullYear();
var datestamp = year + '-' + month + '-' + day;
var fileName = `${datestamp}_${id}`;
var timestamp = Math.floor(Date.now()/1);
// CosmosDB logging (JK)
if (!(fileName in this.conversationLogger)) {
this.conversationLogger[fileName] = {};
this.conversationLogger[fileName]['userData'] = {};
this.conversationLogger[fileName]['botName'] = process.env.BOTNAME;
}
this.conversationLogger[fileName][timestamp] = logTextDb;
let updateObj = {
[fileName]:{
...this.conversationLogger[fileName]
}
}
// Add delay to ensure messages logged sequentially
await this.wait(this.msDelay);
try {
let result = await this.transcriptStorage.write(updateObj);
} catch(err) {
console.log(err);
this.appInsightsClient.trackTrace({message: `Logger Error ${err.code} - ${path.basename(__filename)}`,severity: 3,properties: {'botName': process.env.BOTNAME, 'error':err.body}});
}
}
}
}
async wait(milliseconds) {
var start = new Date().getTime();
for (var i = 0; i < 1e7; i++) {
if ((new Date().getTime() - start) > milliseconds) {
break;
}
}
}
}
exports.CustomLogger = CustomLogger;
Not being able to get user state in this function, I decided to try a few other approaches. The most promising was creating a separate "updateTranscript" function to grab the transcript, add user state, and save it back. But I think it was catching it only on the user request and then getting overridden again by the local object on the bot response. I added a delay to try to combat this, but it still didn't work. On my very first prompt, where the customer number is provided, the user state data does get stored on the transcript, but at the next activity it is gone and never comes back (even though I can see it is supposedly getting written to the DB). Here is that update function.
const { CosmosDbStorage } = require('botbuilder-azure');
const path = require('path'); // needed for path.basename() in the error handler below
var updateTranscript = async (context, userData, appInsightsClient) => {
const transcriptStorage = new CosmosDbStorage({
serviceEndpoint: process.env.COSMOS_SERVICE_ENDPOINT,
authKey: process.env.COSMOS_AUTH_KEY,
databaseId: process.env.DATABASE,
collectionId: 'bot-transcripts',
partitionKey: process.env.BOTNAME
});
var id = context.activity.conversation.id;
if (id.indexOf('|') !== -1) {
id = context.activity.conversation.id.replace(/\|.*/, '');
}
// Get today's date for datestamp
var currentDate = new Date();
var day = currentDate.getDate();
var month = currentDate.getMonth()+1;
var year = currentDate.getFullYear();
var datestamp = year + '-' + month + '-' + day;
var filename = `${datestamp}_${id}`;
var msDelay = 500;
await new Promise(resolve => setTimeout(resolve, msDelay));
var transcript = await transcriptStorage.read([filename]);
transcript[filename]['userData'] = userData
try {
await transcriptStorage.write(transcript);
console.log('User data added to transcript');
} catch(err) {
console.log(err);
appInsightsClient.trackTrace({message: `Log Updater Error ${err.code} - ${path.basename(__filename)}`,severity: 3,properties: {'botName': process.env.BOTNAME, 'error':err.body}});
}
return;
}
module.exports.updateTranscript = updateTranscript
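One way to picture the overwrite described above: the logger's local object always carries an empty userData, so writing it back wholesale wipes whatever updateTranscript stored. The sketch below shows a merge that would preserve it. mergeTranscript is a hypothetical helper, not part of botbuilder, and it assumes both arguments are plain transcript objects shaped like the documents in this question.

```javascript
// Sketch only: combine the stored transcript with the logger's local copy
// without letting an empty local `userData` erase stored values.
function mergeTranscript(stored, local) {
  const storedUserData = (stored && stored.userData) || {};
  const localUserData = (local && local.userData) || {};
  return {
    ...stored,
    ...local, // newer message entries from the local log win
    userData: { ...storedUserData, ...localUserData },
  };
}
```

This only addresses what gets written. It does not by itself remove the race between two writers that read the document at the same moment.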
I realize this approach is a bit of a cluster but I've been unable to find anything better. I know the Microsoft COVID-19 bot has a really nice transcript retrieval function, but I haven't been able to get any input from them on how that was accomplished. That aside, I'm quite happy to continue with this implementation if someone can help me figure out how to get that user state into the transcript without being overwritten or running into concurrency issues.
As to why I can't query an account number even via the substring() function, here's an example of the document's data object. I have no idea which string to check for a substring (in this case 122809), because I don't know what the timestamp key will be. If this is stored at the top level (e.g. userData/accountNumber), I know exactly where to look for the value. For further context, I've also displayed what I see after the first prompt for account number, where userData is populated. But it gets overridden on subsequent writes, and I can't seem to get it back even with a delay in my updateTranscript function.
"document": {
"userData": {},
"botName": "AveryCreek_OEM_CSC_Bot_QA",
"1594745997562": "AveryCreek_OEM_CSC_Bot_QA: Hi! I'm the OEM CSC Support Bot! Before we get started, can you please provide me with your 6-digit Vista number? If you don't have one, just type \"Skip\".",
"1594746003973": "You: 122809",
"1594746004241": "AveryCreek_OEM_CSC_Bot_QA: Thank you. What can I help you with today? \r\nYou can say **Menu** for a list of common commands, **Help** for chatbot tips, or choose one of the frequent actions below. \r\n \r\n I'm still being tested, so please use our [Feedback Form](https://forms.office.com/Pages/ResponsePage.aspx?id=lVxS1ga5GkO5Jum1G6Q8xHnUJxcBMMdAqVUeyOmrhgBUNFI3VEhMU1laV1YwMUdFTkhYVzcwWk9DMiQlQCN0PWcu) to let us know how well I'm doing and how I can be improved!",
"1594746011384": "You: what is my account number?",
"1594746011652": "AveryCreek_OEM_CSC_Bot_QA: Here is the informaiton I have stored: \n \n**Account Number:** 122809 \n\n I will forget everything except your account number after the end of this conversation.",
"1594746011920": "AveryCreek_OEM_CSC_Bot_QA: I can clear your information if you don't want me to store it or if you want to reneter it. Would you like me to clear your information now?",
"1594746016034": "You: no",
"1594746016301": "AveryCreek_OEM_CSC_Bot_QA: OK, I won't clear your information. You can ask again at any time."
},
"document": {
"userData": {
"accountNumber": "122809"
},
"botName": "AveryCreek_OEM_CSC_Bot_QA",
"1594746019952": "AveryCreek_OEM_CSC_Bot_QA: Hi! I'm the OEM CSC Support Bot! What can I help you with today? \r\nYou can say **Menu** for a list of common commands, **Help** for chatbot tips, or choose one of the frequent actions below. \r\n \r\n I'm still being tested, so please use our [Feedback Form](https://forms.office.com/Pages/ResponsePage.aspx?id=lVxS1ga5GkO5Jum1G6Q8xHnUJxcBMMdAqVUeyOmrhgBUNFI3VEhMU1laV1YwMUdFTkhYVzcwWk9DMiQlQCN0PWcu) to let us know how well I'm doing and how I can be improved!"
},
You had said you were encountering concurrency issues even though JavaScript is single-threaded. As strange as that sounds, I think you're right on some level. TranscriptLoggerMiddleware does have its own buffer that it uses to store activities throughout the turn, and then it tries to log all of them at once. It could easily have provided a way to get that whole buffer in your own logger function, but instead it just loops through the buffer so that you still only get to log each activity individually. Also, it allows logActivity to return a promise but never awaits it, so the activities get logged "simultaneously" (not truly simultaneously, but the code will likely jump between function calls before any of them completes). This is a problem for any operation that isn't atomic, because you'll be modifying state without knowing about its latest modifications.
while (transcript.length > 0) {
try {
const activity: Activity = transcript.shift();
// If the implementation of this.logger.logActivity() is asynchronous, we don't
// await it as to not block processing of activities.
// Because TranscriptLogger.logActivity() returns void or Promise<void>, we capture
// the result and see if it is a Promise.
const logActivityResult = this.logger.logActivity(activity);
// If this.logger.logActivity() returns a Promise, a catch is added in case there
// is no innate error handling in the method. This catch prevents
// UnhandledPromiseRejectionWarnings from being thrown and prints the error to the
// console.
if (logActivityResult instanceof Promise) {
logActivityResult.catch(err => {
this.transcriptLoggerErrorHandler(err);
});
}
} catch (err) {
this.transcriptLoggerErrorHandler(err);
}
}
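To make the lost-update problem concrete, here is a minimal sketch that does not use the Bot Framework at all. The names createStore, logActivity, logWithoutAwaiting and logSequentially are made up for this illustration, and the in-memory store with artificial latency is only an assumed stand-in for a real transcript store. It compares firing every log call without awaiting it against awaiting each call in turn.

```javascript
// Hypothetical in-memory store with artificial latency; a stand-in for a real
// transcript store, not part of any SDK.
function createStore() {
  let doc = { lines: [] };
  const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  return {
    async read() {
      await delay(5);
      return { lines: [...doc.lines] }; // return a copy, like a fetched document
    },
    async write(next) {
      await delay(5);
      doc = next; // overwrite the whole document
    },
    peek() {
      return doc;
    },
  };
}

// Non-atomic read-modify-write: load the document, append a line, save it back.
async function logActivity(store, text) {
  const current = await store.read();
  current.lines.push(text);
  await store.write(current);
}

// Start every call before any has finished, as happens when the returned
// promises are not awaited. Promise.all is used here only so the result can
// be inspected once everything has settled.
async function logWithoutAwaiting(store, items) {
  await Promise.all(items.map((text) => logActivity(store, text)));
  return store.peek().lines;
}

// Await each call before starting the next one.
async function logSequentially(store, items) {
  for (const text of items) {
    await logActivity(store, text);
  }
  return store.peek().lines;
}
```

In the first variant every call reads the document before any write has landed, so each write replaces the others and only one line survives. In the second variant all lines are kept in order.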
All in all, I don't think transcript logger middleware is the way to go here. While it may purport to serve your purposes, there are just too many problems with it. I would either write my own middleware or just put the middleware code directly in my bot logic like this:
async onTurn(turnContext) {
const activity = turnContext.activity;
await this.logActivity(turnContext, activity);
turnContext.onSendActivities(async (ctx, activities, next) => {
for (const activity of activities) {
await this.logActivity(ctx, activity);
}
return await next();
});
// Bot code here
// Save state changes
await this.userState.saveChanges(turnContext);
}
async logActivity(turnContext, activity) {
var transcript = await this.transcriptProperty.get(turnContext, []);
transcript.push(activity);
await this.transcriptProperty.set(turnContext, transcript);
console.log('Activities saved: ' + transcript.length);
}
Since your transcript would be stored in your user state, that user state would also have the account number you need and hopefully you'd be able to query for it.
Kyle's answer did help me solve the issue, and I think that will be the most reusable piece for anyone experiencing similar issues. The key takeaway is that, if you're using nodejs, you should not be using TranscriptLoggerMiddleware and instead use Kyle's function in your onTurn handler (repeated here for reference):
// Function provided by Kyle Delaney
async onTurn(turnContext) {
const activity = turnContext.activity;
await this.logActivity(turnContext, activity);
turnContext.onSendActivities(async (ctx, activities, next) => {
for (const activity of activities) {
await this.logActivity(ctx, activity);
}
return await next();
});
// Bot code here
// Save state changes
await this.userState.saveChanges(turnContext);
}
Note, though, that his logActivity function just stores the raw activities in the user state using a custom transcriptProperty. As of yet I haven't found a good way to give business/admin users access to this data in a form that is easily readable and searchable, nor to construct some sort of file output to send to a customer requesting a transcript of their conversation. As such, I continued using my CustomLogger instead. Here is how I accomplished that.
First, you must create the transcriptLogger in the constructor. If you create it inside your turn handler, you will lose the cache/buffer and it will only have the latest activity instead of the full history. May be common sense but this tripped me up briefly. I do this in the constructor via this.transcriptLogger = new CustomerLogger(appInsightsClient);. I also modified my logActivity function to accept the userData (my state object) as a second, optional parameter. I have successfully been able to use that userData object to add the required customer information to the bot transcript. To modify Kyle's function above you just need to replace this.logActivity with your function call, in my case this.transcriptLogger.logActivity(context, userData);.
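As a rough illustration of why the logger has to be created once, here is a framework-free sketch. SketchLogger, SketchBot and BrokenBot are hypothetical names invented for this example, and the logger body is only an assumption about what a buffering logger might do. It is not my actual CustomLogger, and its logActivity takes the activity directly rather than a turn context.

```javascript
// Hypothetical logger that keeps a buffer of everything logged so far.
class SketchLogger {
  constructor() {
    this.buffer = [];
  }

  // userData is an optional second parameter, mirroring the approach above.
  logActivity(activity, userData = {}) {
    this.buffer.push({
      text: activity.text,
      accountNumber: userData.accountNumber, // customer info taken from state
    });
    return this.buffer.length; // number of entries in the transcript so far
  }
}

// The logger is created once in the constructor, so the buffer survives turns.
class SketchBot {
  constructor() {
    this.transcriptLogger = new SketchLogger();
  }

  onTurn(activity, userData) {
    return this.transcriptLogger.logActivity(activity, userData);
  }
}

// Creating the logger inside the turn handler gives every turn an empty
// buffer, so only the latest activity is ever present.
class BrokenBot {
  onTurn(activity, userData) {
    const logger = new SketchLogger();
    return logger.logActivity(activity, userData);
  }
}
```

Calling onTurn twice on a SketchBot leaves two entries in the buffer, while BrokenBot only ever sees one.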
While there are still some other issues with this approach, it does solve the title question of how to get user state data into the transcript.
I have a data like this:
"customers": {
"aHh4OTQ2NTlAa2xvYXAuY29t": {
"customerId": "xxx",
"name": "yyy",
"subscription": "zzz"
}
}
I need to retrieve a customer by customerId. The parent key is just the Base64-encoded email address, due to path limitations. Usually I query data by this email address, but on a few occasions I know only the customerId. I've tried this:
getCustomersRef()
.orderByChild('customerId')
.equalTo(customerId)
.limitToFirst(1)
.once('child_added', cb);
This works nicely when the customer really exists. Otherwise, the callback is never called.
I tried the value event, which works, but that gives me the whole tree starting with the encoded email address, so I cannot reach the actual data inside. Or can I?
I have found this answer, "Test if a data exist in Firebase", but that again assumes that I know all path elements.
getCustomersRef().once('value', (snapshot) => {
snapshot.hasChild(`customerId/${customerId}`);
});
What else can I do here?
Update
I think I found a solution, but it doesn't feel right.
let found = null;
snapshot.forEach((childSnapshot) => {
found = childSnapshot.val();
});
return found;
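For completeness, here is a sketch of how the query and the forEach above might fit together. It assumes customersRef is a Realtime Database reference like the one returned by getCustomersRef(), and the function name findCustomerById is made up for this example. With the value event the promise resolves even when nothing matches, so the "not found" case can be detected with snapshot.exists().

```javascript
// Sketch: look up one customer by customerId and resolve to null if none exists.
// customersRef is assumed to be a Realtime Database reference to "customers".
async function findCustomerById(customersRef, customerId) {
  const snapshot = await customersRef
    .orderByChild('customerId')
    .equalTo(customerId)
    .limitToFirst(1)
    .once('value');

  // Unlike 'child_added', 'value' fires even for an empty result set.
  if (!snapshot.exists()) {
    return null;
  }

  let found = null;
  snapshot.forEach((childSnapshot) => {
    // childSnapshot.key is the encoded email; val() is the customer object.
    found = { key: childSnapshot.key, ...childSnapshot.val() };
    return true; // returning true stops the enumeration after the first child
  });
  return found;
}
```

The forEach is still needed because the result is keyed by the unknown encoded email, but with limitToFirst(1) it visits at most one child.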
Old (I misunderstood the question):
If you know the "endcodedB64Email", this is the way:
var endcodedB64Email = B64_encoded_mail_address;
firebase.database().ref(`customers/${endcodedB64Email}`).once("value").then(snapshot => {
// this is getting your customerId/uid. Remember to set your rules up for your database for security! Check out tutorials on YouTube/Firebase's channel.
var uid = snapshot.val().customerId;
console.log(uid) // would return 'xxx' from looking at your database
// you want to check with '.hasChild()'? If you type in e.g. 'snapshot.hasChild(`customerId`)' then this would return true, because 'customerId' exists in your database if I am not wrong ...
});
UPDATE (correction):
We have to know at least one key. So if, under some circumstances, you only know the customer-uid-key, then I would do it like this:
// this is the customer-uid-key that is known.
var uid = firebase.auth().currentUser.uid; // this fetches the user-id, referring to the current user logged in with the firebase-login-function
// this is the "B64EmailKey" that we will find if there is a match in the firebase-database
var B64EmailUserKey = undefined;
// "take a picture" of all the values under the key "customers" in the Firebase database-JSON-object
firebase.database().ref("customers").once("value").then(snapshot => {
  // run a loop on all children under "customers". "B64EmailKey" is the child snapshot currently being visited
  snapshot.forEach(B64EmailKey => {
    // this variable holds the value (an object in this case)
    var B64EmailKey_value = B64EmailKey.val();
    // if there is a match for "customerId" under any of the "B64EmailKey"-keys, then we have found the email key linked to that uid
    if (B64EmailKey_value.customerId === uid) {
      // save the key itself (the B64-encoded email), not the customerId
      B64EmailUserKey = B64EmailKey.key;
      // returning true from the callback stops the enumeration
      return true;
    }
  });
  // every child under "customers" was checked and no "B64EmailUserKey" linked to the "uid" was found
  if (B64EmailUserKey === undefined) {
    return console.log("Could not find an email linked to your account.")
  }
  return B64UserKeyAction(B64EmailUserKey)
});
// run your corresponding actions here
function B64UserKeyAction (emailEncrypted) {
  return console.log(`The email-key for user: ${firebase.auth().currentUser.uid} is ${emailEncrypted}`)
}
I recommend putting this in a function or class, so you can easily call it up and reuse the code in an organized way.
I also want to add that the rules for your Firebase database must be defined to make everything secure. And if sensitive data must be calculated (e.g. a price), then do this on the server side of Firebase! Use Cloud Functions. This is new for Firebase 2017.
So I'm using node.js and the module instagram-node-lib to download metadata for Instagram posts. I have a couple of hashtags that I want to search for, and I want to download all existing posts (handling request failure during pagination) as well as monitor all new posts.
I have managed to crack the first part: downloading all existing posts and handling failure. (I noticed that sometimes the Instagram API would just fail on me, so I've added redundancy to remember the last successful page I downloaded and attempt again from that point.) For anyone who is interested, here is my code. Note that I use Postgres to save the posts, and I've abbreviated/obfuscated some of the code for ease of reading and for commercial purposes. Apologies for the length of the code, but I think this will come in useful to someone:
var db = new (require('./postgres'))
,api = require("instagram-node-lib")
;
var HASHTAGS = ["fluffy", "kittens"] //this is just an example!
,CLIENT_ID = "YOUR_CLIENT_ID"
,CLIENT_SECRET = "YOUR_CLIENT_SECRET"
,HOST = "https://api.instagram.com"
,PORT = 443
,PATH = "/v1/media/popular?client_id=" + CLIENT_ID
;
var hashtagIndex = 0
,settings
;
/**
* Initialise the module for use
*/
exports.initialise = function(){
api.set("client_id", CLIENT_ID);
api.set("client_secret", CLIENT_SECRET);
if( !settings){
settings = {
hashtags: []
}
for( var i in HASHTAGS){
settings.hashtags[i] = {
name: HASHTAGS[i],
maxTagId: null,
minTagId: null,
nextMaxTagId: null,
}
}
}
// console.log(settings);
db.initialiseSettings(); //I haven't included the code for this - basically just loads settings from the database, overwriting the defaults above if they exist, otherwise it creates them using the above object. I store the settings as a JSON object in the DB and parse them on load
execute();
}
function execute(){
var params = {
name: HASHTAGS[hashtagIndex],
complete: function(data, pagination){
var hashtag = settings.hashtags[hashtagIndex];
//from scratch
if( !hashtag.maxTagId){
console.log('Downloading old posts from scratch');
getOldPosts();
}
//still loading old (previously failed)
else if( hashtag.nextMaxTagId){
console.log('Downloading old posts from last saved position');
getOldPosts(hashtag.nextMaxTagId);
}
//new posts only
else {
console.log('Downloading new posts only');
getNewPosts(hashtag.minTagId);
}
},
error: function(msg, obj, caller){
apiError(msg, obj, caller);
}
}
api.tags.info(params);
}
function getOldPosts(maxTagId){
console.log();
var params = {
name: HASHTAGS[hashtagIndex],
count: 100,
max_tag_id: maxTagId || undefined,
complete: function(data, pagination){
console.log(pagination);
var hashtag = settings.hashtags[hashtagIndex];
//reached the end
if( pagination.next_max_tag_id == hashtag.maxTagId){
console.log('Downloaded all posts for #' + HASHTAGS[hashtagIndex]);
hashtag.nextMaxTagId = null; //reset nextMaxTagId - that way next time we execute the script we know to just look for new posts
saveSettings(function(){
next();
}); //Another function I haven't included - just saves the settings object, overwriting what is in the database. Once saved, executes the next() function
}
else {
//from scratch
if( !hashtag.maxTagId){
//these values will be saved once all posts in this batch have been saved. We set these only once, meaning that we have a baseline to compare to - enabling us to determine if we have reached the end of pagination
hashtag.maxTagId = pagination.next_max_tag_id;
hashtag.minTagId = pagination.min_tag_id;
}
//if there is a failure then we know where to start from - this is only saved to the database once the posts are successfully saved to database
hashtag.nextMaxTagId = pagination.next_max_tag_id;
//again, another function not included. saves the posts to database, then updates the settings. Once they have completed we get the next page of data
db.savePosts(data, function(){
saveSettings(function(){
getOldPosts(hashtag.nextMaxTagId);
});
});
}
},
error: function(msg, obj, caller){
apiError(msg, obj, caller);
//keep calm and try again - this is our failure redundancy
execute();
}
}
var posts = api.tags.recent(params);
}
/**
* Still to be completed!
*/
function getNewPosts(minTagId){
}
function next(){
if( hashtagIndex < HASHTAGS.length - 1){
console.log("Moving onto the next hashtag...");
hashtagIndex++;
execute();
}
else {
console.log("All hashtags processed...");
}
}
OK, so here is my dilemma about solving the next piece of the puzzle: downloading new posts (in other words, only those posts that have come into existence since I last downloaded all the posts). Should I use Instagram subscriptions, or is there a way to implement paging similar to what I've already used? I'm worried that if I use the former and my server goes down for a period of time, I will miss out on some posts. I'm worried that if I use the latter, it might not be possible to page through the records, because I don't know whether the Instagram API is set up to enable forward paging rather than backward paging.
I've attempted to post questions in the Google Instagram API Developers Group a couple of times and none of my messages seem to be appearing in the forum, so I thought I'd resort to trusty Stack Overflow.