How to update data in BigQuery upon an onUpdate event - javascript

I'm importing data from Firebase into BigQuery, which works fine on the onWrite event using the table.insert function.
Now I want to update data in BigQuery on the onUpdate event, but a table.update function is not available and does not work. Please suggest some other way.
Below is my code:
exports.updatetobigquery =
  functions.database.ref('/mn_users/{userId}/').onUpdate(event => {
    const dataset = bigquery.dataset('KHUSHUApp'); // functions.config().bigquery.datasetname
    const table = dataset.table('mn_users'); // functions.config().bigquery.tablename
    console.log('Uppercasing', event.data.val());
    return table.update({
      'id': event.data.key,
      'name': event.data.val().name,
      'email': event.data.val().email
    });
  });

Best way to handle this:
BigQuery has the ability to UPDATE data, but BigQuery is an analytical database, not a database optimized for updates.
So avoid updating data in BigQuery if you have other means of achieving your goals.
Instead of updates, send new rows to BigQuery, then de-duplicate and merge the latest values when analyzing. This pattern is great when you need the ability to go back to each point where the state changed and analyze that; see the sketch below.
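As a minimal sketch of that append-only pattern, reusing the question's setup (the updated_at column is an assumption you would add to the table):

// Sketch: append a new row on every update instead of updating in place.
// Assumes mn_users has an extra updated_at TIMESTAMP column (hypothetical).
exports.appendtobigquery =
  functions.database.ref('/mn_users/{userId}/').onUpdate(event => {
    const table = bigquery.dataset('KHUSHUApp').table('mn_users');
    return table.insert({
      id: event.data.key,
      name: event.data.val().name,
      email: event.data.val().email,
      updated_at: new Date().toISOString()
    });
  });

When analyzing, keep only the latest row per id, for example with ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) in your query.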

Related

Angular: Improve Query Loading Time in Firebase Database

I have an Angular app where I am querying my Firebase database as below:
constructor() {
  this.getData();
}

getData() {
  this.projectSubscription$ = this.dataService.getAllProjects()
    .pipe(
      map((projects: any) =>
        projects.map(sc => ({ key: sc.key, ...sc.payload.val() }))
      ),
      switchMap(appUsers => this.dataService.getAllAppUsers()
        .pipe(
          map((admins: any) =>
            appUsers.map(proj => {
              const match: any = admins.find(admin => admin.key === proj.admin);
              return { ...proj, imgArr: this.mapObjectToArray(proj.images), adminUser: match.payload.val() };
            })
          )
        )
      )
    ).subscribe(res => {
      this.loadingState = false;
      this.projects = res.reverse();
    });
}

mapObjectToArray = (obj: any) => {
  const mappedDatas = [];
  for (const key in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      mappedDatas.push({ ...obj[key], id: key });
    }
  }
  return mappedDatas;
};
And here is what I am querying inside dataService:
getAllProjects() {
  return this.afDatabase.list('/projects/', ref => ref.orderByChild('createdAt')).snapshotChanges();
}

getAllAppUsers() {
  return this.afDatabase.list('/appUsers/', ref => ref.orderByChild('name')).snapshotChanges();
}
The problem I am facing is that I have 400 rows of data to load, and it takes around 30 seconds, which is insanely high. Any idea how I can make this query faster?
We have no way to know whether the 30s is reasonable, as that depends on the amount of data loaded, the connection latency and bandwidth of the client, and more factors we can't know/control.
But one thing to keep in mind is that you're performing a lookup for each of the 400 projects to find its admin user, which is likely not great for performance.
Things you could consider:
Pre-load all the users once, and then use that list for each project (see the sketch after this list).
Duplicate the name of each user into each project, so that you don't need to join any data at all.
If you come from a background in relational databases the latter may be counterintuitive, but it is actually very common in NoSQL data modeling and is one of the reasons NoSQL databases scale so well.
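As a minimal sketch of the first suggestion (assuming the same dataService methods and component fields as in the question), you could load both lists once and join them in memory:

import { combineLatest } from 'rxjs';
import { map } from 'rxjs/operators';

getData() {
  this.projectSubscription$ = combineLatest([
    this.dataService.getAllProjects(),
    this.dataService.getAllAppUsers()
  ]).pipe(
    map(([projects, admins]) => {
      // Build a key -> user map once, instead of calling find() per project.
      const adminsByKey = new Map(admins.map(a => [a.key, a.payload.val()]));
      return projects
        .map(sc => ({ key: sc.key, ...sc.payload.val() }))
        .map(proj => ({
          ...proj,
          imgArr: this.mapObjectToArray(proj.images),
          adminUser: adminsByKey.get(proj.admin)
        }));
    })
  ).subscribe(res => {
    this.loadingState = false;
    this.projects = res.reverse();
  });
}

This still downloads both lists once each, but it replaces the per-project find() scan with a single map lookup.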
I propose 3 solutions.
1. Pagination
Instead of returning all those documents on app load, limit them to just 10 (or any arbitrary base number) and keep a record of the last one, then display those 10.
Then build the UI so that when the user clicks next, or scrolls, you fetch the next set based on the previous last document's field info.
I'm supposing you need to display all the fetched data in some table or list, so having the UI paginate the data should make sense.
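A minimal sketch with AngularFire, assuming you paginate on the createdAt field already used in the question:

// Sketch: first page of 10 projects, then pages keyed on the last createdAt.
getFirstPage() {
  return this.afDatabase.list('/projects/',
    ref => ref.orderByChild('createdAt').limitToFirst(10)
  ).snapshotChanges();
}

getNextPage(lastCreatedAt) {
  // startAt() is inclusive, so the first item repeats the previous page's
  // last item and should be skipped client-side.
  return this.afDatabase.list('/projects/',
    ref => ref.orderByChild('createdAt').startAt(lastCreatedAt).limitToFirst(11)
  ).snapshotChanges();
}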
2. Loader
Show some loader UI while the site loads. Then, when all the documents have been fetched, hide the loader and show the data as you want. You can build something custom for the loader, choose from any of the abundant libraries out there, or use mat-progress-spinner from Angular Material.
3. onCall Cloud Function
What if you try getting them through an onCall Cloud Function? It might be faster, because it's just one request that the app makes, and Firebase's Cloud Functions are very fast within Google's data centers.
The user's network might be slow iterating over the documents, but the Cloud Function gathers them all server-side and returns them in a single response, which might give you what you want.
I guess you could go for this option only if you really, really need to display all that data at once on website load.
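A minimal sketch of such a function (the function name and data path are assumptions):

// Sketch: onCall function that reads all projects server-side and
// returns them in one response (hypothetical getAllProjects endpoint).
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.getAllProjects = functions.https.onCall(async (data, context) => {
  const snapshot = await admin.database()
    .ref('/projects/')
    .orderByChild('createdAt')
    .once('value');
  return snapshot.val();
});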
Note on cost
Fetching 400 or more documents every time a given website loads might be expensive if the website is visited very frequently by many users, since Firebase charges per document read too.
Check to see if you could optimise the data structure to avoid fetching this much.
This doesn't apply to you if this is some admin dashboard, or if fetching all users like this is done rarely, in which case the cost stays low.

Firebase Realtime Database difference between data snapshots

If I have a database structure like here and I make a query as shown below, is there a difference in the traffic used to download the snapshot from the database if I access each node with snapshot.forEach(function(childSnapshot)) versus if I don't access the nodes?
If there is no difference, is there a way to access only the keys in Chats without getting the snapshot data of what each key contains? I'm assuming this would generate less downloaded data.
var requests = db.ref("Chats");
requests.on('child_added', function(snapshot) {
  var communicationId = snapshot.key;
  console.log("Chat id = " + communicationId);
  getMessageInfo(
    communicationId,
    function() {
      snapshot.ref.remove();
    }
  );
});
When you call requests.on('child_added', ...), you are always going to access all of the data at the requests node. It doesn't matter what you do in the callback function: the entire node is loaded into memory, and the cost of the query is paid. What you do with the snapshot in memory doesn't cost anything else.
If you don't want all of the child nodes under requests, you should find some way to filter the query for only the children you need.
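For example, a minimal sketch of a filtered query, assuming each chat has a status child you can filter on (that field is an assumption):

// Sketch: only children whose status equals "pending" are downloaded.
var requests = db.ref("Chats").orderByChild("status").equalTo("pending");
requests.on('child_added', function(snapshot) {
  console.log("Chat id = " + snapshot.key);
});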
As mentioned in the documentation, either of these methods can be used:
(1) Call a method to get the data.
(2) Set a listener to receive data-change events.
Traffic depends upon your usage. When your data need not be updated in realtime, you can just call a method to get the data (1). But if you want your data to be updated in realtime, then you should go for (2). When you set a listener, Firebase sends your listener an initial snapshot of the data, and then another snapshot each time the data changes.
(1) - Example:
firebase.database().ref('/users/').once('value') // called a single time
(2) - Example:
firebase.database().ref('/users/').on('child_added') // called on every update
Also, I think you cannot get only the keys, because when you reference a child and retrieve its data, Firebase sends it as key-value pairs (a DataSnapshot).
Further Reference: https://firebase.google.com/docs/reference/js/firebase.database.DataSnapshot
https://firebase.google.com/docs/database/web/read-and-write

Difference between jsforce bulk api options

I'm using jsforce to access Salesforce using the bulk API. It has two ways of updating and deleting records. One is the normal bulk API, which means creating a job and batches:
var job = conn.bulk.createJob("Account", "delete");
var batch = job.createBatch();
var accounts = getAccountsByDate(jsforce.Date.TODAY);
batch.execute(accounts);
batch.on('response', function(rets) {
  // do things
});
The other way is to use the "query" interface, like this:
conn.sobject('Account')
  .find({ CreatedDate: jsforce.Date.TODAY })
  .destroy(function(err, rets) {
    // do things
  });
The second way certainly seems easier, but I can't get it to update or delete more than 10,000 records at a time, which appears to be a Salesforce API limit on batch size. Note that using the maxFetch property from jsforce appears to have no effect in this case.
So is it safe to assume that the query-style interface only creates a single batch? The jsforce documentation is not clear on this point.
Currently the bulk.load() method in the JSforce bulk API generates a job with one batch, so the limit of 10,000 records per batch applies. This is also true when using the find-and-destroy interface, which uses bulk.load() internally.
To avoid this limit you can create a job with bulk.createJob() and create several batches with job.createBatch(), then distribute the records to delete across these batches so that no batch exceeds the limit, as in the sketch below.
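A minimal sketch of that chunking approach (the batch size matches the 10,000-record limit above; error handling and job cleanup are omitted):

// Sketch: split the records across batches of at most 10,000 each.
var job = conn.bulk.createJob("Account", "delete");
var records = getAccountsByDate(jsforce.Date.TODAY); // from the question
var BATCH_SIZE = 10000;

for (var i = 0; i < records.length; i += BATCH_SIZE) {
  var batch = job.createBatch();
  batch.execute(records.slice(i, i + BATCH_SIZE));
  batch.on('response', function(rets) {
    // do things with this batch's results
  });
}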

How to get data from multiple URLs in ReactJS?

I want to get JSON data from multiple URLs and display it on the frontend.
The URLs are:
1) localhost:3000/api/getdata1
2) localhost:3000/api/getdata2
3) localhost:3000/api/getdata3
Instead of calling fetch() on each of the URLs like below:
fetch('localhost:3000/api/getdata1')
fetch('localhost:3000/api/getdata2')
fetch('localhost:3000/api/getdata3')
Can this be done in a more efficient way in ReactJS?
I was trying:
const dataurls = [
  'localhost:3000/api/getdata1',
  'localhost:3000/api/getdata2',
  'localhost:3000/api/getdata3'
];
const httpGet = url => fetch(url).then(res => res.json());
const promisedurls = dataurls.map(httpGet);
Promise.all(promisedurls)
  .then(data => {
    for (const d of data) {
      console.log(d);
    }
  })
  .catch(reason => {
    // Receives first rejection among the Promises
  });
Please suggest which one should be used, or whether there is a more efficient way to get data from multiple URLs.
ReactJS is a view-layer library. It has nothing to do with how you acquire data from the server.
Even state libraries like Redux and Reflux do not implement any method of fetching data. In most cases you do that in your custom app code, sometimes using extra libraries (e.g. Redux middlewares).
So, yes: your Promise.all(<several HTTP requests here>) is the most natural way to achieve that.
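For completeness, a minimal sketch of that pattern inside a React function component (the component name and rendering are illustrative assumptions; the URLs are from the question and would need a protocol prefix in practice):

// Sketch: fetch all URLs in parallel once on mount (hypothetical component).
import React, { useEffect, useState } from 'react';

const dataurls = [
  'localhost:3000/api/getdata1',
  'localhost:3000/api/getdata2',
  'localhost:3000/api/getdata3'
];

function Dashboard() {
  const [data, setData] = useState([]);

  useEffect(() => {
    Promise.all(dataurls.map(url => fetch(url).then(res => res.json())))
      .then(setData)
      .catch(err => console.error(err));
  }, []);

  return <pre>{JSON.stringify(data, null, 2)}</pre>;
}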

Firebase get all values of specific key

I have a users table on Firebase and each user has an email prop.
The structure looks like:
Users -> User UID (looks like n8haBbjgablobA2ranfuabu3aaaga2af) -> user object, which includes the email prop.
I'd like to get an array of all the users' emails (~1m).
How can I do this most efficiently?
P.S.: I tried:
usersRef.startAt(0).endAt(20).once("value", function(snapshot) {
  console.log('FIRST 20');
  console.log(snapshot.val()); // null
});
But that fails.
Probably the most efficient approach in terms of data reads would be to denormalize your data. You could store the email addresses both in the individual user nodes and in an emailAddresses node, then query the emailAddresses node directly for your list of emails; a sketch of keeping the two in sync is below.
Still, ~1m email address nodes would probably be too much to fetch all at once. I'd probably grab them in chunks... I'm guessing.
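For illustration, a minimal sketch of keeping the two locations in sync with a single multi-path update (node names follow the structure in the question and the suggestion above):

// Sketch: write the email to the user node and the emailAddresses
// index atomically, so the denormalized copy never drifts.
function saveUserEmail(uid, email) {
  return firebase.database().ref().update({
    ['/Users/' + uid + '/email']: email,
    ['/emailAddresses/' + uid]: email
  });
}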
Update
"Grabbing in chunks" is essentially pagination. I would try to use something off the shelf before trying to roll my own pagination solution.
Pagination libraries to check out:
Firebase Utils Pagination: this is developed by Firebase, but they say it is experimental and not ready for production use. It's probably still worth experimenting with, though.
firebase-paginator: this is developed by a community member and it seems pretty solid.
If you want to roll your own pagination, check out:
#kato's response in this StackOverflow answer: he makes an interesting point about the potential problems with paginating a realtime data set, but then provides some good starter code.
Here's a good blog entry that talks about the code that I think is part of the firebase-paginator library linked above.
Everybody in their answers said it was an easy thing, yet nobody had a working solution. Here's what I came up with:
usersRef.orderByChild('uid').limitToFirst(100).once('value', function (snapshot) {
  var users = snapshot.val();
  var uids = Object.keys(users);
  var lastUid = uids[uids.length - 1];
  // Could be a property other than uid, e.g. email or username.
  // Note: the last ID from the previous chunk will be duplicated.
  usersRef.orderByChild('uid').startAt(lastUid).limitToFirst(100).once('value', function (snapshot) {
    var users = snapshot.val();
    var uids = Object.keys(users);
    console.log(uids);
    var lastUid = uids[uids.length - 1];
    // re-run until done
  });
});
Since this is a one-time deal, an option is to simply iterate over each node in the parent 'data' node, capture the child data, strip out the email address, and dump it to a file.
The event you want is:
child_added: retrieve lists of items or listen for additions to a list of items. This event is triggered once for each existing child and then again every time a new child is added to the specified path. The listener is passed a snapshot containing the new child's data.
And the code to iterate all of the child nodes in the data node is:
var dataRef = firebase.database().ref('myRootRef/data');
dataRef.on('child_added', function(data) {
  // data.val() will contain the child data, such as the email address.
  // Append it to a text file here (for example), save it to disk, etc.
});
The key here is that this event is triggered once for each child, similar to iterating over all of the indexes in an array. It will retrieve each child and present it to your app one child at a time, iterating over all the children within the node.
It's going to take a while to chew through that many nodes.
