I would like a suggestion.
I'm developing a web app using react-js as frontend and firebase as backend.
For sake of semplicity, say the aim of the web-app is to allow the user to upload some items which are then displayed on the homepage
An item consists of two string: 1) name of item 2) url pointed to an image stored on IPFS
Once the user submit a form, the frontend creates a new doc in a collection in firestore:
await addDoc(collectionRef, newDoc)
Then, in the homepage, I display all the items in the collection
const querySnapshot = await getDocs(collectionRef);
querySnapshot.forEach((doc) => {
...
});
This works; however, if I've understood properly, every time the homepage is refreshed the frontend makes N read calls (where N is the number of docs in the collection). Therefore, I'm wondering if its the right approach, is there a better way?
I really appreaciate any suggestion (also regarding potential major safety flaws in this setting)
Every time the homepage is refreshed the frontend makes N read calls
(where N is the number of docs in the collection).
With const querySnapshot = await getDocs(collectionRef); this is indeed the case: all the docs in the collection are fetched.
If your collection contains a lot of documents and is fetched frequently this will cost you money and, at one moment, the performance of you app may be degraded (duration of the Firestore query execution and lot of docs transferred from the back-end to the front-end).
One approach to avoid that is to use pagination and fetch only the first X documents and display a link to fetch the X next ones, etc.
There is a page in the Firestore documentation dedicated to pagination implementation.
You could also implement an infinite scroll mechanism instead of a link, to load the next set of X document continuously as the user scrolls down the page. That's just a UI variation and is based on the exact same approach detailed in the Firestore documentation.
Also regarding potential major safety flaws in this setting
Fetching all the docs of a collection or using pagination does not make any difference in terms of security. In any case you should implement security according to your requirements by using security rules (and potentially Firebase Auth as well). And don't forget that security rules are not filters.
Related
I am currently fetching data from a 3rd party api in a React application.
The way the api is structured, the flow goes like this:
// Get a list of data
response = GET /reviews
while response.data.nextPageUrl is not null
response = GET ${response.data.nextPageUrl}
I currently have this logic inside of a "fetch method"
const fetchMyData = () => {
// The logic above
}
const query = useQuery(['myKey'], () => fetchMyData, {...})
This works, but comes with some drawbacks.
There is no way to see the progress of the individual requests (to be used in a progress bar)
It cannot leverage React Query's caching functionality on an individual request.
For example: If there are 10 requests to be made, but during the 5th request the user refreshes the page, they will need to restart from the beginning.
I am aware of useQueries but since I do not know the query urls until I have received the first request I do not see how I could use it.
Is there a better approach or way to satisfy the 2 issues above?
Thank you in advance
What you need is Infinite Queries. useInfiniteQuery works similar to useQuery but it's made for fetching by chunks. useInfiniteQuery has extra parameters like getNextPageParam which you will use to tell React Query how to determine the next page. This hook also returns fetchNextPage, hasNextPage (among others), which will allow you to implement patterns like a "show more" button that appears for as long as there is a next page.
By using infinite queries you still leverage on React Query caching. Check the docs for more info.
So in my web application, I am fetching data from firebase, it will have about 5000 documents by the time it's ready. So to reduce the number of reads, I keep an array in firebase with a list of ids of each documents, which I read and then compare with the list I have saved locally. If an id is missing, I read that particular doc from firebase (Firebase website definitely slowed down with having an array with 3000 docs so far).
However, consider the situation where I have 2000+ docs missing and I have to fetch them one by one according to missingList I created from comparing the firebase and local arrays. It is heck a lot of slow, it takes so long to fetch even a hundred documents because I am using await. If I don't, then firebase is overloaded with requests and for some reason it shows a warning that my net is disconnected.
What is the best way to fetch documents with ids given in an array, with no loss and reasonably fast enough? Is there a way for me to batch read documents like there is batch write, update and set?
Say Firebase has [1234, 1235, 1236, 1237] and I only have [1234] so I need to read [1235, 1236, 1237] from it.
I am using a for loop to iterate through each id and get the corresponding document like so
for (let i = 0; i < missingList.length; i++) {
var item = await db.collection('myCollection').doc(missingList[i]).get().then((snapshot) => snapshot.data())
await saveToDB(item) //I use PouchDB for local storage
}
The closest thing to a 'batched read' in Firestore is a transaction, but using a transation to read hundreds of document is not really efficient. Instead, I suggest you add one field to every document, like token. Then, every time you get data for the first time, generate a random token client-side, then write it to every document that you read. After that, if you detect a change in the list of ids you mention above, run a query on that token field, like :
db.collection('myCollection').where('token', '!=', locallySavedToken).get()
Remember to write your local token back to the read documents. This way, your query time will be much faster. The down side is you need 1 extra write request for every document read. If you need to re-read your data a lot then maybe look into transactions, since write requests pricing is more expensive than read requests
Database stores some data about the user which almost never change. Well sometimes information might change if the user wants to edit his name for example.
Data information is about each user's name, username and his company data.
The first two are being shown to his navigation bar all the time using ejs, like User_1 is logged in, his company profile data when he needs to create an invoice.
My current way is to fetch user data through middleware using router.use so the extracted information is always available through all routes/views, for example:
router.use(function(req, res ,next) { // this block of code is called as middleware in every route
req.getConnection(function(err,conn){
uid = req.user.id;
if(err){
console.log(err);
return next("Mysql error, check your query");
}
var query = conn.query('SELECT * FROM user_profile WHERE uid = ? ', uid, function(err,rows){
if(err){
console.log(err);
return next(err, uid, "Mysql error, check your query");
}
var userData = rows;
return next();
});
});
})
.
I understand that this is not an optimal way of passing user profile data to every route/view since it makes new DB queries every time the user navigates through the application.
What would be a better way of having this data available without repeating the same query in each route yet having them re-fetched once the user changes a portion of this data, like his fullname ?
You've just stumbled into the world of "caching", welcome! Caching is a very popular choice for use cases like this, as well as many others. A cache is essentially somewhere to store data that you can get back much quicker than making a full DB query, or a file read, etc.
Before we go any further, it's worth considering your use case. If you're serving only a few users and have a low load on your service, caching might be over-engineering and in fact making a DB request might be the simplest idea. Adding caching can add a lot of complexity to your code as things move forward, not enough to scare you, but enough to cause hard to trace bugs. So consider for a moment your service load, if it's not very high (say an internal application for somewhere you work with only maybe a few requests every few minutes) then just reading from the DB is probably not going to slow down a request too much. In this case, reading from the DB is the simplest and probably best solution. However, if you're noticing that this DB request is slowing down your application for requests or making it harder to scale up, then caching might be for you.
A really popular approach for this would be to get something like "redis" which is a key-value database that holds everything in memory (RAM). Redis can sit as a service like MySQL and has a very basic query language. It is blindingly fast and can scale to enormous loads. If you're using Express, there are a number of NPM modules that help you access a redis instance. Simply push in your credentials and you can then make GET and SET requests (to get data or to set data).
In your example, you may wish to store a users profile in a JSON format against their user id or username in redis. Then, create a function called getUserProfile which takes in the ID or username. This can then look it up in redis, if it finds the record then it can return it to your main controller logic. If it does not, it can look it up in your MySQL database, save it in redis, and then return it to the controller logic (so it'll be able to get it from cache next time).
Your next problem is known for being a very pesky problem in computer science. It's "Cache Invalidation", in this case if your user profile updates you want to "invalidate" your cache. A way of doing this would be to update your cached version when the user updates their profile (or any other data saved). Alternatively, you could also just remove the cached version from redis and then next time it's requested from getUserProfile, it will be fetched from the DB fresh, and then put into redis for next time.
There are many other ways to approach this, but this will most likely solve your problem in the simplest way without too much overhead. It will also be easy to expand in the future!
Is there any way to drop a Mongo Database Collection from within the server side JavaScript code with Meteor? (really drop the whole thing, not just Meteor.Collection.remove({}); it's contents)
In addition, is there also a way to drop a Meteor.Collection from within the server side JavaScript code without dropping the corresponding database collection?
Why do that?
Searching in the subdocuments (subdocuments of the user-document, e.g. userdoc.mailbox[12345]) with underscore or similar turns out quiet slow (e.g. for large mailboxes).
On the other hand, putting all messages (in context of the mailbox-example) of all users in one big DB and then searching* all messages for one or more particular messages turns out to be very, very slow (for many users with large mailboxes), too.
There is also the size limit for Mongo documents, so if I store all messages of a user in his/her user-document, the mailbox's maximum size is < 16 MB together with all other user-data.
So I want to have a database for each of my user to use it as a mailbox, then the maximum size for one message is 16 MB (very acceptable) and I can search a mailbox using mongo queries.
Furthemore, since I'm using Meteor, it would be nice to then have this mongo db collection be loaded as Meteor.Collection whenever a user logs in. When a user deactivates his/her account, the db should of course be dropped, if the user just logs out, only the Meteor.Collection should be dropped (and restored when he/she logs in again).
To some extent, I got this working already, each user has a own db for the mailbox, but if anybody cancels his/her account, I have to delete this particular Mongo Collection manually. Also, I have do keep all mongo db collections alive as Meteor.Collections at all times because I cannot drop them.
This is a well working server-side code snippet for one-collection-per-user mailboxes:
var mailboxes = {};
Meteor.users.find({}, {fields: {_id: 1}}).forEach(function(user) {
mailboxes[user._id] = new Meteor.Collection("Mailbox_" + user._id);
});
Meteor.publish("myMailbox", function(_query,_options) {
if (this.userId) {
return mailboxes[this.userId].find(_query, _options);
};
});
while a client just subscribes with a certain query with this piece of client-code:
myMailbox = new Meteor.Collection("Mailbox_"+Meteor.userId());
Deps.autorun(function(){
var filter=Session.get("mailboxFilter");
if(_.isObject(filter) && filter.query && filter.options)
Meteor.subscribe("myMailbox",filter.query,filter.options);
});
So if a client manipulates the session variable "mailboxFilter", the subscription is updated and the user gets a new bunch of messages in the minimongo.
It works very nice, the only thing missing is db collection dropping.
Thanks for any hint already!
*I previeously wrote "dropping" here, which was a total mistake. I meant searching.
A solution that doesn't use a private method is:
myMailbox.rawCollection().drop();
This is better in my opinion because Meteor could randomly drop or rename the private method without any warning.
You can completely drop the collection myMailbox with myMailbox._dropCollection(), directly from meteor.
I know the question is old, but it was the first hit when I searched for how to do this
Searching in the subdocuments...
Why use subdocuments? A document per user I suppose?
each message must be it's own document
That's a better way, a collection of messages, each is id'ed to the user. That way, you can filter what a user sees when doing publish subscribe.
dropping all messages in one db turns out to be very slow for many users with large mailboxes
That's because most NoSQL DBs (if not all) are geared towards read-intensive operations and not much with write-intensive. So writing (updating, inserting, removing, wiping) will take more time.
Also, some online services (I think it was Twitter or Yahoo) will tell you when deactivating the account: "Your data will be deleted within the next N days." or something that resembles that. One reason is that your data takes time to delete.
The user is leaving anyway, so you can just tell the user that your account has been deactivated, and your data will be deleted from our databases in the following days. To add to that, so you can respond to the user immediately, do the remove operation asynchronously by sending it a blank callback.
How should I design an on-login middleware that checks if the recurring subscription has failed ? I know that Stripe fires events when things happen, and that the best practice is webhooks. The problem is, I can't use webhooks in the current implementation, so I have to check when the user logs in.
The Right Answer:
As you're already aware, webhooks.
I'm not sure what you're doing that webhooks aren't an option in the current implementation: they're just a POST to a publicly-available URL, the same as any end-user request. If you can implement anything else in Node, you can implement webhook support.
Implementing webhooks is not an all-or-nothing proposition; if you only want to track delinquent payments, you only have to implement processing for one webhook event.
The This Has To Work Right Now, Customer Experience Be Damned Answer:
A retrieved Stripe Customer object contains a delinquent field. This field will be set to true if the latest invoice charge has failed.
N.B. This call may take several seconds—sometimes into the double digits—to complete, during which time your site will appear to have ceased functioning to your users. If you have a large userbase or short login sessions, you may also exceed your Stripe API rate limit.
I actually wrote the Stripe support team an email complaining about this issue (the need to loop through every invoice or customer if you're trying to pull out delinquent entries) and it appears that you can actually do this without webhooks or wasteful loops... it's just that the filtering functionality is undocumented. The current documentation shows that you can only modify queries of customers or invoices by count, created (date), and offset... but if you pass in other parameters the Stripe API will actually try to understand the query, so the cURL request:
https://api.stripe.com/v1/invoices?closed=false&count=100&offset=0
will look for only open invoices.... you can also pass a delinquent=true parameter in when looking for delinquent customers. I've only tested this in PHP, so returning delinquent customers looks like this:
Stripe_Customer::all(array(
"delinquent" => true
));
But I believe this should work in Node.js:
stripe.customers.list(
{delinquent:true},
function(err, customers) {
// asynchronously called
});
The big caveat here is that because this filtering is undocumented it could be changed without notice... but given how obvious the approach is, I'd guess that it's pretty safe.