I am using Firebase for my web app and am fairly new to it.
While reading the documentation, I saw that it recommends flattening the data structure so that we don't download unnecessary data and keep downloads small (for cost-saving reasons). But throughout the docs, the examples always reference the whole database first:
dbRef = firebase.database().ref();
and only then narrow it down to the data they need with a key:
childRef = dbRef.child(child_key)
I am not sure I got this right, but this is how I understand it.
My question is: won't that dbRef have already downloaded the whole database? Or does the download only happen for childRef in the scenario above?
Any information will be helpful. While googling I found some nightmare cases with unbelievable bills caused by database access that was not handled correctly.
Are there other issues I need to worry about now, since I am at the beginning stage of development?
A Reference that you get from ref() and child() is just a pointer into a location in the database. It's exceptionally cheap, and creating one doesn't perform any data access.
If you want to actually fetch the data from a reference, you have to call on() or once() on it. Until then, all you have is a tiny object that contains a location. Same thing with Query objects. They don't perform any queries until you call one of those same methods.
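For example (the child path here is just an illustration):

var dbRef = firebase.database().ref();            // points at the root, downloads nothing
var userRef = dbRef.child('users/someUserId');    // points at one child path, still nothing

// Only this call actually downloads data, and only the data under userRef.
userRef.once('value').then(function (snapshot) {
  console.log(snapshot.val());
});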
There is a difference between a database ref and the actual database data request (on('value') and once('value')).
The database reference represents a particular location (a child/node/path) in your database.
The moment you call one of these methods on the Reference object (child() also returns a Reference object) you are actually fetching data, which is the expensive part.
Besides that, it's good practice to keep the reference in a single variable.
When I'm offline, if I add an object at a path where a Cloud Function is listening and then delete it while still offline, then when I go back online the Firebase servers receive the object creation and, right after it, its deletion.
The problem is that the creation triggers a Cloud Function. That function fetches some data from another path and adds it to the object that was just created. Because the object was deleted while offline, it should end up deleted, but the Cloud Function partially recreates it when it writes in the data it fetched elsewhere.
Because I don't want to have to track every single object I create and delete, I thought about checking whether the object still exists right before saving that data. The problem is that at the moment of the check the object still exists, but by the time I save the data into it, it doesn't exist anymore.
What are my options? I thought about adding a 0.5s sleep, but I don't think that's best practice.
First of all, there's not much you can do on the client app to help this situation. Everything you do to compensate for this will be in Cloud Functions.
Second of all, you have to assume that events could be delivered out of order. Deletes could be processed by Cloud Functions before creates. If your code does not handle this case, you can expect inconsistency.
You said "I don't want to have to track every single object I create/delete", but the fact of the matter is that this is the best option if you want consistent handling of events that could happen out of order. There is no easy way out of this situation if you're using Cloud Functions. On top of that, your functions should be idempotent, so they can handle events that could be delivered more than once.
One alternative is to avoid making changes to documents, and instead push "command" objects to Cloud Functions that tell it the things that should change. This might help slightly, but you should also assume that these commands could arrive out of order.
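On the client, that pattern just means pushing a description of the change instead of the change itself (the path and fields below are only illustrative):

// Instead of writing the object directly, push a command describing the change.
firebase.database().ref('/commands').push({
  type: 'createItem',
  itemId: someItemId,                               // hypothetical id of the object
  payload: { name: 'example' },
  issuedAt: firebase.database.ServerValue.TIMESTAMP
});
// A Cloud Function listening on /commands decides how to apply each command,
// still assuming commands can arrive out of order or more than once.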
This is all part of the downside of serverless backends. The upside is that you don't have to set up, manage, and deallocate server instances. But your code has to be resilient to these issues.
I'm creating an events app with React Native. I just wanted some advice on which would be the better, more performant, and more scalable way to structure my data model in Firestore.
I have 2 collections: Events and Users.
A user creates an event, which goes into the Events collection. In my app, users can then go to the main page and view a list of events from the events collection.
I also want to have a second page in the app, a "user profile" page, where users can view a list of their own events, and update and delete them.
My question is which would be better:
1. store the event's key in an array in users/user1
2. store what is basically a duplicate of the event in a subcollection called events in users/user1
I feel that option 1 might be better: just store a reference to the doc in an array, so I don't have duplicates of the event, and if the user has to update the event, only one write has to be made to the actual event in the events collection.
The event is likely to gain more fields in the future, like a comments field, so I feel that by going with option 1 I don't have to keep doing double work. The trade-off is that I might have to read twice: first users/user1 to get the array of event keys, then each key to get the actual event doc from the events collection.
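Concretely, option 1 in my head looks roughly like this (collection and field names are only how I've sketched it so far):

const db = firebase.firestore();

async function getEventsForUser(userId) {
  // First read: the user doc holds just an array of event document IDs.
  const userSnap = await db.collection('users').doc(userId).get();
  const eventIds = userSnap.get('events') || [];   // e.g. ['eventKey1', 'eventKey2']

  // Second round of reads: fetch each referenced event from the events collection.
  const eventSnaps = await Promise.all(
    eventIds.map(id => db.collection('events').doc(id).get())
  );
  return eventSnaps
    .filter(snap => snap.exists)
    .map(snap => Object.assign({ id: snap.id }, snap.data()));
}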
Thank you for any feedback and advice
There is no simple right or wrong answer when you need to choose between these two options. Data duplication is the key to faster reads, not just in the Firebase Realtime Database or Cloud Firestore, but in general. Any time you add the same data to a second location, you're duplicating it in favor of faster read performance. In return, you get more complex updates and higher storage/memory usage. Note also that extra calls are cheap in the Firebase Realtime Database, but not in Firestore, where each document read is billed. How much duplicated data versus how many extra database calls is optimal for you depends on your needs and on your willingness to let go of the "single point of definition" mindset, which is admittedly a subjective call.
After finishing a few Firebase projects, I find that my reading code gets drastically simpler if I duplicate data. But of course the writing code gets more complex at the same time. It's a trade-off between these two and your needs that determines the optimal solution for your app.
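To make that trade-off concrete: with duplication (option 2), every update has to touch both copies, for example with a batched write (collection names below are just an illustration):

const db = firebase.firestore();

function updateEventEverywhere(userId, eventId, changes) {
  const batch = db.batch();
  // The canonical event in the top-level collection...
  batch.update(db.collection('events').doc(eventId), changes);
  // ...and the duplicated copy under the user's subcollection.
  batch.update(
    db.collection('users').doc(userId).collection('events').doc(eventId),
    changes
  );
  return batch.commit();
}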
Please also take a look at my answer from this post where I have explained more about collections, maps and arrays in Firestore.
I'm using jasmine-node to test my API, and it has worked great for my GET routes. Now, however, I need to test some POSTs and I'm not sure how to go about this without changing my database.
One thought I had was to reset whatever value I change at the end of each spec.
Is this reasonable or is there a better way to go about testing POST requests to my API?
Wrap anything that modifies your database in a transaction. You can apply your database changes during the spec and then roll them back after each test.
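A rough sketch of that idea with Jasmine hooks (db and createItem here are placeholders for whatever your data layer and API code actually expose):

describe('POST /items', function () {
  beforeEach(function (done) {
    db.beginTransaction(done);     // open a transaction before each spec
  });

  afterEach(function (done) {
    db.rollback(done);             // undo everything the spec changed
  });

  it('creates an item', function (done) {
    createItem({ name: 'test' }, function (err, created) {
      expect(err).toBeNull();
      expect(created.id).toBeDefined();
      done();
    });
  });
});

Note that this only works when the specs and the code under test share the same database connection/transaction; if your tests hit a separately running server over HTTP, a dedicated test database (as the next answer suggests) is usually the simpler route.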
Usually you are supposed to have a test database, so modifying that one is not a big issue. Also, a general approach is not to rely on predefined values in the database (i.e., a GET that always requests the same object), but to try different objects each time; using predefined objects may hide problems that only appear when the data is slightly different.
In order to implement the second strategy, you can start with a POST containing pseudo-random data to create a new object, and use the returned ID to feed the following GET, UPDATE, and finally DELETE tests.
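Sketched out, that chain could look like this (post/get/put/del and the routes are placeholders for your own HTTP helpers and API):

var suffix = Math.floor(Math.random() * 1e9);     // pseudo-random data for this run
var item = { name: 'test-item-' + suffix };

it('creates, reads, updates and deletes a fresh object', function (done) {
  post('/items', item, function (created) {
    get('/items/' + created.id, function (fetched) {
      expect(fetched.name).toBe(item.name);
      put('/items/' + created.id, { name: item.name + '-updated' }, function () {
        del('/items/' + created.id, function (res) {
          expect(res.statusCode).toBe(204);
          done();
        });
      });
    });
  });
});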
Just make a duplicate processing page/function and send the data to that for debugging. Comment out anything that makes changes to the database.
Alternatively, pass a variable in your call such as "debug" and have an if/else section in your original function for debugging, ignoring the rest of the function.
Yet another alternative is to duplicate your database table and name it, say, debug_table. It will have the same structure as your original. Send the test data to it instead, and it won't change your original tables.
I'm pretty sure that you've come up with some solution for your problem already.
BUT, if you haven't, the Angular $httpBackend will solve your problem. It is a
Fake HTTP backend implementation suitable for unit testing applications that use the $http service.
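A minimal example of mocking a POST with it (the module, service, and URL are placeholders for your own):

describe('ItemService', function () {
  var $httpBackend, ItemService;

  beforeEach(module('myApp'));
  beforeEach(inject(function (_$httpBackend_, _ItemService_) {
    $httpBackend = _$httpBackend_;
    ItemService = _ItemService_;
  }));

  it('POSTs a new item without touching the real database', function () {
    $httpBackend.expectPOST('/api/items', { name: 'test' })
                .respond(201, { id: 42, name: 'test' });

    var result;
    ItemService.create({ name: 'test' }).then(function (res) { result = res.data; });

    $httpBackend.flush();                 // resolve the fake response
    expect(result.id).toBe(42);
  });

  afterEach(function () {
    $httpBackend.verifyNoOutstandingExpectation();
    $httpBackend.verifyNoOutstandingRequest();
  });
});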
I have a long-polling web service. The most recent response is cached, and the web service notifies subscribers whenever new data is available.
Is it a good practice to return a deep copy of a response or should the data be shared with all the subscribers? Or does it simply depend on use case?
At first glance, this looks like a case where you need to look more broadly at the usage.
What is this service providing? Is it read-only data, or is the service also responsible for managing modifications to the data? Once you answer that, it might be easier to see how you should handle it.
But the truth is, the service should own the data, and no subscriber should be able to tamper with the result. If subscribers should only read the data, that should be enforced; and even if they are meant to modify the data, doing so directly on the shared object is really bad practice, especially because there are many subscribers to the same data.
In the end, I believe you need to provide each subscriber with a deep copy of the data; if that is too resource-consuming, consider providing subscribers with a shallow copy and exposing an interface to query the nested data, which would also return copies.
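For instance, the notification step could hand out copies rather than the cached object itself (the names here are only illustrative):

function notifySubscribers(subscribers, cachedResponse) {
  subscribers.forEach(function (subscriber) {
    // Each subscriber gets its own deep copy, so none of them can tamper
    // with the shared cache (fine for plain JSON data such as an HTTP response).
    var copy = JSON.parse(JSON.stringify(cachedResponse));
    subscriber.onData(copy);
  });
}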
Any subscriber that needs to persist changes should pass the data to a method responsible for saving it, e.g. service.SaveSomeInformation(responseInformation).
TL;DR
Do not share the reference of the data with the subscribers; provide each subscriber with a copy of (a subset of) the data.
Good luck
Credible source
http://martinfowler.com/eaaCatalog/serviceLayer.html
Defines an application's boundary with a layer of services that establishes a set of available operations and coordinates the application's response in each operation.
and
... It encapsulates the application's business logic, controlling transactions and coordinating responses in the implementation of its operations.
Basically, never deep copy something until you really need to.
So start by sharing your response with everyone.
Then, if a subscriber needs to modify the data, for instance to complete it in a way that wouldn't break what other subscribers do but can only serve them better, it is more effective to keep sharing the reference, both in terms of memory and performance.
However, if a subscriber needs to do specific formatting/editing of the data privately, you should still think about whether you really need a deep copy, or whether you can simply copy the relevant part.
For instance, let's say you get:
scope.response = bigDataObject;
Somewhere, an observer needs to edit a specific property:
scope.data = angular.copy( bigDataObject.some.deeper.property );
scope.data.name = 'somethingElse';
This way you didn't share the private edit, but neither did you deep copy the whole data set. By dealing only with the smaller subset of data you actually need, you prevent side effects, save memory and performance, and keep the code understandable and maintainable.
In a Google spreadsheet using the Script Editor, I make function calls, but I am not quite sure whether the best way to store persistent data (data that I will continue to use) is global variables (objects, arrays, strings), or whether there is a better way to store data.
I don't want to use cells, which would be another way.
Another question: is it possible to create (pseudo) classes in this environment? What's the best way?
Both ScriptProperties and ScriptDB are deprecated.
Instead, you should be using the new class PropertiesService which is split into three sections of narrowing scope:
Document - Gets a property store that all users can access within the current document, if the script is published as an add-on.
Script - Gets a property store that all users can access, but only within this script.
User - Gets a property store that only the current user can access, and only within this script.
Here's an example persisting a value across calls using script properties:
var properties = PropertiesService.getScriptProperties();

function saveValue(lastDate) {
  properties.setProperty('lastCalled', lastDate);
}

function getValue() {
  return properties.getProperty('lastCalled');
}
The script execution environment is stateless, so you cannot access local variables from previous runs; however, the top-level line that stores getScriptProperties() in a variable is re-run on every round trip to the server, so the property store is available to either function.
If you need to store something on a more temporary basis, you can use the CacheService API.
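For example (the key name is made up):

function cacheValue(lastDate) {
  var cache = CacheService.getScriptCache();
  cache.put('lastCalled', lastDate, 600);   // kept for up to 600 seconds
}

function readCachedValue() {
  var cache = CacheService.getScriptCache();
  return cache.get('lastCalled');           // null once the entry has expired
}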
Persistent data can be stored using the Class ScriptProperties:
http://code.google.com/googleapps/appsscript/class_scriptproperties.html
All values are stored as strings and will have to be converted back with the likes of parseInt or parseFloat when they are retrieved.
JSON objects can also be stored in this manner.
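For example, an object can be round-tripped through JSON.stringify and JSON.parse (the key and fields are made up; the newer PropertiesService works the same way):

var settings = { threshold: 10, label: 'daily' };
ScriptProperties.setProperty('settings', JSON.stringify(settings));   // stored as a string

var restored = JSON.parse(ScriptProperties.getProperty('settings'));
Logger.log(restored.threshold);   // 10, a number again after parsing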
My experience has been that every query to retrieve or store values takes a long time. At the very least, I would cache the information in your JavaScript code as much as possible when it is safe to do so. My scripts always execute all at once, so I don't need to keep global variables; I simply pass the retrieved data arrays around, manipulate them, and finally store them back in one fell swoop. If I needed persistence across script invocations and didn't care about losing intermediate values when the page closes, I'd use globals. Clearly you have to think about what happens if your script is stopped in the middle and you haven't yet stored the values back to Google.