Does writing to the same document in a WriteBatch cause multiple writes in Firebase? - javascript

I want to know whether, in Firebase, performing multiple updates to the same document in a batch causes a single write or multiple writes.
For instance, in this example code (in JavaScript):
const doc = getDocumentReference();
batch.update(doc, { foo: "a" });
batch.update(doc, { bar: "b" });
batch.commit();
Will there be a single write to the document updating everything, or will there be two writes, one for each update?
Does the answer change if one of the two operations is a "set" instead of an "update"?
Thanks in advance

I haven't found any reference for how the writes are counted, but I found the question interesting and tried to figure it out myself. I prepared a test and looked at the Metrics Explorer on Google Cloud Platform (Firebase databases are visible there as well), which can show the writes_count.
I prepared the following snippets:
Adding the whole object at once:
let test = db.collection("test").doc("sdaasd3");
test.set({}); // clear the document first
let update = {};
for (let i = 0; i < 100; i++) {
  update[i.toString()] = i;
}
test.update(update);
Then updating 100 times:
let test = db.collection("test").doc("sdaasd2");
test.set({}); // clear the document first
for (let i = 0; i < 100; i++) {
  let update = {};
  update[i.toString()] = i;
  test.update(update);
}
And with a batch commit:
let test = db.collection("test").doc("sdaasd");
test.set({}); // clear the document first
const batch = db.batch();
for (let i = 0; i < 100; i++) {
  let update = {};
  update[i.toString()] = i;
  batch.update(test, update);
}
batch.commit();
I ran them, and it seems that a batch commit behaves exactly like updating the whole object at once: it took 2 writes (I think one of them is the set I used to clear the document in Firebase), while updating 100 times took 101 writes. (The snippets were executed in the order shown above, at 12:57, 01:01, and 01:07.)
I am not sure if this test is reliable enough for your needs, but you can use the Metrics Explorer on GCP to analyze it on your own.

Related

How to perform fast search on JSON file?

I have a JSON file that contains many objects and options.
Each is of this kind:
{"item": "name", "itemId": 78, "data": "Some data", ..., "option": number or string}
There are about 10,000 objects in the file.
When part of an item value is entered ("ame", "nam", "na", etc.), it should display all the objects and their options that match that part.
RegExp is the only thing that comes to mind, but with a 200 MB+ file the search takes a long time (2+ seconds).
Here's how I'm getting the objects right now:
let reg = new RegExp(enteredName, 'i'), // enteredName, e.g. "nam"; no 'g' flag, which would make test() stateful
    data = await fetch("myFile.json"),
    jsonData = await data.json();
let results = jsonData.filter(jsonObj => {
  let item = jsonObj.item,
      itemId = String(jsonObj.itemId);
  return reg.test(item) || reg.test(itemId);
});
But that option is too slow for me.
What would be a faster way to perform such a search in JS?
Looking up items by item number should be easy enough by creating a hash table, as others have already suggested. The big problem here is searching for items by name. You could burn a ton of RAM by creating a tree, but I'm going to go out on a limb and guess that you're not necessarily looking for raw lookup speed. Instead, I'm assuming you just want something that'll update a list on the fly as you type, without actually interrupting your typing. Is that correct?
To that end, what you need is a search function that won't lock up the main thread, allowing the DOM to be updated between returned results. Interval timers are one way to tackle this: they can be set up to iterate through large, time-consuming volumes of data while allowing other functions (such as DOM updates) to execute between iterations.
I've created a Fiddle that does just that:
// Create a big array containing items with names generated randomly for testing purposes
let jsonData = [];
for (let i = 0; i < 10000; i++) {
  jsonData.push({ item: Math.random().toString(36).substring(2, 15) + Math.random().toString(36).substring(2, 15) });
}
// Now on to the actual search part
let returnLimit = 1000; // Maximum number of results to return
let intervalItr = null; // A handle used for iterating through the array with an interval timer
function nameInput(e) {
  document.getElementById('output').innerHTML = '';
  if (intervalItr) clearInterval(intervalItr); // If we were iterating through a previous search, stop it.
  if (e.value.length > 0) search(e.value);
}
let reg, idx, found;
function search(enteredName) {
  reg = new RegExp(enteredName, 'i');
  idx = 0;
  found = 0;
  // Kick off the search by creating an interval that'll call searchNext() with a 0ms delay.
  // This will prevent the search function from locking the main thread while it's working,
  // allowing the DOM to be updated as you type
  intervalItr = setInterval(searchNext, 0);
}
function searchNext() {
  if (idx >= jsonData.length || found >= returnLimit) {
    clearInterval(intervalItr);
    return;
  }
  let item = jsonData[idx].item;
  if (reg.test(item)) {
    document.getElementById('output').innerHTML += '<br>' + item;
    found++; // count results so we stop at returnLimit matches
  }
  idx++;
}
https://jsfiddle.net/FlimFlamboyant/we4r36tp/26/
Note that this could also be handled with a WebWorker, but I'm not sure it's strictly necessary.
Additionally, this could be further optimized by utilizing a secondary array that is filled as the search takes place. When you enter an additional character and a new search starts, the new search can begin with this secondary array, switching back to the original if it runs out of data.
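Here's a minimal sketch of that refinement, shown synchronously for brevity (narrowedSearch, lastQuery, and lastMatches are illustrative names, not part of the Fiddle above, and it assumes plain substring queries rather than arbitrary regexes):
let lastQuery = '';
let lastMatches = [];
function narrowedSearch(query) {
  const reg = new RegExp(query, 'i');
  // If the new query merely extends the previous one, every new match
  // must already be among the previous matches, so scan that much
  // smaller array instead of the full jsonData.
  const base = lastQuery && query.startsWith(lastQuery) ? lastMatches : jsonData;
  const matches = base.filter(obj => reg.test(obj.item));
  lastQuery = query;
  lastMatches = matches;
  return matches;
}
The same idea plugs into the interval-based loop above: push each hit into the secondary array as it's found, and start the next keystroke's search from it.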

How to access the currently iterated array value in a loop?

Current attempt using an array of objects with properties:
The objective:
I want to automatically fill out emails on behalf of ~30 different people. The form fields are always consistent, but the values I'm filling in will change on an email-to-email basis. I'm using TagUI to do this.
My old code (last code box below) successfully filled out each form by assigning each line in the .csv to a separate array BUT failed to iterate through the values of a specific column within the .csv. Please see the text above the last code box below for further explanation.
Now I'm starting again, this time aiming to create an array of objects (representing each email being sent) with properties (representing each field to be filled within each email).
Here's what I've got so far:
// Using TagUI for browser automation
// https://github.com/kelaberetiv/TagUI
website-to-automate-URL-here.com
// Set up the array to be used later
emails = []
// Load in the 'db.csv' file
// Link to .csv: https://docs.google.com/spreadsheets/d/16iF7F-8eh2eE6kDiye0GVlmOCjADQjlVE9W1KH0Y8MM/edit?usp=sharing
csv_file = 'db.csv'
load '+csv_file+' to csv_lines
// Split the string variable "csv_lines" into an array of individual lines
lines = csv_lines.split('\n')
// Split the individual lines up into individual properties
for (i = 0; i < lines.length; i++)
{
  properties = lines[i].split(',') // split the line before reading its fields
  emails[i] = {}                   // initialize the object before assigning properties to it
  emails[i].name = properties[1].trim()
  emails[i].recipients = properties[2].trim()
}
EDIT: The below code has been put on the back burner as I attempt to solve this another way. Solutions are still welcome.
I'm having trouble triggering my for loop (the last one in the code below).
My goal for the for loop in question, in plain English, is as follows: Repeat the below code X times, where X is determined by the current iteration of the total_images array.
So if the total_images array looks like this:
[Total Images, 2, 3, 4, 5]
And the parent for loop is on its third iteration, then this for loop should dictate that the following code is executed 4 times.
I'm using TagUI (https://github.com/kelaberetiv/TagUI), so there may be some non-JavaScript code here.
https://www.website.com
wait 3s
// Setting up all the arrays that the .csv will load
array_campaign = []
array_subject = []
array_teaser = []
array_recipients = []
array_exclude = []
array_img1src = []
array_img1alt = []
array_img1url = []
array_img2src = []
array_img2alt = []
array_img2url = []
array_img3src = []
array_img3alt = []
array_img3url = []
array_img4src = []
array_img4alt = []
array_img4url = []
total_images = []
// Load in the 'db.csv' file
csv_file = 'db.csv'
load '+csv_file+' to lines
// Chop up the .csv data into individual pieces
// NOTE: Make sure the [#] corresponds to .csv column
// Reminder: Numbers start at 0
array_lines = lines.split('\n')
for (n = 0; n < array_lines.length; n++)
{
  items = array_lines[n].split(',')
  array_campaign[n] = items[1].trim()
  array_recipients[n] = items[2].trim()
  array_exclude[n] = items[3].trim()
  array_subject[n] = items[4].trim()
  array_teaser[n] = items[5].trim()
  array_img1src[n] = items[6].trim()
  array_img1alt[n] = items[7].trim()
  array_img1url[n] = items[8].trim()
  array_img2src[n] = items[9].trim()
  array_img2alt[n] = items[10].trim()
  array_img2url[n] = items[11].trim()
  array_img3src[n] = items[12].trim()
  array_img3alt[n] = items[13].trim()
  array_img3url[n] = items[14].trim()
  array_img4src[n] = items[15].trim()
  array_img4alt[n] = items[16].trim()
  array_img4url[n] = items[17].trim()
  total_images[n] = items[18].trim()
}
for (i = 1; i < array_campaign.length; i++)
{
  echo "This is a campaign entry."
  wait 2s
}
// This is the problem loop that's being skipped
blocks = total_images[i]
for (image_blocks = 0; image_blocks < blocks; image_blocks++)
{
  hover vis1_3.png
  click visClone.png
}
This is the most coding I've ever done, so if you could point me in the right direction and explain like I'm a beginner it would be much appreciated.
It looks like the only reason your last loop is being skipped is that total_images[i], which is used in the loop condition, is undefined. I believe that the value of i at that moment equals array_campaign.length from the previous loop, which is out of the array's range.
Here's some example code:
const arr = [0, 1, 2];
const length = arr.length; // the length is 3, but the last index of this array is 2 (counting from 0)
for (i = 0; i < length; i++) {
  console.log(i);
}
// output:
// 0
// 1
// 2
console.log(i);      // i at this moment is 3, which equals arr.length and is what made the loop above exit
console.log(arr[i]); // => undefined, because the last index of the array is 2; referencing a non-existent element returns undefined
"run the following code X times, where X is determined by the value of total_images[i]" - so, if I understand your question correctly, you can use nested loops to do this:
for (i = 1; i < array_campaign.length; i++)
{
  echo "This is a campaign entry."
  wait 2s
  // nested loop: the number of iterations is based on the value of i from the outer loop
  for (j = 0; j < total_images[i]; j++) {
    // do something here
  }
}
My old code should have worked. I opened the .csv file in Notepad and noticed there were SEVERAL extra commas interfering with the last column of data, throwing everything for a loop.
Did some searching and apparently this is a common thing. Beware!
I created TagUI, but I don't check Stack Overflow for user queries and issues. Try raising an issue directly on GitHub next time - https://github.com/kelaberetiv/TagUI/issues
Looks like you found the solution! Yes, if the CSV file contains an inconsistent number of columns (some rows having more columns than others), it will lead to errors when your automation script works on it. It looks like the extra commas created extra columns and broke your code.
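For anyone hitting the same issue, here's a hedged sketch of a pre-flight check in plain JavaScript (checkCsvColumns and the variable names are illustrative, not part of TagUI, and the naive split assumes no quoted commas in fields):
// Report every CSV row whose column count differs from the header row.
function checkCsvColumns(csvText) {
  const rows = csvText.split('\n').filter(line => line.trim().length > 0);
  const expectedColumns = rows[0].split(',').length; // naive split: no quoted commas
  const badRows = [];
  for (let n = 1; n < rows.length; n++) {
    const count = rows[n].split(',').length;
    if (count !== expectedColumns) {
      badRows.push({ line: n + 1, columns: count });
    }
  }
  return badRows; // an empty array means every row matches the header
}
// Usage, with the csv_lines string from the question:
// checkCsvColumns(csv_lines).forEach(r =>
//   console.log('Line ' + r.line + ' has ' + r.columns + ' columns'));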

Is it worth it to convert an array into a set to search in NodeJS?

I would like to know if it is worth converting an array into a set in order to search it in NodeJS.
My use case is that this search is done a lot of times, but not necessarily on big sets of data (it can go up to ~2000 items in the array from time to time).
I'm looking up a specific id in a list.
Which approach is better:
const isPresent = (myArray, id) => {
  return Boolean(myArray.some((arrayElement) => arrayElement.id === id));
}
or
const mySet = new Set(myArray.map((el) => el.id)); // a set of the ids, so has(id) works
const isPresent = (mySet, id) => {
  return mySet.has(id);
}
I know that theoretically the second approach is better, being O(1) versus O(n) for the first approach. But can the instantiation of the set offset the gain on small arrays?
@jonrsharpe - particularly for your case, I found that converting an array of 2k items to a Set itself takes ~1.15ms. No doubt searching a Set is faster than an array, but in your case this additional conversion can be a little costly.
You can run the code below in your browser console to check; new Set(arr) takes almost 1.2ms.
var arr = [], set = new Set(), n = 2000;
for (let i = 0; i < n; i++) {
  arr.push(i);
}
console.time('Set');
set = new Set(arr);
console.timeEnd('Set');
Adding elements to a Set one by one is always costly.
The code below shows the time required to insert items into an array vs. a set, and shows that array insertion is faster.
var arr = [], set = new Set(), n = 2000;
console.time('Array');
for (let i = 0; i < n; i++) {
  arr.push(i);
}
console.timeEnd('Array');
console.time('Set');
for (let i = 0; i < n; i++) {
  set.add(i);
}
console.timeEnd('Set');
I ran the following code to analyze the speed of locating an element in an array vs. a set, and found that the set is 8-10 times faster than the array.
You can copy-paste this code into your browser to analyze further:
var arr = [], set = new Set(), n = 100000;
for (let i = 0; i < n; i++) {
  arr.push(i);
  set.add(i);
}
var result;
console.time('Array');
result = arr.indexOf(12313) !== -1;
console.timeEnd('Array');
console.time('Set');
result = set.has(12313);
console.timeEnd('Set');
So for your case, array.some is better!
I will offer a different upside for using Set: your code becomes more semantic and easier to read.
Other than that, this post has a nice comparison - Javascript Set vs. Array performance - but make your own measurements if you really feel that this is your bottleneck. Don't optimise things that are not your bottleneck!
My own heuristic is an isPresent-like utility for nicer code, but if the check is done in a loop I always construct a Set beforehand, as sketched below.
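A minimal sketch of that heuristic (filterKnown and idSet are illustrative names, not from the question):
// One-off check: a linear scan is fine and needs no setup.
const isPresent = (items, id) => items.some((el) => el.id === id);

// Repeated checks: pay the O(n) Set construction once,
// then every lookup is O(1).
function filterKnown(items, idsToCheck) {
  const idSet = new Set(items.map((el) => el.id));
  return idsToCheck.filter((id) => idSet.has(id));
}

// Usage:
// const items = [{ id: 1 }, { id: 2 }];
// filterKnown(items, [2, 3]); // => [2]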

How can I store composed changes using Quill?

I started working with Quill, and I need to save the changes made by the user in the document, if possible composing them so I don't need to store them operation by operation.
To accomplish this, I am monitoring the 'text-change' event, and every operation is stored in my application's database. From time to time (every minute), I compose the changes made in the document with a previous document state and execute a diff between the result of this composition and the previous document state, storing the result of the diff and deleting the previous operations, since they are contained in the diff result.
To get the previous document state, I initially use the original document delta. Then, when a diff is stored, I just compose the original document delta with the diffs that exist in the database. For example:
Original document delta: {"ops":[{"insert":"Evaluation Only. Created with Aspose.Words. Copyright 2003-2018 Aspose Pty Ltd.","attributes":{"size":"16px","font":"Calibri","bold":true,"color":"#FF0000"}},{"insert":"\n","attributes":{"paragraph":true,"spacing_before":"0px","spacing_after":"10.67px","indent":0,"text_indent":"0px","line_spacing":"17.27px"}},{"insert":"Test","attributes":{"size":"14.67px","font":"Calibri","color":"#000000"}},{"insert":"s","attributes":{"size":"14.67px","font":"Calibri","color":"#000000"}},{"insert":"\n","attributes":{"paragraph":true,"spacing_before":"0px","spacing_after":"10.67px","indent":0,"text_indent":"0px","line_spacing":"17.27px"}}],"page_setup":{"left_margin":"113.4px","top_margin":"94.47px","right_margin":"113.4px","bottom_margin":"94.47px"}}
First change: {"ops":[{"delete":80}]}
Second change: {"ops":[{"retain":5},{"insert":"\n","attributes":{"spacing_before":"0px","spacing_after":"10.67px","text_indent":"0px","line_spacing":"17.27px"}}]}
Third change: {"ops":[{"retain":6},{"insert":"A","attributes":{"color":"#000000"}}]}
The code I am using is shown below:
var diffs = result.diffs;
var deltas = result.deltas;
var lastComposedDelta = null;
for (var i = 0; i < diffs.length; i++) {
  var currentDelta = newDelta(diffs[i].Value);
  if (lastComposedDelta == null) {
    lastComposedDelta = currentDelta;
  } else {
    lastComposedDelta = lastComposedDelta.compose(currentDelta);
  }
}
var composedDeltas = lastComposedDelta;
for (var i = 0; i < deltas.length; i++) {
  var currentDelta = newDelta(deltas[i].Value);
  if (composedDeltas == null) {
    composedDeltas = currentDelta;
  } else {
    composedDeltas = composedDeltas.compose(currentDelta);
  }
}
var diffDelta = composedDeltas;
if (lastComposedDelta != null) {
  diffDelta = lastComposedDelta.diff(composedDeltas);
}
The result of this diff is: {"ops":[{"delete":80},{"retain":5},{"retain":1,"attributes":{"paragraph":null,"indent":null}},{"attributes":{"color":"#000000"},"insert":"A"},{"attributes":{"paragraph":true,"spacing_before":"0px","spacing_after":"10.67px","indent":0,"text_indent":"0px","line_spacing":"17.27px"},"insert":"\n"}]}
The problem I encountered occurs when the user inserts a new line and indents it, for example. The deltas of those operations are:
New line: {"ops":[{"retain":8},{"insert":"\n"}]}
Indent: {"ops":[{"retain":9},{"retain":1,"attributes":{"indent":1}}]}
Then, when I try to diff the document with the code above, it gives me the error:
Uncaught Error: diff() called with non-document
Value of "lastComposedDelta": {"ops":[{"insert":"Tests","attributes":{"size":"14.67px","font":"Calibri","color":"#000000"}},{"insert":"\n","attributes":{"spacing_before":"0px","spacing_after":"10.67px","text_indent":"0px","line_spacing":"17.27px"}},{"attributes":{"color":"#000000"},"insert":"A"},{"attributes":{"paragraph":true,"spacing_before":"0px","spacing_after":"10.67px","indent":0,"text_indent":"0px","line_spacing":"17.27px"},"insert":"\n"},{"delete":80},{"retain":5},{"retain":1,"attributes":{"paragraph":null,"indent":null}},{"insert":"A","attributes":{"color":"#000000"}},{"insert":"\n","attributes":{"paragraph":true,"spacing_before":"0px","spacing_after":"10.67px","indent":0,"text_indent":"0px","line_spacing":"17.27px"}}]}
Value of "composedDeltas":
{"ops":[{"insert":"Tests","attributes":{"size":"14.67px","font":"Calibri","color":"#000000"}},{"insert":"\n","attributes":{"spacing_before":"0px","spacing_after":"10.67px","text_indent":"0px","line_spacing":"17.27px"}},{"insert":"A","attributes":{"color":"#000000"}},{"insert":"\n","attributes":{"paragraph":true,"spacing_before":"0px","spacing_after":"10.67px","indent":0,"text_indent":"0px","line_spacing":"17.27px"}},{"insert":"\n"},{"delete":80},{"retain":1,"attributes":{"indent":1}},{"retain":4},{"retain":1,"attributes":{"paragraph":null,"indent":null}},{"insert":"A","attributes":{"color":"#000000"}},{"insert":"\n","attributes":{"paragraph":true,"spacing_before":"0px","spacing_after":"10.67px","indent":0,"text_indent":"0px","line_spacing":"17.27px"}}]}
I dug in a little and found out that the error is caused by a "retain" operation in the deltas used for the diff, which cannot be processed (diff expects full documents, i.e. insert-only deltas). So, I want to know if there is a solution for this, because I am unsure whether the code I wrote is the right way to do this (storing diffs of a document).
If you don't need each individual operation, you can just update the document on the text-change event like so:
quill.on('text-change', () => {
  // By the time we hit the 'text-change' event,
  // quill.getContents() will return the updated
  // content of the document
  const currentOps = quill.getContents();
  updateDatabase(currentOps);
});

function updateDatabase(currentOps) {
  // Do whatever you need to do with the current ops
  // to store them. No need at all to store the diffs.
}
So, I discovered the problem with the diff function. When I initialized the editor, I was using updateContents to apply the delta I had in the database. Quill always initializes the editor with a blank line, so calling updateContents composed that blank line with the text coming from my database. Then, when the user changed the text, the delta from the editor no longer matched the delta in the database.
To fix this, I changed the function that loads the content from the database to use setContents instead. This way, the deltas from the editor and the database matched.
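A minimal sketch of that fix, assuming savedDelta is the delta loaded from the database:
// Wrong: updateContents composes savedDelta on top of the blank
// line ("\n") Quill creates on initialization, so the editor's
// document drifts away from what the database holds.
// quill.updateContents(savedDelta);

// Right: setContents replaces the editor's entire document with
// the saved delta, keeping editor and database state in sync.
quill.setContents(savedDelta);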

How to work with a large array in JavaScript [duplicate]

This question already has answers here:
Best way to iterate over an array without blocking the UI
(4 answers)
Closed 6 years ago.
In my application I have a very big array (around 60k records). Using a for loop, I am doing some operations on it as shown below.
var allPoints = [];
for (var i = 0, cLength = this._clusterData.length; i < cLength; i += 1) {
  if (allPoints.indexOf(this._clusterData[i].attributes.PropertyAddress) == -1) {
    allPoints.push(this._clusterData[i].attributes.PropertyAddress);
    this._DistClusterData.push(this._clusterData[i]);
  }
}
When I run this loop the browser hangs because the array is so big, and Firefox shows a popup saying "A script on this page may be busy, or it may have stopped responding. You can stop the script now, or you can continue to see if the script will complete". What can I do so the browser does not hang?
You need to return control back to the browser in order to keep it responsive. That means you need to use setTimeout to end your current processing and schedule it for resumption sometime later. E.g.:
function processData(i) {
  var data = clusterData[i];
  // ... do the work for this element here ...
  if (i + 1 < clusterData.length) {
    // Schedule the next element with a 0ms delay so control
    // returns to the browser between elements.
    setTimeout(processData, 0, i + 1);
  }
}
processData(0);
This would be the simplest thing to do from where you currently are.
Alternatively, if it fits what you want to do, Web Workers would be a great solution, since they actually shunt the work into a separate thread.
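For reference, a minimal sketch of the Worker approach (the inline Blob just avoids a separate worker file; the names are illustrative, and it assumes the same clusterData as above):
// Build the worker from an inline script so no separate file is needed.
const workerSrc = `
  onmessage = (e) => {
    const seen = new Set();
    const result = [];
    for (const rec of e.data) {
      const addr = rec.attributes.PropertyAddress;
      if (!seen.has(addr)) {
        seen.add(addr);
        result.push(rec);
      }
    }
    postMessage(result);
  };
`;
const blobUrl = URL.createObjectURL(new Blob([workerSrc], { type: 'text/javascript' }));
const worker = new Worker(blobUrl);
worker.onmessage = (e) => {
  // Back on the main thread once the worker finishes.
  console.log('deduplicated records:', e.data.length);
};
worker.postMessage(clusterData); // the data is structured-cloned into the worker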
Having said this, what you're currently doing is extremely inefficient. You push values into an array, and consequently keep checking the ever-longer array over and over for the values it contains. You should be using object keys for de-duplication instead:
var allPoints = {};
// for (...) ...
if (!allPoints[address]) { // you can even omit this check entirely
  allPoints[address] = true;
}
// later:
allPoints = Object.keys(allPoints);
First of all, avoid the repeated this._clusterData[i] lookups. Extract the element to a variable like so:
var allPoints = [];
var current;
for (var i = 0, cLength = this._clusterData.length; i < cLength; i += 1) {
  current = this._clusterData[i];
  if (allPoints.indexOf(current.attributes.PropertyAddress) == -1) {
    allPoints.push(current.attributes.PropertyAddress);
    this._DistClusterData.push(current);
  }
}
This should boost your performance quite a bit :-)
As others already pointed out, you can do this asynchronously, so the browser remains responsive.
It should be noted, however, that the indexOf operation you do can become very costly. It would be better to create a Map keyed by the PropertyAddress value; that will take care of the duplicates.
(function (clusterData, batchSize, done) {
  var myMap = new Map();
  var i = 0;
  (function nextBatch() {
    for (const data of clusterData.slice(i, i + batchSize)) {
      myMap.set(data.attributes.PropertyAddress, data);
    }
    i += batchSize;
    if (i < clusterData.length) {
      setTimeout(nextBatch, 0);
    } else {
      done(myMap);
    }
  })();
})(this._clusterData, 1000, function (result) {
  // All done
  this._DistClusterData = result;
  // continue here with other stuff you want to do with it.
}.bind(this));
Consider adding to the array asynchronously in batches, say 1000 records at a time, or whatever size gives the best performance. This frees up your application while each batch of items is processed; see the sketch below.
Here is some additional information: async and await while adding elements to List<T>
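A minimal sketch of that batching idea in JavaScript (dedupeInBatches and sleep are illustrative names; the Set-based check also replaces the costly indexOf):
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function dedupeInBatches(clusterData, batchSize = 1000) {
  const seen = new Set();
  const result = [];
  for (let i = 0; i < clusterData.length; i++) {
    const record = clusterData[i];
    const addr = record.attributes.PropertyAddress;
    if (!seen.has(addr)) {
      seen.add(addr);
      result.push(record);
    }
    // Yield to the browser between batches so the UI stays responsive.
    if ((i + 1) % batchSize === 0) await sleep(0);
  }
  return result;
}

// Usage:
// dedupeInBatches(this._clusterData).then((deduped) => {
//   this._DistClusterData = deduped;
// });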
