I'm using jsforce to access Salesforce via the Bulk API. It has two ways of updating and deleting records. One is the normal bulk API, which means creating a job and batches:
var job = conn.bulk.createJob("Account", "delete");
var batch = job.createBatch();
var accounts = getAccountsByDate(jsforce.Date.TODAY);
batch.execute(accounts);
batch.on('response', function(rets) {
// do things
});
The other way is to use the "query" interface like this:
conn.sobject('Account')
.find({ CreatedDate: jsforce.Date.TODAY })
.destroy(function(err, rets) {
// do things
});
The second way certainly seems easier, but I can't get it to update or delete more than 10,000 records at a time, which appears to be a Salesforce API limit on batch size. Note that using the maxFetch property from jsforce appears to have no effect in this case.
So is it safe to assume that the query style interface only creates a single batch? The jsforce documentation is not clear on this point.
Currently the bulk.load() method in the JSforce bulk API generates a job with one batch, so the limit of 10,000 records per batch applies. The same is true for the find-and-destroy interface, which uses bulk.load() internally.
To avoid this limit you can create a job with bulk.createJob() and create several batches with job.createBatch(), then dispatch the records to delete across these batches so that each batch does not exceed the limit.
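For illustration, a minimal sketch of that multi-batch approach might look like the following (assuming conn is an authenticated jsforce connection and accounts holds all the records to delete; the 10,000 figure is just the per-batch limit discussed above):
var BATCH_SIZE = 10000; // per-batch record limit
var job = conn.bulk.createJob("Account", "delete");
for (var i = 0; i < accounts.length; i += BATCH_SIZE) {
  var chunk = accounts.slice(i, i + BATCH_SIZE);
  var batch = job.createBatch();
  batch.execute(chunk);
  batch.on('response', function(rets) {
    // handle the results of this batch
  });
  batch.on('error', function(err) {
    // handle batch-level errors
  });
}
Once every batch has responded you can close the job with job.close().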
I have an Angular app where I am querying my Firebase database as below:
constructor() {
this.getData();
}
getData() {
this.projectSubscription$ = this.dataService.getAllProjects()
.pipe(
map((projects: any) =>
projects.map(sc=> ({ key: sc.key, ...sc.payload.val() }))
),
switchMap(appUsers => this.dataService.getAllAppUsers()
.pipe(
map((admins: any) =>
appUsers.map(proj =>{
const match: any = admins.find(admin => admin.key === proj.admin);
return {...proj, imgArr: this.mapObjectToArray(proj.images), adminUser: match.payload.val()}
})
)
)
)
).subscribe(res => {
this.loadingState = false;
this.projects = res.reverse();
});
}
mapObjectToArray = (obj: any) => {
const mappedDatas = [];
for (const key in obj) {
if (Object.prototype.hasOwnProperty.call(obj, key)) {
mappedDatas.push({ ...obj[key], id: key });
}
}
return mappedDatas;
};
And here is what I am querying inside dataService:
getAllProjects() {
return this.afDatabase.list('/projects/', ref=>ref.orderByChild('createdAt')).snapshotChanges();
}
getAllAppUsers() {
return this.afDatabase.list('/appUsers/', ref=>ref.orderByChild('name')).snapshotChanges();
}
The problem I am facing is that I have 400 rows of data which I am trying to load, and it is taking around 30 seconds, which is insanely high. Any idea how I can query this faster?
We have no way to know whether the 30s is reasonable, as that depends on the amount of data loaded, the connection latency and bandwidth of the client, and more factors we can't know/control.
But one thing to keep in mind is that you're performing 400 queries to get the users of each individual app, which is likely not great for performance.
Things you could consider:
Pre-load all the users once, and then use that list for each project.
Duplicate the name of each user into each project, so that you don't need to join any data at all.
If you come from a background in relational databases the latter may be counterintuitive, but it is actually very common in NoSQL data modeling and is one of the reasons NoSQL databases scale so well.
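For the first suggestion, a rough sketch of what that could look like with the dataService methods shown above, loading both lists once and joining them in memory (an illustration only, not a drop-in fix):
// at the top of the component file:
import { combineLatest } from 'rxjs';
import { map } from 'rxjs/operators';

// inside the component:
getData() {
  this.projectSubscription$ = combineLatest([
    this.dataService.getAllProjects(),
    this.dataService.getAllAppUsers()
  ]).pipe(
    map(([projects, admins]: [any[], any[]]) =>
      projects
        .map(sc => ({ key: sc.key, ...sc.payload.val() }))
        .map(proj => {
          // look up the admin in the already-loaded list instead of querying again
          const match: any = admins.find(admin => admin.key === proj.admin);
          return {
            ...proj,
            imgArr: this.mapObjectToArray(proj.images),
            adminUser: match ? match.payload.val() : null
          };
        })
    )
  ).subscribe(res => {
    this.loadingState = false;
    this.projects = res.reverse();
  });
}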
I propose 3 solutions.
1. Pagination
Instead of returning all those documents on app load, limit them to just 10 and keep a record of the last one. Then display those 10 (or any arbitrary base number).
Then make the UI in such a way that the user has to click next, or, when the user scrolls, fetch the next set based on the previous last document's field info.
I'm supposing you need to display all the fetched data in some table or list so having the UI paginate the data should make sense.
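A hedged sketch of what that could look like against the Realtime Database used above (the method name, the page size, and keying pages off createdAt are assumptions for illustration):
// in the data service, alongside getAllProjects():
getProjectsPage(lastCreatedAt?: number) {
  return this.afDatabase.list('/projects/', ref => {
    const base = ref.orderByChild('createdAt');
    // for pages after the first, start at the previous page's last createdAt;
    // the first item returned duplicates that record, so drop it on the client
    return lastCreatedAt != null
      ? base.startAt(lastCreatedAt).limitToFirst(11)
      : base.limitToFirst(10);
  }).snapshotChanges();
}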
2. Loader
Show some loader UI on website load. Then, when all the documents have been fetched, hide the loader and show the data as you want. You can use something custom for the loader, choose from any of the abundant libraries out there, or use mat-progress-spinner from Angular Material.
3. onCall Cloud Function
What if you try getting them through an onCall Cloud Function? It might be faster because it's just one request that the app makes, and Firebase's Cloud Functions are very fast within Google's data centers.
The user's network might be slow iterating the documents, but the Cloud Function will return them all at once, which might give you what you want.
I guess you could go for this option only if you really really need to display all that data at once on website load.
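If you went that route, a rough sketch of such a callable function could look like this (the function name and the /projects path are assumptions; it requires the firebase-functions and firebase-admin packages):
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.getAllProjects = functions.https.onCall(async (data, context) => {
  // read everything server-side in one go and return it to the client
  const snapshot = await admin.database()
    .ref('/projects')
    .orderByChild('createdAt')
    .once('value');
  return snapshot.val();
});
The Angular client would then call it once through the callable SDK (for example via AngularFireFunctions' httpsCallable) instead of subscribing to the list directly.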
A note on cost
Fetching 400 or more documents every time a given website loads might be expensive. It'll be expensive if the website is visited very frequently by very many users. Firebase cost will increase as you are charged per document read too.
Check to see if you could optimise the data structure to avoid fetching this much.
This doesn't apply to you if this is some admin dashboard, or if fetching all the data like this happens rarely, in which case the cost won't be high.
My application tracks the movements of data throughout the system. When a movement is recorded, it is placed in a separate collection that determines whether the document is enroute, available, or out of service. I used $addToSet to place the _id, and $pullAll to make sure that when a doc is moved from enroute to available it is not duplicated. But when the _id is moved to a new location entirely, I need to remove the old data from the old location and insert it into the new location. The insertion works, but I cannot figure out how to properly remove the data from the old location. This is all done within Meteor calls and MongoDB.
if last.status is "Enroute"
  LastLocation.update locationId: last.locationId, partId: last.partId,
    $addToSet:
      enroutePurchaseIds: lastPurchaseId
    $pullAll:
      availiblePurchaseIds: lastPurchaseId
      outOfServicePurchaseIds: lastPurchaseId
Update
You can use the $merge stage from the upcoming 4.4 version, which allows updating the same collection the aggregation is running on. Pass the old location and new location as the array.
db.collection.aggregate([
{"$match":{"location":{"$in":[oldLocation,newLocation]}}},
{"$addFields":{"sortOrder":{"$indexOfArray":[[oldLocation,newLocation],"$location"]}}},
{"$sort":{"sortOrder":1}},
{"$group":{
"_id":null,
"oldLocationDoc":{"$first":"$$ROOT"},
"newLocationDoc":{"$last":"$$ROOT"}
}},
{"$addFields":{
"oldLocationDoc.old":{
"$filter":{
"input":"$oldLocationDoc.old",
"cond":{"$ne":["$$this",oldLocation]}
}
},
"newLocationDoc.new":{"$concatArrays":["$newLocationDoc.new",[newLocation]]}
}},
{"$project":{"locations":["$oldLocationDoc","$newLocationDoc"]}},
{"$unwind":"$locations"},
{"$replaceRoot":{"newRoot":"$locations"}},
{"$merge":{
"into":{"db":"db","coll":"collection"},
"on":"_id",
"whenMatched":"merge",
"whenNotMatched":"failed"
}}
])
Original
It is not possible to move an array/field value from one document to another in a single update operation.
You would want to use transactions to perform multi-document updates in an atomic way. This requires a replica set.
var session = db.getMongo().startSession();
var collection = session.getDatabase('test').getCollection('collection');
session.startTransaction({readConcern: {level:'snapshot'},writeConcern: {w:'majority'}});
collection.update({location:oldLocation},{$pull:{availiblePurchaseIds:lastPurchaseId}});
collection.update({location:newLocation},{$push:{enroutePurchaseIds:lastPurchaseId}});
session.commitTransaction()
session.endSession()
Another option would be to perform bulk updates, in the case of a standalone mongod instance.
var bulk = db.getCollection('collection').initializeUnorderedBulkOp();
bulk.find({location:oldLocation}).updateOne({$pull:{availiblePurchaseIds:lastPurchaseId}});
bulk.find({location:newLocation}).updateOne({$push:{enroutePurchaseIds:lastPurchaseId}});
bulk.execute();
Are you moving the entire document from one collection to another, or just moving the document's id? I can't help much with CoffeeScript, but if you're looking to move entire documents you might find the following thread helpful:
mongodb move documents from one collection to another collection
I am new to Node.js and am reading the code of an app. The code below is used to initialize the db, loading some questions into the survey system. I can't understand what remove() and save() mean here, because I can't find any explanation of these two methods. It seems mongoose isn't used after being connected. Could anyone explain the usage of these methods?
Well, this is my understanding of this code, but I'm not sure it's correct. My TA tells me it should be run before server.js.
/**
* This is a utility script for dropping the questions table, and then
* re-populating it with new questions.
*/
// connect to the database
var mongoose = require('mongoose');
var configDB = require('./config/database.js');
mongoose.connect(configDB.url);
// load the schema for entries in the 'questions' table
var Question = require('./app/models/questions');
// here are the questions we'll load into the database. Field names don't
// quite match with the schema, but we'll be OK.
var questionlist = [
/*some question*/
];
// drop all data, and if the drop succeeds, run the insertion callback
Question.remove({}, function(err) {
// count successful inserts, and exit the process once the last insertion
// finishes (or any insertion fails)
var received = 0;
for (var i = 0; i < questionlist.length; ++i) {
var q = new Question();
/*some detail about defining q neglected*/
q.save(function(err, q) {
if (err) {
console.error(err);
process.exit();
}
received++;
if (received == questionlist.length)
process.exit();
});
}
});
To add some additional detail, mongoose is all based on using schemas and working with those to manipulate your data. In a mongodb database you have collections, and each collection holds different kinds of data. When you're using mongoose, what's happening behind the scenes is that every different Schema you work with maps to a mongodb collection. So when you're working with the Question Schema in mongoose land, there's really some Question collection behind the scenes in the actual db that you're working with. You might also have a Users Schema, which would act as an abstraction for some Users collection in the db, or maybe you could have a Products Schema, which again would map to some collection of products behind the scenes in the actual db.
As mentioned previously, when calling remove({}, callback) on the Question Schema, you're telling mongoose to go find the Questions collection in the db and remove all entries, or documents as they're called in mongodb, that match a certain criteria. You specify that criteria in the object literal that is passed in as the first argument. So if the Question Schema has some boolean field called correct and you wanted to delete all of the incorrect questions, you could say Question.remove({ correct: false }, callback). Also as mentioned previously, when passing an empty object to remove, you're telling mongoose to remove ALL documents in the Schema, or collection rather. If you're not familiar with callback functions, pretty much the callback function says, "hey, after you finish this async operation, go ahead and do this."
The save() function that is used here is a little different than how save() is used in the official mongodb driver, which is one reason why I don't prefer mongoose. But to explain, pretty much all save is doing here is you're creating this new question, referred to by the q variable, and when you call save() on that question object, you're telling mongoose to take that object and insert it as a new document into your Questions collection behind the scenes. So save here just means insert into the db. If you were using the official mongo driver, it would be db.getCollection('collectionName').insert({/* Object representing new document to insert */}).
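For comparison, here is a rough sketch of the same reset script written against current Mongoose APIs, using deleteMany()/insertMany() and async/await (questionlist is the same array as above; this is just an illustration, not the original author's code):
const mongoose = require('mongoose');
const configDB = require('./config/database.js');
const Question = require('./app/models/questions');

const questionlist = [
  /* some questions */
];

async function reseed() {
  await mongoose.connect(configDB.url);
  await Question.deleteMany({});           // drop every question document
  await Question.insertMany(questionlist); // insert the new ones in one call
  await mongoose.disconnect();
}

reseed().catch(err => {
  console.error(err);
  process.exit(1);
});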
And yes your TA is correct. This code will need to run before your server.js file. Whatever your server code does, I assume it's going to connect to your database.
I would encourage you to look at the mongoose API documentation. Long term though, the official mongodb driver might be your best bet.
Mongoose basically maps your MongoDB queries to JavaScript objects using schema.
remove() receives a selector, and callback function. Empty selector means, that all Questions will be affected.
After that a new Question object is created. I guess that you omitted some data being set on it. After that it's being saved back into MongoDB.
You can read more about that in the official documentation:
http://mongoosejs.com/docs/api.html#types-subdocument-js
The remove query is used for removing all documents from the collection, and save is used for creating a new document.
As per your code, it seems like every time the script runs it removes all the records from the Question collection and then saves new question records from the question list.
I'm using nedb and I'm trying to update an existing record by matching its ID and changing a title property.
What happens is that a new record gets created, and the old one is still there.
I've tried several combinations, and tried googling for it, but the search results are scarce.
var Datastore = require('nedb');
var db = {
files: new Datastore({ filename: './db/files.db', autoload: true })
};
db.files.update(
{_id: id},
{$set: {title: title}},
{},
callback
);
What's even crazier is that when performing a delete, a new record gets added again, but this time the record has a weird property:
{"$$deleted":true,"_id":"WFZaMYRx51UzxBs7"}
This is the code that I'm using:
db.files.remove({_id: id}, callback);
The NeDB docs say the following:
localStorage has size constraints, so it's probably a good idea to set
recurring compaction every 2-5 minutes to save on space if your client
app needs a lot of updates and deletes. See database compaction for
more details on the append-only format used by NeDB.
Compacting the database
Under the hood, NeDB's persistence uses an append-only format, meaning
that all updates and deletes actually result in lines added at the end
of the datafile. The reason for this is that disk space is very cheap
and appends are much faster than rewrites since they don't do a seek.
The database is automatically compacted (i.e. put back in the
one-line-per-document format) everytime your application restarts.
You can manually call the compaction function with
yourDatabase.persistence.compactDatafile which takes no argument. It
queues a compaction of the datafile in the executor, to be executed
sequentially after all pending operations.
You can also set automatic compaction at regular intervals with
yourDatabase.persistence.setAutocompactionInterval(interval), interval
in milliseconds (a minimum of 5s is enforced), and stop automatic
compaction with yourDatabase.persistence.stopAutocompaction().
Keep in mind that compaction takes a bit of time (not too much: 130ms
for 50k records on my slow machine) and no other operation can happen
when it does, so most projects actually don't need to use it.
I haven't used this myself, but it seems it uses localStorage and an append-only format for the update and delete methods.
Looking into the source code, in the persistence tests they check for the $$deleted key; they also mention "If a doc contains $$deleted: true, that means we need to remove it from the data".
So, in my opinion, you can try compacting the db manually, or the second way mentioned above (automatic compaction at regular intervals) can be useful.
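Applied to the db.files datastore from the question, that could look something like this (the 5-minute interval is just an example value):
// compact automatically every 5 minutes
db.files.persistence.setAutocompactionInterval(5 * 60 * 1000);

// ...or queue a one-off compaction after your updates/removes
db.files.persistence.compactDatafile();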
I am using Firebase for data storage. The data structure is like this:
products: {
  product1: {
    name: "chocolate"
  },
  product2: {
    name: "chochocho"
  }
}
I want to perform an autocomplete operation on this data, and normally I would write the query like this:
"select name from PRODUCTS where productname LIKE '%" + keyword + "%'";
So, for my situation, for example, if the user types "cho", I need to get both "chocolate" and "chochocho" as results. I thought about bringing all the data under the "products" block and then doing the query at the client, but this may need a lot of memory for a big database. So, how can I perform an SQL LIKE operation?
Thanks
Update: With the release of Cloud Functions for Firebase, there's another elegant way to do this as well by linking Firebase to Algolia via Functions. The tradeoff here is that the Functions/Algolia is pretty much zero maintenance, but probably at increased cost over roll-your-own in Node.
There are no content searches in Firebase at present. Many of the more common search scenarios, such as searching by attribute will be baked into Firebase as the API continues to expand.
In the meantime, it's certainly possible to grow your own. However, searching is a vast topic (think creating a real-time data store vast), greatly underestimated, and a critical feature of your application--not one you want to ad hoc or even depend on someone like Firebase to provide on your behalf. So it's typically simpler to employ a scalable third party tool to handle indexing, searching, tag/pattern matching, fuzzy logic, weighted rankings, et al.
The Firebase blog features a blog post on indexing with ElasticSearch which outlines a straightforward approach to integrating a quick, but extremely powerful, search engine into your Firebase backend.
Essentially, it's done in two steps. Monitor the data and index it:
var Firebase = require('firebase');
var ElasticClient = require('elasticsearchclient')
// initialize our ElasticSearch API
var client = new ElasticClient({ host: 'localhost', port: 9200 });
// listen for changes to Firebase data
var fb = new Firebase('<INSTANCE>.firebaseio.com/widgets');
fb.on('child_added', createOrUpdateIndex);
fb.on('child_changed', createOrUpdateIndex);
fb.on('child_removed', removeIndex);
function createOrUpdateIndex(snap) {
client.index(this.index, this.type, snap.val(), snap.name())
.on('data', function(data) { console.log('indexed ', snap.name()); })
.on('error', function(err) { /* handle errors */ });
}
function removeIndex(snap) {
client.deleteDocument(this.index, this.type, snap.name(), function(error, data) {
if( error ) console.error('failed to delete', snap.name(), error);
else console.log('deleted', snap.name());
});
}
Query the index when you want to do a search:
<script src="elastic.min.js"></script>
<script src="elastic-jquery-client.min.js"></script>
<script>
ejs.client = ejs.jQueryClient('http://localhost:9200');
client.search({
index: 'firebase',
type: 'widget',
body: ejs.Request().query(ejs.MatchQuery('title', 'foo'))
}, function (error, response) {
// handle response
});
</script>
There's an example, and a third party lib to simplify integration, here.
I believe you can do:
admin
.database()
.ref('/vals')
.orderByChild('name')
.startAt('cho')
.endAt("cho\uf8ff")
.once('value')
.then(c => res.send(c.val()));
This will find vals whose name starts with "cho".
The ElasticSearch solution basically binds to add, set, and del, and offers a get with which you can accomplish text searches.
It then saves the contents in MongoDB.
While I love and recommend ElasticSearch for the maturity of the project, the same can be done without another server, using only the Firebase database.
That's what I mean:
(https://github.com/metaschema/oxyzen)
for the indexing part, basically the function:
JSON stringifies a document.
removes all the property names and JSON syntax to leave only the data (regex).
removes all xml tags (therefore also html) and attributes (remember the old guidance, "data should not be in xml attributes") to leave only the pure text if xml or html was present.
removes all special chars and substitutes them with a space (regex).
substitutes all instances of multiple spaces with one space (regex).
splits on spaces and cycles:
for each word, adds refs to the document in some index structure in your db that basically contains children named with words, with children named with an escaped version of "ref/inthedatabase/dockey".
then inserts the document as a normal firebase application would do.
In the oxyzen implementation, subsequent updates of the document actually read the index and update it, removing the words that no longer match and adding the new ones.
Subsequent searches for a word can directly find documents under that word's child. Multiple-word searches are implemented using hits.
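A hedged sketch of that indexing idea using the Firebase Admin SDK (assuming the Admin SDK is already initialized; the searchIndex path and tokenizing rules here are simplifications for illustration, not the actual oxyzen implementation):
const admin = require('firebase-admin');

function tokenize(doc) {
  return JSON.stringify(doc)
    .replace(/"[^"]*":/g, ' ')       // drop property names, keep values
    .replace(/<[^>]*>/g, ' ')        // drop xml/html tags
    .replace(/[^a-zA-Z0-9]+/g, ' ')  // special chars become spaces
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean);                // drop empty strings
}

async function indexDocument(docKey, doc) {
  const updates = {};
  for (const word of tokenize(doc)) {
    // one child per word, each holding refs to the documents that contain it
    updates['searchIndex/' + word + '/' + docKey] = true;
  }
  await admin.database().ref().update(updates);
}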
SQL"LIKE" operation on firebase is possible
let node = await db.ref('yourPath').orderByChild('yourKey').startAt('!').endAt('SUBSTRING\uf8ff').once('value');
This query works for me; it is like the statement below in MySQL:
select * from StoreAds where University LIKE '%ps%';
query = database.getReference().child("StoreAds").orderByChild("University").startAt("ps").endAt("\uf8ff");