MongoDB bulk insert limit issue - JavaScript

I'm new to Mongo and Node. I was trying to upload a CSV into MongoDB.
The steps are:
Reading the CSV.
Converting it into JSON.
Pushing it to MongoDB.
I used the 'csvtojson' module to convert the CSV to JSON and pushed it with this code:
MongoClient.connect('mongodb://127.0.0.1/test', function (err, db) { // connect to mongodb
    var collection = db.collection('qr');
    collection.insert(jsonObj.csvRows, function (err, result) {
        console.log(JSON.stringify(result));
        console.log(JSON.stringify(err));
    });
    console.log("successfully connected to the database");
    //db.close();
});
This code works fine with CSVs up to about 4 MB; beyond that it does not work.
I tried to log the error with
console.log(JSON.stringify(err));
and it returned {}.
Note: mine is a 32-bit system.
Is it because of a 4 MB document limit on 32-bit systems?
I'm in a scenario where I can't restrict the size and number of attributes in the CSV file (i.e., the code will be handling various kinds of CSV files). So how do I handle that? Are there any modules available?

If you are not having a problem parsing the CSV into JSON, which presumably you are not, then perhaps just restrict the size of the list being passed to insert.
Since the .csvRows element is an array, rather than sending all of the elements at once, slice it up and batch the elements in the call to insert. It seems likely that the number of elements is the cause of the problem rather than the size, so splitting the array into several inserts rather than one should help.
Experiment with 500, then 1000, and so on until you find a happy medium.
Sort of coding it:
var batchSize = 500;
for (var i = 0; i < jsonObj.csvRows.length; i += batchSize) {
    // slice's end index is exclusive, so use i + batchSize, not i + (batchSize - 1)
    var docs = jsonObj.csvRows.slice(i, i + batchSize);
    collection.insert(docs, function (err, result) {
        // Also don't JSON.stringify the error; log the object directly
        console.log(err);
        // Whatever
    });
}
And do the insert in chunks like this.
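On newer versions of the Node.js driver, the same batching idea can be written with insertMany and async/await. A minimal sketch, assuming the connected db and jsonObj.csvRows from the question:
// Minimal sketch, not the answer's original code: insert the rows in
// fixed-size slices, waiting for each batch before sending the next.
async function insertInBatches(db, rows, batchSize) {
    var collection = db.collection('qr');
    for (var i = 0; i < rows.length; i += batchSize) {
        await collection.insertMany(rows.slice(i, i + batchSize));
    }
}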

You can put the data into an array of documents and then simply pass that array to MongoDB's insert function.
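A minimal sketch of that, assuming docs is the array built from the CSV rows and collection is the target collection from the question:
// Sketch only: `docs` is assumed to be the full array of documents.
collection.insert(docs, function (err, result) {
    if (err) { return console.error(err); }
    console.log("insert finished");
});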

Related

PostgreSQL: INSERT Error while reading data from CSV File

I am inserting values from a CSV file into a PostgreSQL table. The code used to work fine earlier, but now that I'm on my local machine it refuses to, despite many different attempts.
const query =
    "INSERT INTO questions VALUES (DEFAULT,$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12)";
questionData.forEach((row) => {
    questions.push(
        db.query(query, row).catch((err) => {
            console.log(err);
        })
    );
});
This is my insertion logic; questionData just holds every row of the CSV file, and questions is the array of promises that I Promise.all() at the end.
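Roughly, that last step is just:
// Simplified sketch of the Promise.all step described above (not shown in full in the question).
Promise.all(questions)
    .then(() => console.log("all rows inserted"))
    .catch((err) => console.log(err));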
The Error I get is in this link
I am going crazy trying to fix this. I have changed absolutely nothing in the backend, and my CSV files have only 12 rows, which are the ones I'm trying to insert.
Edit:
What is 5+3,qwerty,mcq,chemistry,cat,cat,easy,2001,FALSE,nah,{8},"{7,2,8,3}"
What is 5+4,qwerty,mcq,maths,cat,cat,easy,2002,FALSE,nah,{9},"{7,9,5,3}"
What is 5+5,qwerty,mcq,physics,cat,cat,easy,2003,FALSE,nah,{10},"{7,2,10,3}"
What is 5+6,qwerty,mcq,chemistry,cat,cat,easy,2004,FALSE,nah,{11},"{11,2,5,3}"
What is 5+7,qwerty,mcq,maths,cat,cat,easy,2005,FALSE,nah,{12},"{7,2,12,3}"
What is 5+8,qwerty,mcq,physics,cat,cat,easy,2006,FALSE,nah,{13},"{13,2,5,3}"
What is 5+9,qwerty,mcq,chemistry,cat,cat,easy,2007,FALSE,nah,{14},"{7,14,5,3}"
What is 5+10,qwerty,mcq,maths,cat,cat,easy,2008,FALSE,nah,{15},"{7,2,15,3}"
This is my CSV
The error states that you're trying to insert more values than there are columns in your table.
I see your insert statement has DEFAULT as the first parameter... what is that about?
If your target table has only 12 columns, then you should be inserting the following:
INSERT INTO questions VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12)
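If the DEFAULT was intended for an auto-generated id column, it also helps to name the target columns explicitly so the value count can never drift out of sync with the table. A sketch with placeholder column names (the real names would come from the table definition):
// Placeholder column names only; substitute the actual schema of `questions`.
const explicitQuery =
    "INSERT INTO questions (col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12) " +
    "VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12)";
questions.push(db.query(explicitQuery, row).catch((err) => console.log(err)));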

MongoImport csv combine/concat various columns to one array for import

I have another interesting case which I have never faced before, so I'm asking the SO community for help and also sharing my experience with it.
The case || What we have:
A CSV file (exported from another SQL DB) with the following structure
(headers):
ID,SpellID,Reagent[0],Reagent[1..6],Reagent[7],ReagentCount[0],ReagentCount[1..6],ReagentCount[7]
You could also check the full CSV data file here, at my
dropbox
My gist from GitHub, which helps you understand how MongoImport works.
What we need:
I'd like to receive such a structure (schema) to import into a MongoDB collection:
ID(Number),SpellID(Number),Reagent(Array),ReagentCount(Array)
6,898,[878],[1]
with ID, SpellID, and two arrays: in the first we store all Reagent IDs, like [0,1,2,3,4,5,6,7], taken from all the Reagent[n] columns, and the second array, of the same length, holds the quantity of each Reagent ID, taken from all the ReagentCount[n] columns.
OR
Transposed objects with this structure (schema):
ID(Number),SpellID(Number),ReagentID(Number),Quantity/Count(Number)
80,2675,1,2
80,2675,134,15
80,2675,14,45
As you can see, the difference between the first example and this one is that every document in the collection represents one ReagentID and its quantity for a SpellID. So if one Spell_ID has N different reagents, there will be N documents in the collection, since we know there can't be more than 7 unique Reagent_IDs belonging to one Spell_ID according to our CSV file.
I am working on this problem right now with the help of Node.js and npm's csv package (or any other module for parsing CSV files), just to make my CSV file available for importing into my DB via Mongoose. I'll be very thankful to anyone who can contribute something relevant to this case. In any event, I will solve this problem eventually and share my solution in this question.
As for the first variant, I guess there should be a one-time script for MongoImport that could concat all the columns from Reagent[n] & ReagentCount[n] into two separate arrays like I mentioned above, via --fields, but unfortunately I don't know how, and there are no examples relevant to it on SO or in the official Mongo docs. So if you have enough experience with MongoImport, feel free to share it.
Finally, I solved my problem the way I wanted to, but without using mongoimport.
I used npm's csv package and wrote a function for parsing my CSV file. In short:
const fs = require('fs');
const csv = require('csv');

async function FuncName (path) {
    try {
        let eva = fs.readFileSync(path, 'utf8');
        csv.parse(eva, async function (err, data) {
            // console.log(data[0]); // data[0] holds the headers, if they exist
            for (let i = 1; i < data.length; i++) { // start from 1 because row 0 is the headers; without headers, start from 0
                console.log(data[i][34]); // i is the row number and 34 is the column index
            }
        });
    } catch (err) {
        console.log(err);
    }
}
It loops over the CSV file and exposes the data as arrays, which lets you work with them however you want.
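Building on that, here is a minimal sketch of turning each parsed row into the first target shape (ID, SpellID, Reagent array, ReagentCount array). The column positions and the Spell Mongoose model are assumptions based on the header layout above, not the real schema:
// Hypothetical sketch: column offsets and the `Spell` model are assumptions.
function rowToDoc(row) {
    const reagents = [];
    const counts = [];
    for (let j = 0; j < 8; j++) {
        reagents.push(Number(row[2 + j]));   // Reagent[0..7] columns
        counts.push(Number(row[10 + j]));    // ReagentCount[0..7] columns
    }
    return { ID: Number(row[0]), SpellID: Number(row[1]), Reagent: reagents, ReagentCount: counts };
}
// Usage inside the csv.parse callback:
// await Spell.create(rowToDoc(data[i]));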

In sails/waterline get maximum value of a column in a database agnostic way

While using Sails as an ORM (version 1.0), I noticed that there is a function called Model.avg (as well as sum). However, there is no max or min function to get the maximum or minimum of a column in a model; does that mean this is considered unnecessary because it is already covered by other functions?
Now in my database I need to get the maximum id in a list, and I have it working for PostgreSQL by using a native query:
const maxnum = await Order.getDatastore().sendNativeQuery('SELECT MAX(\"orderNr\") FROM \"order\"')
While this isn't the most difficult thing, it is not what I truly want: it is limited to SQL-based datastores (so we wouldn't be able to move easily to MongoDB), and the syntax might differ for another SQL database type.
So I wonder: can this be transformed in such a way that it doesn't rely on sendNativeQuery?
You can try .query() to execute a raw SQL query using the specified model's datastore, and if you want you can try pg, an NPM package used for communicating with PostgreSQL databases:
Pet.query('SELECT pet.name FROM pet WHERE pet.name = $1', [ 'dog' ],
    function (err, rawResult) {
        if (err) { return res.serverError(err); }
        sails.log(rawResult);
        // (result format depends on the SQL query that was passed in,
        // and the adapter you're using)
        // Then parse the raw result and do whatever you like with it.
        return res.ok();
    });
You can use the limit and sort options Waterline provides to get a single model with the maximal value (then just extract that value).
const orderModels = await Order.find({
    where: {},
    select: ['orderNr'],
    limit: 1,
    sort: 'orderNr DESC'
});
// .find() resolves to an array, so the maximum is on the first (and only) record
console.log(orderModels[0].orderNr);
Like most things in Waterline, it's probably not as efficient as an SQL SELECT MAX query (or its equivalent in Mongo, etc.), but it should allow swapping out the database with no maintenance. Last note: don't forget to handle the case of no models being found.
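A minimal sketch of that empty-result guard, using the same assumed Order model:
// Sketch only: fall back to a default when the table is empty.
const records = await Order.find({ select: ['orderNr'], limit: 1, sort: 'orderNr DESC' });
const maxOrderNr = records.length ? records[0].orderNr : 0;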

How to send multiple images as a response in Node.js

I have stored several images in MongoDB like this:
exports.save = function (input, image, callback)
{
    db.collection("articles", function (error, collection)
    {
        collection.save(input, {safe: true}, callback);
    });
}
I have retrieved all the images from the DB:
find().toArray(function (err, results)
{
    console.log(results.length); // length is 5
    for (var i = 0; i < results.length; i++)
    {
        res.contentType(results[i].imageType);
        res.end(results[i].image.buffer, "binary");
    }
});
How do I convert these images to binary and send them in the response to the client? I tried a for loop but I got the error "can't set headers after they are sent". How do I solve this?
There is no way to send multiple images within one single response, I guess. But maybe you could try to merge those images somehow within your loop, and then send one big image which consists of your N images merged together.
I have just had a similar problem. You need to pass your results array to a Jade template (for example), or manually create an HTML page with links to the images that you find (not the data itself), i.e., virtual links to the actual images.
res.render('image_list.jade', {results:results});
Then your template might look like:
block content
  div.container
    header
      h1 Images in a list
    div#images
      - each result in results
        div.myimage
          img(src='http://localhost:3000/oneimage/' + result._id)
In your app.js you would need to provide a route /oneimage that returns a single image (its data), as you have done in your code (i.e., replace the find().toArray(...) with findOne(..., id, ...)).
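A minimal sketch of what such a route could look like; the Express wiring, the ObjectId lookup, and the error handling are assumptions, while the imageType/image fields follow the question's documents:
// Hypothetical sketch of the single-image route described above.
const { ObjectId } = require('mongodb'); // assumes a driver version that exports ObjectId
app.get('/oneimage/:id', function (req, res) {
    db.collection('articles').findOne({ _id: new ObjectId(req.params.id) }, function (err, doc) {
        if (err || !doc) { return res.status(404).end(); }
        res.contentType(doc.imageType);
        res.end(doc.image.buffer, 'binary');
    });
});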

MongoDB: Javascript out of memory

I'm generating some random data in order to store it in MongoDB.
By generating a lot of data and storing it in an array first (to separate the generation from the inserting, for measurement), an out-of-memory error occurs.
The code:
for (i = 0; i < amount; i++)
{
    doc = {starttime: get_datetime(), endtime: get_datetime(), tos: null, sourceport: get_port(), sourcehost: get_ip(), duration: get_duration(), destinationhost: get_ip(), destinationport: get_port(), protocol: get_protocol(), flags: get_flags(), packets: get_packets()};
    docs[i] = doc;
}
I chose, e.g., amount = 10,000,000.
All the functions look like:
function get_flags()
{
    var tmpstring = Math.floor((Math.random() * 8) + 1);
    return tmpstring;
}
How does such an error occur? How can I solve the problem?
How does such an error occur? The docs array needs memory, so adding 10 million entries would mean using up (for example) 100 x 10 million bytes (if each doc entry is 100 bytes), which is 1 GB of memory.
Proposed solution: try running the generate-and-insert cycle in batches of, say, 1000 entries. So generate 1000 docs, save them, and reuse the array for the next 1000 docs, and so on.
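A minimal sketch of that batching loop, assuming the get_* generator functions from the question and an already-connected collection:
// Sketch of the proposed batching (the generator functions and `collection` are assumed from the question).
var batchSize = 1000;
var docs = [];
for (var i = 0; i < amount; i++) {
    docs.push({starttime: get_datetime(), endtime: get_datetime(), flags: get_flags(), packets: get_packets() /* ...remaining fields as above... */});
    if (docs.length === batchSize || i === amount - 1) {
        collection.insert(docs, function (err) { if (err) console.log(err); });
        docs = []; // start a fresh batch so memory stays bounded
    }
}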
