How to process millions of lines in a file inside the Mongo shell with JavaScript?

Let's say my file has the following content and is located at /home/usr1/Documents/companyNames.txt:
Name1
Name 2
Name 3
Millions of names...
I tried the following code:
$> var string = cat('/home/usr1/Documents/companyNames.txt');
$> string = string.split('\n');
$> db.records.find({field: {$in: string}});
This follows the code from "Can I read a csv file inside of a Mongo Shell Javascript file?".
It works if the file is small, but when the file has millions of lines it fails: the whole file is pulled into memory at once and the shell crashes. Is there any other way to process big files inside the Mongo shell with JavaScript?

Mongo isn't very good with large queries.
You might have to go the JavaScript way:
var names = cat('/home/usr1/Documents/companyNames.txt');
names = names.split('\n');
let results = [];
names.forEach(name => results.push(db.records.findOne({field: {$eq: name}})));
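If firing one query per name is too slow, a middle ground is to batch the lookups into smaller $in queries. A minimal sketch, assuming the name list still fits in memory once read with cat() and that a chunk size of 10000 is workable (both are assumptions, tune for your data):
var names = cat('/home/usr1/Documents/companyNames.txt').split('\n');
var chunkSize = 10000;          // assumed batch size
var matches = [];
for (var i = 0; i < names.length; i += chunkSize) {
    var chunk = names.slice(i, i + chunkSize);
    // one $in query per chunk instead of one query per name
    db.records.find({field: {$in: chunk}}).forEach(function (doc) {
        matches.push(doc);
    });
}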

Related

Is it possible for testcafe to read a variable from another file?

I want to create some automation where the user places a file with some details, like their address, and my automation script picks it up.
I'm comfortable using const in TestCafe for local variables, but is there a way, if I put a file in, say, address.txt, for it to pick up line1, line2, line3, etc.?
So I see the code looking like this:
const line1;
const line2;
const line3;
and in Address.txt we'd have
$line1 = '1 hello world way'
$line2 = 'line 2'
$line3= 'line 3'
What am I missing to plug it all together? Thanks.
You can use the native facilities of NodeJS. Please refer to the example from our documentation for more details.
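For example, a minimal sketch using Node's fs module inside a TestCafe test, assuming address.txt holds one value per line (the file name, page URL and selectors are illustrative, not taken from the TestCafe docs):
import fs from 'fs';

// read the file once and split it into lines
const [line1, line2, line3] = fs.readFileSync('address.txt', 'utf8').split('\n');

fixture('Address form')
    .page('https://example.com/address-form'); // hypothetical page under test

test('fills in the address', async t => {
    await t
        .typeText('#address-line-1', line1)    // hypothetical selectors
        .typeText('#address-line-2', line2)
        .typeText('#address-line-3', line3);
});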
This would be better done through JSON. Here is a high-level approach you could take:
Keep an Excel file to collect data from users (say, an address).
Use xlsx-to-json to convert the Excel data to JSON.
Refer to this JSON in your script: const dataFile = require('./data.json');
Parse the JSON and do whatever is needed (see the sketch below).
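A minimal sketch of the last two steps, assuming xlsx-to-json produced keys named line1, line2 and line3 (the key names and file shape are assumptions):
// data.json (produced by xlsx-to-json) might look like:
// [{ "line1": "1 hello world way", "line2": "line 2", "line3": "line 3" }]
const dataFile = require('./data.json');
const { line1, line2, line3 } = dataFile[0]; // first row of the converted sheet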

Pentaho/Kettle - JavaScript or Java that gets file names older than a specified date

Please excuse the rookie question as I'm not a programmer :)
We're using Pentaho 8
I'm looking for a way to have JavaScript or Java read a directory and return the file names of any files that are older than a date provided by a Pentaho parameter.
Here is what I currently have using a Modified Java Script Value step that only lists the directory contents:
var _getAllFilesFromFolder = function(dir) {
    var filesystem = require("fs");
    var results = [];
    filesystem.readdirSync(dir).forEach(function(file) {
        file = dir + '\\' + file;
        var stat = filesystem.statSync(file);
        if (stat && stat.isDirectory()) {
            results = results.concat(_getAllFilesFromFolder(file));
        } else {
            results.push(file);
        }
    });
    return results;
};
Is JavaScript/Java the right way to do this?
There's a step called "Get file names". You just need to provide the path you want to poll. It also allows doing so recursively, shows only filenames that match a given filter, and in the Filters tab lets you show only folders, only files, or both.
nsousa's answer would be the easiest; after you get your file list, you can use a Filter rows step on the lastmodifiedtime returned by Get file names. Two steps, three if you want to format the returned date/time into something easier to sort/filter on. This is the approach I use, and it's generally faster than the transformations can keep up with.

Import JSON with mongo script

I'm trying to write a mongo script to import a JSON array from a JSON file. My script is in .js format and I execute it with the load() command in the mongo shell. Is it possible to do the import with a mongo script?
I know I can use mongoimport instead. But I want to know a way to do it with a script.
The contents of my current script, in which the import part is missing, are given below:
var db = connect("localhost:27017/fypgui");
//Import json to "crimes" collection here
var crimes = db.crimes.find();
while (crimes.hasNext()) {
    var item = crimes.next();
    var year = (item.crime_date != null) ? (new Date(item.crime_date)).getFullYear() : null;
    db.crimes.update({_id: item._id}, {$set: {crime_year: year}});
}
There is another answer to this question. Even though it's a bit old I am going to respond.
It is possible to do this with the mongo shell.
You can convert your JSON to valid JavaScript by prefixing it with var myData= then use the load() command to load the JavaScript. After the load() you will be able to access your data from within your mongo script via the myData object.
data.js
var myData =
[
    {
        "letter" : "A"
    },
    {
        "letter" : "B"
    },
    {
        "letter" : "C"
    }
]
read.js
#!/usr/bin/mongo --quiet
// read data
load('data.js');
// display letters
for (var i in myData) {
    var doc = myData[i];
    print(doc.letter);
}
For writing JSON it is easiest to just load your result into a single object. Initialize it at the beginning with var result = [] and then use printjson() at the end to output it. Use standard redirection to send the output to a file.
write.js
#!/usr/bin/mongo --quiet
var result=[];
// read data from collection etc...
for (var i = 65; i < 91; i++) {
    result.push({letter: String.fromCharCode(i)});
}
// output
print("var myData=");
printjson(result);
The shebang lines (#!) will work on a Unix-type operating system (Linux or macOS); they should also work on Windows with Cygwin.
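For example, the round trip (letting write.js generate data.js through standard redirection, then loading it from read.js) would look something like this, assuming mongo is on your PATH:
mongo --quiet write.js > data.js
mongo --quiet read.js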
It is possible to get a file's content as text using the undocumented cat() function from the mongo shell:
var text = cat(filename);
If you are interested, cat() and other undocumented utilities such as writeFile are defined in this file: shell_utils_extended.cpp
With the file's content in hand you can modify it as you wish, or pass it directly to JSON.parse to get a JavaScript object:
jsObj = JSON.parse(text);
But be careful: unfortunately, JSON.parse is not an equivalent of the mongoimport tool in terms of its JSON parsing abilities.
mongoimport is able to parse Mongo's extended JSON in canonical format. (Canonical-format files are created by bsondump and mongodump, for example. For more info on JSON formats see MongoDB extended JSON.)
JSON.parse does not support the canonical JSON format. It will read canonical-format input and return a JavaScript object, but the extended data type info present in canonical-format JSON will be ignored.
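For example, a document dumped in canonical format keeps type wrappers such as $oid and $date, and JSON.parse returns those wrappers as plain nested objects rather than an ObjectId or a Date (the values below are made up):
var text = '{"_id": {"$oid": "56fc4cd6e4b0d33f1ec7bf86"}, "when": {"$date": "2016-03-30T00:00:00Z"}}';
var obj = JSON.parse(text);
// obj._id is the plain object {"$oid": "..."}, not an ObjectId
// obj.when is the plain object {"$date": "..."}, not a Date
// mongoimport, by contrast, would store an ObjectId and an ISODate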
No, the mongo shell doesn't have the capability to read and write from files like a fully-fledged programming environment. Use mongoimport, or write the script in a language with an official driver. Node.js will have syntax very close to the mongo shell, although Node.js is an async/event-driven programming environment. Python/PyMongo will be similar and easy to learn if you don't want to deal with structuring the logic to use callbacks.
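For comparison, a minimal sketch with the official Node.js driver, assuming a data.json file containing a JSON array (the connection string, database and collection names below are illustrative):
const fs = require('fs');
const { MongoClient } = require('mongodb');

async function run() {
    const client = await MongoClient.connect('mongodb://localhost:27017');
    // read and parse the whole file, then insert the array in one call
    const docs = JSON.parse(fs.readFileSync('data.json', 'utf8'));
    await client.db('fypgui').collection('crimes').insertMany(docs);
    await client.close();
}

run().catch(console.error);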
Hey, I know this is not relevant, but every time I need to import some JSON files into my Mongo DB I do some sloppy copy, paste and run, until I'd had enough!
If you suffer the same, I've written a tiny batch script that does that for me. Interested?
https://github.com/aminjellali/batch/blob/master/mongoImporter.bat
#echo off
title = Mongo Data Base importing tool
goto :main
:import_collection
echo importing %~2
set file_name=%~2
set removed_json=%file_name:.json=%
mongoimport --db %~1 --collection %removed_json% --file %~2
goto :eof
:loop_over_files_in_current_dir
for /f %%c in ('dir /b *.json') do call :import_collection %~1 %%c
goto :eof
:main
IF [%1]==[] (
ECHO FATAL ERROR: Please specify a data base name
goto :eof
) ELSE (
ECHO #author amin.jellali
ECHO #email a.j.amin.jellali@gmail.com
echo starting import...
call :loop_over_files_in_current_dir %~1
echo import done...
echo hope you enjoyed me
)
goto :eof

How do I get CSV data into NetSuite?

I've got an update to my question.
What I really wanted to know was this:
How do I get CSV data into NetSuite?
Well, it seems I use the CSV import tool to create a mapping and use this call to import the CSV: nlapiSubmitCSVImport(nlobjCSVImport).
Now my question is: How do I iterate through the object?!
That gets me half way - I get the CSV data, but I can't seem to find out how to iterate through it in order to manipulate the data. This is, of course, the whole point of a scheduled script.
This is really driving me mad.
@Robert H
I can think of a million reasons why you'd want to import data from a CSV. Billing, for instance. Or the various reports on data any company keeps; I wouldn't want to keep this in the file cabinet, nor would I really want to keep the file at all. I just want the data. I want to manipulate it and I want to enter it.
Solution Steps:
Step 1: To upload a CSV file we have to use a Suitelet script.
(Note: file - This field type is available only for Suitelets and will appear on the main tab of the Suitelet page. Setting the field type to file adds a file upload widget to the page.)
var fileField = form.addField('custpage_file', 'file', 'Select CSV File');
var id = nlapiSubmitFile(file);
Step 2: Let's prepare to call a Restlet script and pass the file id to it.
var recordObj = new Object();
recordObj.fileId = fileId;
// Format input for Restlets for the JSON content type
var recordText = JSON.stringify(recordObj);//stringifying JSON
// Setting up the URL of the Restlet
var url = 'https://rest.na1.netsuite.com/app/site/hosting/restlet.nl?script=108&deploy=1';
// Setting up the headers for passing the credentials
var headers = {};
headers['Content-Type'] = 'application/json';
headers['Authorization'] = 'NLAuth nlauth_email=amit.kumar2@mindfiresolutions.com, nlauth_signature=*password*, nlauth_account=TSTDRV****, nlauth_role=3';
(Note: nlapiCreateCSVImport: This API is only supported for bundle installation scripts, scheduled scripts, and RESTlets)
Let's call the Restlet using nlapiRequestURL:
// Calling Restlet
var output = nlapiRequestURL(url, recordText, headers, null, "POST");
Step 3: Create a mapping using Import CSV records, available at Setup > Import/Export > Import CSV records.
Step 4: Inside the Restlet script, fetch the file id from the Restlet parameter. Use the nlapiCreateCSVImport() API and set its mapping to the mapping id created in step 3. Set the CSV file using the setPrimaryFile() function.
var primaryFile = nlapiLoadFile(datain.fileId);
var job = nlapiCreateCSVImport();
job.setMapping(mappingFileId); // Set the mapping
// Set File
job.setPrimaryFile(primaryFile.getValue()); // Fetches the content of the file and sets it.
Step 5: Submit using nlapiSubmitCSVImport().
nlapiSubmitCSVImport(job); // We are done
There is another way we can get around this, although it is neither preferable nor something I would suggest, as it consumes a lot of API usage if you have a large number of records in your CSV file.
Let's say that we don't want to use the nlapiCreateCSVImport API, so let's continue from step 4.
Just fetch the file Id as we did earlier, load the file, and get its contents.
var fileContent = primaryFile.getValue();
Split the lines of the file, then subsequently split the words and store the values into separate arrays.
var splitLine = fileContent.split("\n"); // Splitting the file on the basis of lines.
for (var lines = 1, count = 0; lines < splitLine.length; lines++) {
    var words = splitLine[lines].split(","); // words stores all the words on a line
    for (var word = 0; word < words.length; word++) {
        nlapiLogExecution("DEBUG", "Words:", words[word]);
    }
}
Note: Make sure you don't have an additional blank line in your CSV file.
Finally, create the record and set field values from the arrays that we created above.
var myRec = nlapiCreateRecord('cashsale'); // Here you create the record of your choice
myRec.setFieldValue('entity', arrCustomerId[i]); // For example, arrCustomerId is an array of customer ID.
var submitRec = nlapiSubmitRecord(myRec); // and we are done
Fellow NetSuite user here. I've been using SuiteScript for a while now but never saw the nlobjCSVImport object nor nlapiSubmitCSVImport. I looked in the documentation, it shows up, but there is no page describing the details. Care to share where you got the doc from?
With the doc for the CSVImport object I might be able to provide some more help.
P.S. I tried posting this message as a comment but the "Add comment" link didn't show up for some reason. Still new to SOF
CSV to JSON:
convert csv file to json object datatable
https://code.google.com/p/jquery-csv/
If you know the structure of the CSV file, just do a for loop and map the fields to the corresponding nlapiSetValue.
Should be pretty straightforward.
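A minimal sketch of that loop in SuiteScript 1.0, assuming the file content has already been loaded as in the earlier answer and that the CSV columns are, say, a customer internal id followed by a memo (the column order, field ids and record type are assumptions):
var lines = fileContent.split('\n');
for (var i = 1; i < lines.length; i++) {       // start at 1 to skip the header row
    var cols = lines[i].split(',');
    var rec = nlapiCreateRecord('cashsale');   // record type of your choice
    rec.setFieldValue('entity', cols[0]);      // assumed: first column is the customer internal id
    rec.setFieldValue('memo', cols[1]);        // assumed: second column is a memo
    nlapiSubmitRecord(rec);
}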

Read JavaScript variables from a file with Python

I use some Python scripts on the server side to compile JavaScript files into one file and also add information about these files into a database.
For adding the information about the scripts I now have a YAML file for each JS file, with some info inside it looking like this:
title: Script title
alias: script_alias
I would like to throw away the YAML, which looks redundant here to me, if I can read these variables directly from the JavaScript, placed at the very beginning of the file like this:
var title = "Script title";
var alias = "script_alias";
Is it possible to read these variables with Python easily?
Assuming you only want the two lines, and they are at the top of the file...
import re

js = open("yourfile.js", "r").readlines()[:2]
matcher_rex = re.compile(r'^var\s+(?P<varname>\w+)\s+=\s+"(?P<varvalue>[\w\s]+)";?$')
for line in js:
    matches = matcher_rex.match(line)
    if matches:
        name, value = matches.groups()
        print(name, value)
Have you tried storing the variables in JSON format? Then both JavaScript and Python can easily parse the data and get the variables.
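For example, a shared metadata.json holding {"title": "Script title", "alias": "script_alias"} could be read on the JavaScript (Node.js) side like this; Python would read the same file with json.load (the file name is an assumption):
// metadata.json: { "title": "Script title", "alias": "script_alias" }
var meta = require('./metadata.json');
console.log(meta.title, meta.alias);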
