Stuck script when saving stringified object using big json library - javascript

In my app I have a big object (>600 MB) that I'd like to stringify and save to a JSON file. Because of its size I'd like to use the big-json library and its json.createStringifyStream method, which returns a stream of the stringified object. My script looks like this:
import fs from 'fs'
import json from 'big-json'

const fetchAt = new Date();
const myData = // big object >600MB
const myDataFileWriteStream = fs.createWriteStream('./myfile.json');

json.createStringifyStream({body: myData})
  .pipe(myDataFileWriteStream)
  .on('finish', function () {
    const writeAt = new Date();
    console.log(`Data written in ${(writeAt - fetchAt) / 1000} seconds.`); // this line is never printed out
  });
When I run it I can see that it's saving data for some time. After that it freezes and stops doing anything, but the script doesn't finish:
It doesn't print the log hooked on the finish event
The file is much smaller than I expected
The script doesn't exit
I ran it for about 30 minutes with no sign of it finishing, and it uses 0% CPU. Do you spot any problem? What might be the cause?
Edit:
I have a working example - please clone https://github.com/solveretur/big-json-test and run npm start. After ~15 minutes it should freeze and never finish. I use Node 14.
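Nothing above pins down the cause, but .pipe() does not forward errors between streams, so a failure in either the stringifier or the file stream can stall the pipeline silently. A minimal diagnostic sketch, assuming the same big-json pipeline as above, that attaches 'error' handlers to both ends:

import fs from 'fs'
import json from 'big-json'

const fetchAt = new Date();
const myDataFileWriteStream = fs.createWriteStream('./myfile.json');

json.createStringifyStream({ body: myData })
  .on('error', (err) => console.error('stringify stream error:', err)) // surfaces failures in the stringifier
  .pipe(myDataFileWriteStream)
  .on('error', (err) => console.error('write stream error:', err))     // surfaces failures in the file stream
  .on('finish', () => {
    console.log(`Data written in ${(new Date() - fetchAt) / 1000} seconds.`);
  });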

Related

Why is gzip not downloading fast in the browser (how can I solve it)?

I'm having an issue with the download time of gzip files in the browser. I'm really not sure why it takes so long to download these files: more than 30 seconds to download all the .gz files. How can I solve it in the JS code? On a local server the files download really fast, but when I try on the live server it takes more than 30 seconds. In total I have 12 gzip files.
Here is part of the code I'm using:
import Kuroshiro from "kuroshiro"
import KuromojiAnalyzer from "../node_modules/kuroshiro-analyzer-kuromoji/dist/kuroshiro-analyzer-kuromoji"
async function furikuro() {
  console.time('load speed');
  const kuroshiro = new Kuroshiro();
  // Initialize
  // This uses async/await; you could also use Promises
  await kuroshiro.init(new KuromojiAnalyzer({ dictPath: "/wordpress/wp-content/themes/colorssite/node_modules/kuromoji/dict/" }));
  // Convert what you want
  const result = await kuroshiro.convert("甘い感じ取れたら手を繋ごう、重なるのは人生のライン and レミリア最高!", { to: "hiragana" });
  console.log(result);
  console.timeEnd('load speed');
}

furikuro();

Is it possible to write text in the middle of a file with fs.createWriteStream ? (or in nodejs in general)

I'm trying to write into a text file, but not at the end like appendFile() does, and not by replacing the entire content...
I saw it was possible to choose where you want to start with the start parameter of fs.createWriteStream() -> https://nodejs.org/api/fs.html#fs_fs_createwritestream_path_options
But there is no parameter to say where to stop writing, right? So it removes the whole end of my file after the point where I wrote with this function.
const fs = require('fs');

var logger = fs.createWriteStream('result.csv', {
  flags: 'r+',
  start: 20 // start writing at the 20th character
});

logger.write('5258,525,98951,0,1\n'); // example: a new line to write
Is there a way to specify where to stop writing in the file, to end up with something like:
....
data from begining
....
5258,525,98951,0,1
...
data till the end
...
I suspect you mean, "Is it possible to insert in the middle of the file." The answer to that is: No, it isn't.
Instead, to insert, you have to:
Determine how big what you're inserting is
Copy the data at your insertion point to that many bytes later in the file
Write your data
Obviously when doing #2 you need to be sure that you're not overwriting data you haven't copied yet (either by reading it all into memory first or by working in blocks, from the end of the file toward the insertion point).
(I've never looked for one, but there may be an npm module out there that does this for you...)
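A minimal sketch of that block-by-block approach, using Node's synchronous fs APIs; insertIntoFile is a hypothetical helper, not an existing module:

const fs = require('fs');

// Hypothetical helper: insert insertBuf at byte offset position in the file at path,
// shifting the tail of the file toward the end, block by block, starting from the
// end of the file so no byte is overwritten before it has been copied.
function insertIntoFile(path, position, insertBuf, blockSize = 64 * 1024) {
  const fd = fs.openSync(path, 'r+');
  const { size } = fs.fstatSync(fd);
  const shift = insertBuf.length;
  const block = Buffer.alloc(blockSize);

  // Walk backwards over the tail [position, size) in blocks.
  let end = size;
  while (end > position) {
    const start = Math.max(position, end - blockSize);
    const length = end - start;
    fs.readSync(fd, block, 0, length, start);          // read a block from its old place
    fs.writeSync(fd, block, 0, length, start + shift); // write it back shift bytes later
    end = start;
  }

  // The gap [position, position + shift) is now free to hold the new data.
  fs.writeSync(fd, insertBuf, 0, shift, position);
  fs.closeSync(fd);
}

// Example: insert a CSV row at byte offset 20 of result.csv
insertIntoFile('result.csv', 20, Buffer.from('5258,525,98951,0,1\n'));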
You could read and parse your file first, then apply the modifications and save the new file.
Something like:

const fs = require("fs");

const fileData = fs.readFileSync("result.csv", { encoding: "utf8" });
const fileDataArray = fileData.split("\n");

const newData = "5258,525,98951,0,1";
const index = 2; // row index at which to insert the new line

fileDataArray.splice(index, 0, newData); // insert data into the array
const newFileData = fileDataArray.join("\n"); // build the new file content
fs.writeFileSync("result.csv", newFileData, { encoding: "utf8" }); // save it

How do I save an array to a file and manipulate it from within my code?

This is in p5.js, which includes most JavaScript functions!
I am trying to make a save file for my game. By this I mean: the user presses the save button in my game, it updates an array that is saved in a file included in the game package, and the player keeps playing. How would I do something like this (creating files that can be accessed by my code and changed)?
var SM = {
  // save files
  sf1: [[1,0,0,0,0],
        [0,0,0,0,0],
        [0,0,0,0,0]],
  sf2: [[1,0,0,0,0],
        [0,0,0,0,0],
        [0,0,0,0,0]],
  sf3: [[1,0,0,0,0],
        [0,0,0,0,0],
        [0,0,0,0,0]]
};
One more thing (FOR PROCESSING CODERS FROM HERE ON): I tried to use Processing functions like saveStrings() and loadStrings(), but I couldn't get saveStrings() to save to a specific location, nor could I properly load a txt file. Here is the code I used for that:
var result;

function preload() {
  result = loadStrings('assets/nouns.txt');
}

function setup() {
  background(200);
  var ind = floor(random(result.length));
  text(result[ind], 10, 10, 80, 80);
}
I had a folder called assets within the sketch folder, and assets had a txt file called nouns with strings in it (downloaded from saveStrings() then manually moved), but the sketch won't go past the loading screen.
If you are running it from a browser, you can't save or load a file the way you want, period. Saving and loading files in browser JavaScript requires user interaction, and the user gets to pick the file and where it is saved.
If you want to save it locally, instead of trying to write it to a file, you should write and read it from localStorage, which you can then do just fine.
// save
localStorage.setItem('saveData', data);
// load
const data = localStorage.getItem('saveData');
If it is somehow a game run directly on the client (out of the browser), like written in Node.js, then you'd want to use the fs functions.
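For that Node.js case, a minimal sketch using the built-in fs module (the savefile.json path is just a placeholder):

const fs = require('fs');

// Save: serialize the save-file object to disk
fs.writeFileSync('savefile.json', JSON.stringify(SM));

// Load: read it back and parse it
const loaded = JSON.parse(fs.readFileSync('savefile.json', 'utf8'));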
To expand a bit, if you have your save data as an object:
const saveData = {
state: [1,2,3],
name: 'player'
};
Then to save it, you would simply call:
localStorage.setItem('saveData', JSON.stringify(saveData));
You'll want to stringify it when you save it so it round-trips properly. To read it back, use getItem():
const data = JSON.parse(localStorage.getItem('saveData') || '{}');
(That extra || '{}' bit will handle if it hasn't been saved before and give you an empty object.)
It's actually much easier than trying to write a JavaScript file that you would then read in. Even if you were writing a file, you'd probably want to write it as JSON, not JavaScript.
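To tie this back to the save button in the question, a minimal sketch using p5's mousePressed(); the button's bounding box here is hypothetical:

function saveGame() {
  localStorage.setItem('SM', JSON.stringify(SM));
}

function loadGame() {
  var stored = localStorage.getItem('SM');
  if (stored) {
    SM = JSON.parse(stored);
  }
}

function mousePressed() {
  // hypothetical hit test for a save button drawn at (10, 10) with size 80x30
  if (mouseX > 10 && mouseX < 90 && mouseY > 10 && mouseY < 40) {
    saveGame();
  }
}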
In order to save strings into a file in JavaScript, I would recommend this previous StackOverflow question, which provides a link to a very clear and easy-to-use library to manage files in JavaScript.

Fast rising memory when starting webworker in node.js with "big" data

I have the typical code to start a webworker in node:
var Threads = require('webworker-threads');

var worker = new Threads.Worker(__dirname + '/workers/myworker.js');

worker.onmessage = function (event) {
  // 1.
  // ... create and execute cypher query ...
};

// Start the worker.
worker.postMessage({
  'data': data
});
At 1. I send small pieces of processed data to a Neo4J db.
For small data this works perfectly fine, but when the data gets slightly bigger node/the worker starts to struggle.
The actual data I want to process is a csv I parsed with BabyParse resulting in an object with 149000 properties where each has another 17 properties. (149000 rows by 17 columns = 2533000 properties). The file is 17MB.
When doing this, node allocates a lot of memory and eventually crashes at around 53% memory usage. The machine has 4 GB.
The worker looks roughly like this:
self.onmessage = function (event) {
  process(event.data.data);
};

function process(data) {
  for (var i = 0; i < data.length; i++) {
    self.postMessage({
      'properties': data[i]
    });
  }
}
I tried to chunk the data and process it chunkwise within the worker which also works okay. But I want to generate a graph and to process the edges I need the complete data, because I need to check every row (vertex) against all others.
Is there a way to stream the data into the worker? Or does anyone have an idea why node allocates so much memory when only 17 MB of data is being sent?
Instead of parsing the data in the main thread you can also pass the filename as a message to the worker and have the worker load it from disk. Otherwise you will have all the data in memory twice, once in the host and once in the worker.
A different option would be to use the csv npm package with the streaming parser. postMessage the lines as they come in and buffer them up till the final result in the worker.
Why your solution tries to allocate those enormous amounts of memory I don't know. I do know postMessage is intended to pass small messages.
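A minimal sketch of the second option, assuming the csv-parse streaming parser (its import style differs between versions) and the same webworker-threads setup as above; data.csv is just a placeholder file name:

var fs = require('fs');
var parse = require('csv-parse');            // v4-style import; newer versions export { parse }
var Threads = require('webworker-threads');

var worker = new Threads.Worker(__dirname + '/workers/myworker.js');

fs.createReadStream(__dirname + '/data.csv')
  .pipe(parse({ columns: true }))
  .on('data', function (row) {
    worker.postMessage({ row: row });        // one small message per CSV line
  })
  .on('end', function () {
    worker.postMessage({ done: true });      // tell the worker the stream has finished
  });

// workers/myworker.js - buffer rows until the 'done' marker arrives
var rows = [];
self.onmessage = function (event) {
  if (event.data.done) {
    process(rows);                           // the complete data set is now available
  } else {
    rows.push(event.data.row);
  }
};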

MarkLogic JavaScript scheduled task

I'm trying to schedule a script using 'Scheduled Tasks' in MarkLogic 8. The documentation explains this a bit, but only for XQuery.
Now I have a JavaScript file I'd like to schedule.
The error in the log file:
2015-06-23 19:11:00.416 Notice: TaskServer: XDMP-NOEXECUTE: Document is not of executable mimetype. URI: /scheduled/cleanData.js
2015-06-23 19:11:00.416 Notice: TaskServer: in /scheduled/cleanData.js [1.0-ml]
My script:
/* Scheduled script to delete old data */
var now = new Date();
var yearBack = now.setDate(now.getDate() - 65);
var date = new Date(yearBack);

var b = cts.jsonPropertyRangeQuery("Dtm", "<", date);
var c = fn.subsequence(cts.uris("", [], b), 1, 10);

while (true) {
  var uri = c.next();
  if (uri.done == true) {
    break;
  }
  xdmp.log(uri.value, "info"); // log for testing
}
Try the *.sjs extension (Server-side JavaScript).
The *.js extension is used for static JavaScript resources returned to the client rather than executed on the server.
Hoping that helps,
I believe that ehennum found the issue for you (the extension), which is what the mime-type error is complaining about.
However, on the same subject, not all items in ML work quite as you would expect for Server-side JavaScript. For example, using sjs as the target of a trigger did not work until recently (or still doesn't). So for things like that, it is also possible to wrap the sjs call inside XQuery using xdmp:invoke.
