I'm a newbie to JavaScript & Node.js...
I have a file-processing stream which is composed of multiple streams piped together. It works great... I'd like to enhance this by conditionally stopping the stream from processing and reading once a threshold has been reached.
inputStream
  .pipe(unzip())
  .pipe(latin1Decoder)
  .pipe(combine(
    through(function (obj) {
      // part-1
      this.push(obj);
    }),
    through(function (obj) {
      // part-2
      this.push(obj);
    })
  ));
In part-1, if I do this.push(null), the combiner will ignore incoming input.
However, I can't stop reading the file, which is annoying because the file is pretty huge.
What is the best way for me to access the input stream from inside the pipe-line to close it?
OK, here is how I ended up solving it:
var cnt = 0;
var processingStream = combine(unzip(), latin1Decoder(), part1(), part2());
inputStream.pipe(processingStream);
inputStream.on('data', function (x) {
  var lineCnt = x.toString().split(/\r\n|\r|\n/).length;
  cnt += lineCnt;
  if (cnt > 5) {
    inputStream.close();
    inputStream.unpipe(processingStream);
    processingStream.end();
  }
});
Probably not the most efficient way, but it meets my requirements.
Please note that inputStream reads in blocks, so multiple lines are read at once.
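For reference, a minimal variation of the same idea, assuming inputStream is an fs read stream: newer Node versions expose destroy() on readable streams, which stops reading and releases the underlying file descriptor. Whether the combined through-streams need any extra cleanup depends on the combine/through libraries used above, so treat this as a sketch, not the one right way.
var cnt = 0;
var processingStream = combine(unzip(), latin1Decoder(), part1(), part2());

inputStream.pipe(processingStream);
inputStream.on('data', function (x) {
  cnt += x.toString().split(/\r\n|\r|\n/).length;
  if (cnt > 5) {
    inputStream.unpipe(processingStream); // stop feeding the pipeline
    inputStream.destroy();                // stop reading and release the file descriptor
    processingStream.end();               // flush whatever is already buffered downstream
  }
});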
I'm new to programming and currently teaching myself how to use createStream. I'm kind of lost on how to use streams in Node.js. Basically, what I want to do is read a JSON file (more than 1 GB) which contains a massive array of objects, then update the value of an existing object or add another object. I'm able to do it using a normal read, update or add, then write, but the problem is I'm getting a large spike in RAM usage.
My code is like this:
const fs = require(`fs-extra`);

async function updateOrAdd () {
  var datafile = await fs.readJson(`./bigJSONfile.json`);
  var tofind = { user: 'alexa', age: 21, country: 'japan', pending: 1, paid: 0 };
  // filter() always returns an array (possibly empty), never null
  var foundData = datafile.filter(x => x.user === tofind.user && x.country === tofind.country);
  if (foundData.length === 0) {
    datafile = datafile.concat(tofind);
  } else {
    foundData[0].pending += 1;
    foundData[0].paid += 1;
  }
  await fs.writeJson(`./bigJSONfile.json`, datafile);
}
I saw some reference code for createStream, and it says pipe is the most memory-efficient approach. However, most of what I found only makes a copy of the original file.
I'd really appreciate it if anyone could teach me how to do this using streams, or provide some example code for it :).
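Not a full answer, but as an illustration of the streaming idea: if the data could be stored as newline-delimited JSON (one object per line) instead of one giant array, the core fs and readline modules alone are enough to update it line by line without loading the whole file into memory. This is only a sketch under that assumption; it needs a reasonably recent Node version for the for await loop, and the file names and matching logic are placeholders.
const fs = require('fs');
const readline = require('readline');

async function updateOrAddStreaming() {
  const tofind = { user: 'alexa', age: 21, country: 'japan', pending: 1, paid: 0 };
  const rl = readline.createInterface({ input: fs.createReadStream('./big.ndjson') });
  const out = fs.createWriteStream('./big.ndjson.tmp');
  let found = false;

  for await (const line of rl) {          // one JSON object per line
    const obj = JSON.parse(line);
    if (obj.user === tofind.user && obj.country === tofind.country) {
      obj.pending += 1;
      obj.paid += 1;
      found = true;
    }
    out.write(JSON.stringify(obj) + '\n');
  }
  if (!found) out.write(JSON.stringify(tofind) + '\n');
  out.end();
  // after the write stream's 'finish' event, rename the temp file over the original
}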
For some reason I've gotten stuck with some events in jQuery/JS.
function update()
{
  if (scrolling == true) {
    return;
  }
  var count = 0;
  jQuery.get('count.txt', function (data) {
    count = data;
  }).done(function () {
    var countstr = '' + count;
    myImage.src = "latest" + countstr + ".jpg#" + new Date().getTime();
    setTimeout(update, 1000);
  });
}
In my last question I asked about the jQuery "done function"
Currently I am working with a Timeout/timer to update the image every second
setTimeout(update, 1000);
It does work, but I know this can't be the smartest solution. In C# I'm able to use a FileSystemWatcher to get an event whenever there is a new file in the folder:
FileSystemWatcher watcher = new FileSystemWatcher();
watcher.Path = path;
watcher.NotifyFilter = NotifyFilters.LastAccess | NotifyFilters.LastWrite
| NotifyFilters.FileName | NotifyFilters.DirectoryName;
watcher.Filter = "*.jpg";
watcher.Created += new FileSystemEventHandler(OnChanged);
watcher.EnableRaisingEvents = true;
Is there an API or an event in jQuery/JS to check that? I was also looking at working with AJAX, but I have no experience with it.
// edit
I know that client-side JS is not able to do that. But I was just wondering if there is another way to get this kind of event (like AJAX or Node.js).
What am I doing?
I made a piece of software which uploads many images to my FTP server: images0, images1, images2, etc.
The event should detect that another image was uploaded and show it instead of the old image.
Florian, as was already mentioned, you cannot do it with client-side JS code.
Here's what I would use in this case (I assume it's a fairly universal solution):
Node.js has a file watching API (https://nodejs.org/docs/latest/api/fs.html#fs_class_fs_fswatcher), so you can subscribe to FS events.
You then need to notify the client about these changes. I would use socket.io (https://socket.io/, both client and server side).
Using the file watcher and WebSockets you can notify the user about any FS changes. You can upload the files via FTP, an HTTP client, or just create them locally.
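A minimal server-side sketch of that combination, assuming the images land in a local ./images directory (the directory, port, and the 'new-image' event name are placeholders, and fs.watch behaviour is somewhat platform-dependent):
const fs = require('fs');
const http = require('http');

const server = http.createServer();
const io = require('socket.io')(server);

// watch the upload directory and tell connected clients when a new .jpg appears
fs.watch('./images', (eventType, filename) => {
  if (eventType === 'rename' && filename && filename.endsWith('.jpg')) {
    io.emit('new-image', filename); // the client listens with socket.on('new-image', ...)
  }
});

server.listen(3000);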
Client-side/frontend languages are not able to create/edit/delete a file; they can only read one.
For writing files in Node.js, that's already covered on Stack Overflow: refer to "Writing files in Node.js".
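For completeness, a minimal sketch of writing a file from Node.js with the core fs module (the path and contents here are just placeholders):
const fs = require('fs');

// asynchronous write; the callback fires once the data has been written
fs.writeFile('./output.txt', 'hello from Node.js', 'utf8', (err) => {
  if (err) throw err;
  console.log('file written');
});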
Suppose I have a readable stream, e.g. request(URL), and I want to write its response to disk via fs.createWriteStream() and piping from the request. But at the same time I want to calculate a checksum of the downloaded data via a crypto.createHash() stream.
readable -+-> calc checksum
|
+-> write to disk
And I want to do it on the fly, without buffering an entire response in memory.
It seems that I can implement it using the old-school on('data') hook. Pseudocode below:
const hashStream = crypto.createHash('sha256');
hashStream.on('error', cleanup);

const dst = fs.createWriteStream('...');
dst.on('error', cleanup);

request(...).on('data', (chunk) => {
  hashStream.write(chunk);
  dst.write(chunk);
}).on('end', () => {
  hashStream.end();
  const checksum = hashStream.read().toString('hex'); // read() returns a Buffer
  if (checksum != '...') {
    cleanup();
  } else {
    dst.end();
  }
}).on('error', cleanup);

function cleanup() { /* cancel streams, erase file */ }
But such an approach looks pretty awkward. I tried to use stream.Transform or stream.Writable to implement something like read | calc + echo | write, but I'm stuck on the implementation.
Node.js readable streams have a .pipe method which works pretty much like the Unix pipe operator, except that you can stream JS objects as well as plain strings and buffers.
Here's a link to the doc on pipe
An example of the use in your case could be something like:
const req = request(...);
req.pipe(dst);
req.pipe(hashStream);
Note that you still have to handle errors per stream as they're not propagated and the destinations are not closed if the readable errors.
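A slightly fuller sketch of that pattern, with per-stream error handling and the digest read once the hash stream's writable side finishes. It keeps the same placeholders ('...') as the pseudocode above, so those still need to be filled in:
const crypto = require('crypto');
const fs = require('fs');

const hashStream = crypto.createHash('sha256');
const dst = fs.createWriteStream('...');
const req = request(...);

// errors are not propagated by pipe, so handle them on each stream
req.on('error', cleanup);
dst.on('error', cleanup);
hashStream.on('error', cleanup);

req.pipe(dst);
req.pipe(hashStream);

hashStream.on('finish', () => {
  // the digest is available on the readable side once all input has been written
  const checksum = hashStream.read().toString('hex');
  if (checksum !== '...') cleanup();
});

function cleanup() { /* destroy streams, erase the partial file */ }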
I'm processing a very large amount of data, manipulating it, and storing it in a file. I iterate over the dataset, and then I want to store it all in a JSON file.
My initial method using fs, storing it all in an object and then dumping it, didn't work, as I was running out of memory and it became extremely slow.
I'm now using fs.createWriteStream but as far as I can tell it's still storing it all in memory.
I want the data to be written object by object to the file, unless someone can recommend a better way of doing it.
Part of my code:
// Top of the file
var wstream = fs.createWriteStream('mydata.json');
...
// In a loop
let JSONtoWrite = {}
JSONtoWrite[entry.word] = wordData
wstream.write(JSON.stringify(JSONtoWrite))
...
// Outside my loop (when memory is probably maxed out)
wstream.end()
I think I'm using streams wrong; can someone tell me how to write all this data to a file without running out of memory? Every example I find online relates to reading a stream in, but because of the calculations I'm doing on the data, I can't use a readable stream. I need to append to this file sequentially.
The problem is that you're not waiting for the data to be flushed to the filesystem, but instead keep pushing more and more data into the stream synchronously in a tight loop.
Here's a piece of pseudocode that should work for you:
// Top of the file
const wstream = fs.createWriteStream('mydata.json');

// I'm not sure how you're getting the data; let's say you have it all in an object
const entry = {};
const words = Object.keys(entry);

function writeCB(index) {
  if (index >= words.length) {
    wstream.end();
    return;
  }
  const JSONtoWrite = {};
  JSONtoWrite[words[index]] = entry[words[index]];
  // only write the next chunk once this one has been flushed
  wstream.write(JSON.stringify(JSONtoWrite), writeCB.bind(null, index + 1));
}

writeCB(0);
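An alternative sketch of the same idea, using write()'s return value and the 'drain' event from the core stream API instead of the write callback (the data is again assumed to live in an in-memory object called entry):
const fs = require('fs');

const wstream = fs.createWriteStream('mydata.json');
const words = Object.keys(entry);
let index = 0;

function writeSome() {
  let ok = true;
  while (index < words.length && ok) {
    const JSONtoWrite = { [words[index]]: entry[words[index]] };
    // write() returns false when the internal buffer is full
    ok = wstream.write(JSON.stringify(JSONtoWrite));
    index += 1;
  }
  if (index < words.length) {
    // wait until the buffer has drained before writing more
    wstream.once('drain', writeSome);
  } else {
    wstream.end();
  }
}

writeSome();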
You should wrap your data source in a readable stream too. I don't know what your source is, but you have to make sure it does not load all your data into memory.
For example, assuming your data set comes from another file where JSON objects are separated by end-of-line characters, you could create a Read stream as follows:
const Readable = require('stream').Readable;

class JSONReader extends Readable {
  constructor(options = {}) {
    super(options);
    this._source = options.source; // the source stream
    this._buffer = '';
    // read whenever the source is ready
    this._source.on('readable', function () {
      this.read();
    }.bind(this));
  }

  _read(size) {
    var chunk;
    var line;
    var lineIndex;
    var result;
    if (this._buffer.length === 0) {
      chunk = this._source.read(); // read more from source when buffer is empty
      if (chunk !== null) {
        this._buffer += chunk;
      }
    }
    lineIndex = this._buffer.indexOf('\n'); // find end of line
    if (lineIndex !== -1) { // we have an end of line and therefore a new object
      line = this._buffer.slice(0, lineIndex); // get the characters belonging to the object
      if (line) {
        result = JSON.parse(line);
        this._buffer = this._buffer.slice(lineIndex + 1);
        this.push(JSON.stringify(result)); // push to the internal read queue
      } else {
        this._buffer = this._buffer.slice(1);
      }
    }
  }
}
Now you can use it:
const source = fs.createReadStream('mySourceFile');
const reader = new JSONReader({source});
const target = fs.createWriteStream('myTargetFile');
reader.pipe(target);
Then you'll have a much better memory flow.
Please note that the above example is taken from the excellent Node.js in Practice book.
I'm trying to write a string to a socket (the socket is called "response"). Here is the code I have so far (I'm trying to implement a byte-caching proxy...):
var http = require('http');
var sys = require('sys');

var localHash = {};

http.createServer(function (request, response) {
  var proxy = http.createClient(80, request.headers['host']);
  var proxy_request = proxy.request(request.method, request.url, request.headers);

  proxy_request.addListener('response', function (proxy_response) {
    proxy_response.addListener('data', function (x) {
      var responseData = x.toString();
      var f = 50;
      var toTransmit = "";
      var p = 0;
      var N = responseData.length;
      if (N > f) {
        p = Math.floor(N / f);
        var hash = "";
        var chunk = "";
        for (var i = 0; i < p; i++) {
          chunk = responseData.substr(f * i, f);
          hash = DJBHash(chunk);
          if (localHash[hash] == undefined) {
            localHash[hash] = chunk;
            toTransmit = toTransmit + chunk;
          } else {
            sys.puts("***hit" + chunk);
            toTransmit = toTransmit + chunk; //"***EOH"+hash;
          }
        }
        // remainder:
        chunk = responseData.substr(f * p);
        hash = DJBHash(chunk);
        if (localHash[hash] == undefined) {
          localHash[hash] = chunk;
          toTransmit = toTransmit + chunk;
        } else {
          toTransmit = toTransmit + chunk; //"***EOH"+hash;
        }
      } else {
        toTransmit = responseData;
      }
      response.write(new Buffer(toTransmit)); /* error occurs here */
    });
    proxy_response.addListener('end', function () {
      response.end();
    });
    response.writeHead(proxy_response.statusCode, proxy_response.headers);
  });

  request.addListener('data', function (chunk) {
    sys.puts(chunk);
    proxy_request.write(chunk, 'binary');
  });
  request.addListener('end', function () {
    proxy_request.end();
  });
}).listen(8080);
function DJBHash(str) {
  var hash = 5381;
  for (var i = 0; i < str.length; i++) {
    hash = (((hash << 5) + hash) + str.charCodeAt(i)) & 0xffffffff;
  }
  if (hash < -1) {
    hash = hash * -1;
  }
  return hash;
}
The trouble is, I keep getting a "content encoding error" in Firefox. It's as if the gzipped content isn't being transmitted properly. I've verified that "toTransmit" is the same as "x" via console.log(x) and console.log(toTransmit).
It's worth noting that if I replace response.write(new Buffer(toTransmit)) with simply response.write(x), the proxy works as expected, but I need to do some payload analysis and then pass "toTransmit", not "x".
I've also tried to response.write(toTransmit) (i.e. without the conversion to buffer) and I keep getting the same content encoding error.
I'm really stuck. I thought I had this problem fixed by converting the string to a buffer as per another thread (http://stackoverflow.com/questions/7090510/nodejs-content-encoding-error), but I've re-opened a new thread to discuss this new problem I'm experiencing.
I should add that if I open a page via the proxy in Opera, I get gobbledygook; it's as if the gzipped data gets corrupted.
Any insight greatly appreciated.
Many thanks in advance,
How about this?
var responseData = Buffer.from(x, 'utf8');
from: Convert string to buffer Node
Without digging very deep into your code, it seems to me that you might want to change
var responseData=x.toString();
to
var responseData=x.toString("binary");
and finally
response.write(new Buffer(toTransmit, "binary"));
From the docs:
Pure Javascript is Unicode friendly but not nice to binary data. When
dealing with TCP streams or the file system, it's necessary to handle
octet streams. Node has several strategies for manipulating, creating,
and consuming octet streams.
Raw data is stored in instances of the Buffer class. A Buffer is
similar to an array of integers but corresponds to a raw memory
allocation outside the V8 heap. A Buffer cannot be resized.
So, don't use strings for handling binary data.
Change proxy_request.write(chunk, 'binary'); to proxy_request.write(chunk);.
Omit var responseData=x.toString();, that's a bad idea.
Instead of doing substr on a string, use slice on a buffer.
Instead of doing + with strings, use the "concat" method from the buffertools.
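A minimal sketch of what the 'data' handler could look like when it keeps everything as Buffers, using buffer slice and the core Buffer.concat instead of string concatenation or buffertools (the chunking logic mirrors the original code; the "***EOH" substitution is left out):
proxy_response.addListener('data', function (x) {
  var f = 50;
  var pieces = [];
  // walk the raw buffer in f-byte slices, with no string conversion of the payload
  for (var offset = 0; offset < x.length; offset += f) {
    var chunk = x.slice(offset, Math.min(offset + f, x.length));
    var hash = DJBHash(chunk.toString('binary')); // hashing still needs some string form
    if (localHash[hash] == undefined) {
      localHash[hash] = chunk;
    }
    pieces.push(chunk); // always forward the raw bytes, hit or miss
  }
  response.write(Buffer.concat(pieces));
});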
Actually, new Buffer() has been deprecated since Node.js v10+, so it's better to use Buffer.from().
Instead of
response.write(new Buffer(toTransmit));
do
response.write(Buffer.from(toTransmit, 'binary'));