Transferring and saving files using a TCP server in Node.js

I have a lot of devices sending messages to a TCP server written in Node. The main task of the TCP server is to route some of those messages to redis so they can be processed by another app.
I've written a simple server that does the job quite well. The structure of the code is basically this (not the actual code, details hidden):
const net = require("net");

net.createServer(socket => {
  socket.on("data", buffer => {
    const data = buffer.toString();
    if (shouldRouteMessage(data)) {
      redis.publish(data);
    }
  });
});
Most of the messages are like: {"text":"message body"}, or {"lng":32.45,"lat":12.32}. But sometimes I need to process a message like {"audio":"...encoded audio..."} that spans several "data" events.
What I need in this case is to save the encoded audio to a file and send {"audio":"path/to/audio-file.mp3"} to redis, where the path points to the file containing the received audio data.
One simple option is to store the buffers until I detect the end of the message and then save them all to a file, but that means, among other things, keeping the whole file in memory before saving it to disk.
I hope there are better options using streams and pipes. Any suggestions? (Some code examples would be nice.)
Thanks

I finally solved it, so I'm posting the solution here for documentation purposes (and, with some luck, to help others).
The solution was, indeed, quite simple: just open a write stream to a file and write the data packets as they are received. Something like this:
const net = require("net");
const fs = require("fs");

net.createServer(socket => {
  // keep the file state per connection, not per "data" event
  let file = null;
  let filePath = null;

  socket.on("data", buffer => {
    const data = buffer.toString("binary"); // see the note on encoding below

    if (shouldRouteMessage(data)) {
      // just publish the message
      redis.publish(data);
    } else if (isAudioStart(data)) {
      // create a write stream to a file and write the first data packet
      filePath = buildFilePath(data);
      file = fs.createWriteStream(filePath);
      file.write(data);
    } else if (isLastFragment(data)) {
      // if it is the last fragment, write it, close the file and publish the result
      file.write(data);
      file.close();
      redis.publish(filePath);
      file = filePath = null;
    } else if (isDataFragment(data)) {
      // just write (stream) it to the file
      file.write(data);
    }
  });
});
Note: shouldRouteMessage, buildFilePath, isDataFragment, and isLastFragment are custom functions that depend on the kind of data.
In this way, the incoming data is streamed directly to the file, and there is no need to hold the whole content in memory first. Node's streams rock!
As always, the devil is in the details. Some checks are necessary to, for example, ensure there's always an open file when you want to write to it (that's why file and filePath are kept per connection, outside the "data" handler). Remember also to set the proper encoding when converting to a string (for example, buffer.toString('binary') did the trick for me). Depending on your data format, shouldRouteMessage, isAudioStart, and the other custom functions can be more or less complex.
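For illustration only, here is a minimal sketch of what those helpers might look like. The framing (explicit AUDIO-START/AUDIO-END markers and the file-naming scheme) is an assumption made up for the example, not part of the actual protocol:

// Hypothetical helpers: the markers and naming scheme below are assumed
function shouldRouteMessage(data) {
  return data.startsWith("{"); // ordinary single-packet JSON messages
}

function isAudioStart(data) {
  return data.startsWith("AUDIO-START"); // first fragment of an audio message
}

function isLastFragment(data) {
  return data.endsWith("AUDIO-END"); // closing marker of an audio message
}

function isDataFragment(data) {
  return true; // anything else is a middle fragment of the current audio
}

function buildFilePath() {
  return `audio/${Date.now()}.mp3`; // any scheme that yields a unique path works
}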
Hope it helps.

Related

How to use GCP DLP with a file stream

I'm working with Node.js and GCP Data Loss Prevention to attempt to redact sensitive data from PDFs before I display them. GCP has great documentation on this here
Essentially you pull in the Node.js library and run this:
const fileBytes = Buffer.from(fs.readFileSync(filepath)).toString('base64');

// Construct image redaction request
const request = {
  parent: `projects/${projectId}/locations/global`,
  byteItem: {
    type: fileTypeConstant,
    data: fileBytes,
  },
  inspectConfig: {
    minLikelihood: minLikelihood,
    infoTypes: infoTypes,
  },
  imageRedactionConfigs: imageRedactionConfigs,
};

// Run image redaction request
const [response] = await dlp.redactImage(request);
const image = response.redactedImage;
So normally I'd get the file as a buffer, then pass it to the DLP function as above. But I'm no longer getting our files as buffers. Since many files are very large, we now get them from FilesStorage as streams, like so:
return FilesStorage.getFileStream(metaFileInfo1, metaFileInfo2, metaFileInfo3, fileId)
  .then(stream => {
    return {fileInfo, stream};
  })
The question is, is it possible to perform DLP image redaction on a stream instead of a buffer? If so, how?
I've found some other questions that say you can stream with ByteContentItem, and GCP's own documentation mentions "streams". But I've tried passing the stream returned by .getFileStream into the byteItem['data'] property above, and it doesn't work.
So chunking the stream up into buffers of an appropriate size is going to work best here. There seem to be a number of approaches to building buffers from a stream that you can use.
Potentially relevant: Convert stream into buffer?
(A native stream interface is a good feature request, just not yet there.)
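A minimal sketch of that approach, collecting the stream's chunks into a single Buffer and then base64-encoding it exactly like the buffer-based example above (the FilesStorage call is taken from the question):

function streamToBuffer(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream.on('data', chunk => chunks.push(chunk));
    stream.on('end', () => resolve(Buffer.concat(chunks)));
    stream.on('error', reject);
  });
}

FilesStorage.getFileStream(metaFileInfo1, metaFileInfo2, metaFileInfo3, fileId)
  .then(streamToBuffer)
  .then(buffer => {
    const fileBytes = buffer.toString('base64');
    // ...build the same `request` object as above and call dlp.redactImage
  });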

How to convert Base64 string to PNG (currently using Jimp)?

I'm currently creating a real-time chat application. This is a web application that uses Node.js for the backend and socket.io to connect back and forth.
Currently, I'm working on creating user profiles with profile pictures. These profile pictures will be stored in a folder called images/profiles/. The file will be named by the user's id. For example: user with the id 1 will have their profile pictures stored in images/profiles/1.png. Very self-explanatory.
When the user submits the form to change their profile picture, the browser JavaScript will get the image, and send it to the server:
form.addEventListener('submit', handleForm);

function handleForm(event) {
  event.preventDefault(); // stop page from reloading
  let profilePicture; // set variable for profile picture
  let profilePictureInput = document.getElementById('profilePictureInput'); // get image input
  const files = profilePictureInput.files[0]; // get input's files
  if (files) {
    const fileReader = new FileReader(); // initialize file reader
    fileReader.readAsDataURL(files);
    fileReader.onload = function () {
      profilePicture = this.result; // put result into variable
      socket.emit("request-name", {
        profilePicture: profilePicture,
        id: userID,
      }); // send result, along with user id, to server
    };
  }
}
I've commented most of the code so it's easy to follow. The server then gets this information. With this information, the server is supposed to convert the sent image to a png format (I can do whatever format, but it has to be the same format for all images). I am currently using the jimp library to do this task, but it doesn't seem to work.
const jimp = require('jimp'); // initialize Jimp

socket.on('request-name', (data) => { // when request has been received
  // read the buffer from image (I'm not 100% sure what Buffer.from() does, but I saw this online)
  jimp.read(Buffer.from(data.profilePicture), function (error, image) {
    if (error) throw error; // throw error if there is one
    image.write(`images/profiles/${data.id}.png`); // write image to designated place
  });
});
The error I get:
Error: Could not find MIME for Buffer <null>
I've scoured the internet for answers but was unable to find any. I am available to use another library if this helps. I can also change the file format (.png to .jpg or .jpeg, if needed; it just needs to be consistent with all files). The only things I cannot change are the use of JavaScript/Node.js and socket.io to send the information to the server.
Thank you in advance. Any and all help is appreciated.
If you're just getting the data URI as a string, you can strip its data:image/...;base64, prefix, construct a buffer from the remaining base64 payload, and then use the built-in fs module to write the file. Make sure the relative path is accurate.
socket.on('request-name', data => {
  // drop the "data:image/png;base64," prefix that readAsDataURL adds
  const base64Data = data.profilePicture.split(',')[1];
  const imgBuffer = Buffer.from(base64Data, 'base64');
  fs.writeFile(`images/profiles/${data.id}.png`, imgBuffer, err => {
    if (err) console.error(err);
  });
});
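If you still want to normalize every upload to PNG regardless of the submitted format, here is a sketch of the Jimp route, reusing the same stripped base64 payload (the decoding step is what the original attempt was missing):

const jimp = require('jimp');

socket.on('request-name', (data) => {
  const base64Data = data.profilePicture.split(',')[1]; // strip the data URI prefix
  jimp.read(Buffer.from(base64Data, 'base64'), function (error, image) {
    if (error) throw error;
    image.write(`images/profiles/${data.id}.png`); // Jimp picks the format from the extension
  });
});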

Is it possible to get an image from a message and add it to a folder in Discord.js?

Like, is it possible to do something like !image add [image file] and then save the attachment to a folder? I think I can do that with fs, but I'm not sure how.
You can use the fs functions fs.writeFile() or fs.writeFileSync(). They accept the path of the file to write to and the data to write. In your case, the data should be a buffer or stream.
// const fs = require('fs');
fs.writeFileSync('./some_dir/some_file_name.extension', data);
To get the data in question, you should access Message#attachments, a collection of all attachments on the message. Assuming you only want the first, you can use Collection#first() to narrow down the results.
const attachment = message.attachments.first();

if (!attachment) {
  // maybe place in some error handling
}
Unfortunately, the MessageAttachment class doesn't actually hold a buffer/stream representing the attachment, only the URL leading to it. This means you'll need a third-party library such as axios or node-fetch.
// const fetch = require('node-fetch');
fetch(attachment.url)
  .then(res => res.buffer())
  .then(buffer => {
    fs.writeFileSync(`./images/${attachment.name}`, buffer);
  });
Make sure to validate the URL to confirm it's an image!
if (!/\.(png|jpe?g|svg)$/.test(attachment.url)) {
  // this attachment isn't an image!
  // we don't want to be downloading .exe files now, do we?
}
Finally, you should also be wary that if two files have the same name, such as image.png, trying to write the second one will overwrite the first. One way to overcome that issue is to add numerical suffixes to duplicates, such as image.png, image-1.png, image-2.png, etc. That could work out a little like this:
// const path = require('path');
fetch(attachment.url)
  .then(res => res.buffer())
  .then(buffer => {
    const { name, ext } = path.parse(attachment.name); // split "image.png" into "image" and ".png"
    let filePath = `./images/${attachment.name}`;
    // increment the suffix every iteration until a file
    // by the same name cannot be found
    for (let count = 1; fs.existsSync(filePath); count++) {
      filePath = `./images/${name}-${count}${ext}`; // e.g. image-1.png
    }
    fs.writeFileSync(filePath, buffer);
  });

How can I get multiple files to upload to the server from a JavaScript page without skipping?

I'm working on a research experiment that uses getUserMedia, implemented in recorder.js, to record .wav files from the user's microphone, and XMLHttpRequest to upload them to the server. Each file is about 3 seconds long and there are 36 files in total. The files are recorded one after another and sent to the server as soon as they are recorded.
The problem I'm experiencing is that not all of the files end up on the server. Apparently the script or the PHP backend can't keep up with all the requests in a row. How can I make sure that I get all the files? These are important research data, so I need every recording.
Here's the code that sends the files to the server. The audio data is a blob:
var xhr = new XMLHttpRequest();
var filename = subjectID + item__number;

xhr.onload = function(e) {
  if (this.readyState === 4) {
    console.log("Server returned: ", e.target.responseText);
  }
};

var fd = new FormData();
fd.append("audio_data", blob, filename);
xhr.open("POST", "upload_wav.php", true);
xhr.send(fd);
And this is the php file on the server side:
print_r($_FILES);
$input = $_FILES['audio_data']['tmp_name'];
$output = "audio/" . $_FILES['audio_data']['name'] . ".wav";
move_uploaded_file($input, $output);
This way of doing things is basically copied from this website:
Using Recorder.js to capture WAV audio in HTML5 and upload it to your server or download locally
I have already tried making the XMLHttpRequest wait by using
while (xhr.readyState != 4) {
  console.log("Waiting for server...");
}
It just caused the page to hang (the busy-wait blocks the browser's single JavaScript thread, so readyState can never change).
Would it be better to use Ajax than XMLHttpRequest? Is there something I can do to make sure that all the files get uploaded? I'm pretty new to JavaScript, so code examples are appreciated.
I have no idea what your architecture looks like, but here is a potential solution to your problem.
The solution uses the Web Worker API to offload the file uploading to a sub-process, via the Worker interface of that API. This approach works because there is no contention for the main process's single thread: web workers run in their own processes.
Using this approach, we do three basic things:
create a new worker passing a script to execute
pass messages to the worker for the worker to deal with
pass messages back to the main process for status updates/replies/resolved data transformation/etc.
The code is heavily commented below to help you understand what is happening and where.
This is the main JavaScript file (script.js)
// Create a sub-process to handle the file uploads
///// STEP 1: create a worker and execute the worker.js file immediately
let worker = new Worker('worker.js');

// Fictitious upload count for demonstration
let uploadCount = 12;

// repeatedly build and send files every 700ms
// This repeats until uploadCount == 0
let builder = setInterval(buildDetails, 700);

// Receive messages from the sub-process and pipe them to the view
///// STEP 2: listen for messages from the worker and do something with them
worker.onmessage = e => {
  let p = document.createElement('pre');
  // e.data represents the message data sent from the sub-process
  p.innerText = e.data;
  document.body.appendChild(p);
};

/**
 * Sort of a mock to build up your BLOB (fake here, of course)
 *
 * Posts the data needed for the FormData() to the worker to handle.
 */
function buildDetails() {
  let filename = 'subject1234';
  let blob = new Blob(['1234']);
  ///// STEP 3: Send a message to the worker with the file details
  worker.postMessage({
    name: "audio_data",
    blob: blob,
    filename: filename
  });
  // Decrease the count
  uploadCount--;
  // if the count is zero (== false) stop the fake process
  if (!uploadCount) clearInterval(builder);
}
This is the sub-process JavaScript file (worker.js)
// IGNORE the 'fetch_mock.js' import; it is only here to avoid having to stand up a server
// FormDataPolyFill.js is needed in browsers that don't yet support FormData() in workers
importScripts('FormDataPolyFill.js', 'fetch_mock.js');

// RxJS provides a full suite of asynchronous capabilities based around Reactive Programming
// (nothing to do with ReactJS). What matters for your use case is the guarantee
// that the stream of inputs will all be processed
importScripts('https://cdnjs.cloudflare.com/ajax/libs/rxjs/6.3.3/rxjs.umd.js');

// We create a "Subject" that acts as a vessel for our files to upload
let forms = new rxjs.Subject();

// This says "every time the forms Subject is updated, run the postFile function with the next item from the stream"
forms.subscribe(postFile);

// Listen for messages from the main process and run doIt each time a message is received
onmessage = doIt;

/**
 * Takes an event object containing the message
 *
 * The message is presumably the file details
 */
function doIt(e) {
  var fd = new FormData();
  // e.data represents our details object with three properties
  fd.append(e.data.name, e.data.blob, e.data.filename);
  // Now, place this FormData object into our stream so it can be processed
  forms.next(fd);
}

// Instead of using XHR, this uses the newer fetch() API based upon Promises
// https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API
function postFile(fd) {
  // Post the file to the server (this is mocked in fetch_mock.js and doesn't go anywhere)
  fetch('fake', {
    method: 'post',
    body: fd,
  })
  .then((fd) => {
    // After the request completes, 'then' post a message back to the main thread (if there is a need)
    postMessage("sent: " + JSON.stringify(fd));
  });
}
Since this won't run on Stack Overflow, I've created a plunker so that you can run this example:
http://plnkr.co/edit/kFY6gcYq627PZOATXOnk
If all this seems complicated, you've presented a complicated problem to solve. :-)
Hope this helps.

Writing and reading to a file using streams

The code below writes whatever I type in the console to a file. It also simultaneously reads from the same file and displays its contents.
Everything I type in the console is saved in the file; I went and checked it manually. But whatever I type doesn't get displayed simultaneously.
const fs = require('fs');
const Wstream = fs.createWriteStream('./testfile.txt', {encoding: 'utf8'});
const Rstream = fs.createReadStream('./testfile.txt', {encoding: 'utf8'});
process.stdin.pipe(Wstream);
Rstream.pipe(process.stdout);
Why isn't this the same as the following?
process.stdin.pipe(process.stdout);
The read stream is closed once the existing data in ./testfile.txt has been fully piped; it does not wait for additional entries or changes.
You can use fs.watch to listen for file changes, or, even easier, use something like tail-stream or node-tail.
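For example, a minimal sketch with the node-tail package (npm install tail), which keeps emitting lines as they are appended to the file:

const { Tail } = require('tail');

const tail = new Tail('./testfile.txt');

tail.on('line', line => {
  console.log(line); // fires for every new line written to the file
});

tail.on('error', err => {
  console.error(err);
});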

Categories

Resources