Removing part of a file in node.js - javascript

Currently, I'm looking at trying to remove part of what is basically a proprietary archive format; in order to support the ability to remove a file, I'm trying to figure out how to remove a segment of the file (given an offset and a length). I see there's plenty of append logic when it comes to the fs module of node, but nothing that seems to "splice" parts of a file.
Is this even going to be possible? Will I have to resort to the less preferred option of writing to an entirely new file instead?

The operating system handles appending to a file very quickly; there is no need to rewrite the whole file when you open it for appending.
But if you want to cut a slice out of the middle of a file, it doesn't matter which programming language you use: you have to read the whole file and save it again.
What you can do is create a new file and write two slices of the input buffer into it.
const fs = require('fs')

// Read the whole archive, then write out the bytes before and after
// the segment being removed (here, bytes 20 through 49).
const buffer = fs.readFileSync('input_file')
fs.writeFileSync('output', buffer.slice(0, 20))
fs.appendFileSync('output', buffer.slice(50, 100))
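For archives too large to buffer in memory, the same two-slice idea can be written with streams. A minimal sketch, assuming the segment to remove is given as a byte offset and length; the function name and callback style are illustrative, and offset is assumed to be greater than zero since fs.createReadStream's end option is an inclusive byte index:

const fs = require('fs')

// Copy everything except bytes [offset, offset + length) into a new file,
// without loading the whole archive into memory.
function removeSegment (input, output, offset, length, done) {
  const out = fs.createWriteStream(output)
  out.on('finish', done)
  // First slice: bytes before the removed segment (assumes offset > 0).
  const head = fs.createReadStream(input, { start: 0, end: offset - 1 })
  head.pipe(out, { end: false })
  head.on('end', () => {
    // Second slice: bytes after the removed segment; this pipe closes `out`.
    fs.createReadStream(input, { start: offset + length }).pipe(out)
  })
}

removeSegment('input_file', 'output', 20, 30, () => console.log('segment removed'))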

Related

Split initial chunks into specific file names

I'm creating a Firefox addon that uses chunking to get individual file sizes down below the limit on JS files within an addon. This works great except for the 'initial' file, which I understand to be the entry point files I've specified. I understand why this is done, but I want to be able to somehow define how these entry point files get split, so that I can control what they're called. I can then add references to these wherever needed elsewhere in my extension.
How can I control the chunking of the initial files, ideally specifying how many files to split them into and what the names are, or at least getting predictable names?

Convert a JavaScript object to a CSV file

I have a script which reads a file line by line and generates an object with some fields from certain lines, and now I want to put that generated object into a CSV file.
How can I do the following:
1. From the script itself generate a CSV file
2. Give initial fields (headers) to the file
3. Update that file line by line (add to the file one line at a time)
Some clarifications: I don't know the size of the CSV in advance, so the file must be changed dynamically.
Thanks in advance.
Looking at what you have said:
1. From the script itself generate a CSV file
Have a look at node-csv-generate, which lets you generate CSV strings easily.
2. Give initial fields (headers) to the file & 3. Update that file line by line (add to the file one line at a time)
Check out the node-csv-generate stream functionality to write individual lines one at a time (i.e. initial headers first), as sketched below.
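If you would rather skip the library, a plain fs write stream opened in append mode covers all three points. A minimal sketch, where the file name and field names are made up:

const fs = require('fs')

// Write headers once, then append one row at a time.
const file = 'output.csv'
const isNew = !fs.existsSync(file)
const out = fs.createWriteStream(file, { flags: 'a' }) // append mode
if (isNew) out.write('timestamp,name,value\n')

function appendRow (obj) {
  // Naive CSV escaping: quote every field and double embedded quotes.
  const fields = [obj.timestamp, obj.name, obj.value]
  out.write(fields.map(f => '"' + String(f).replace(/"/g, '""') + '"').join(',') + '\n')
}

appendRow({ timestamp: Date.now(), name: 'sensor1', value: 42 })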
Now, since you said you need to run it locally, I would recommend Rhino if you are using plain JS, but if Node.js is required then check out Rhinodo. These basically let you run the program locally on the JVM (you could call the JS from within Java if you wanted to).
To export the CSV file there are plenty of examples online, this SO thread being one, i.e.:
var encodedUri = encodeURI(csvContent);
window.open(encodedUri);
where csvContent is the complete string of your CSV, prefixed with "data:text/csv;charset=utf-8," so the browser treats it as a download. I am not sure how well supported this is on Rhinodo, but I'm pretty sure it will all work on Rhino.
If this is intended to be a purely desktop-based application, I would look at using Java rather than JS (or Python or C# might be nicer, depending on what you are used to :-) ) if everything needs to be local and the tool is meant to be widely used. That way you get a much cleaner interaction with the OS and a lot more control.
I hope this helps!

Is it possible to write-protect old data in JSON files and only allow appending?

I need to store some date-stamped data in a JSON file. It is sensor output. Each day the same JSON file is updated with the additional data. Now, is it possible to put some write protection on the already available data, to ensure that only new lines can be added to the document and no manual tampering can occur?
I suspect that creating checksums after every update may help, but I am not sure how to implement it. I mean, if some part of the JSON file is editable, then presumably the checksum is editable too.
Any other way for history protection?
Write protection normally only exists for complete files. So you could revoke write permissions for the file, but then appending isn't possible anymore either.
For ensuring that no tampering has taken place, the standard way is to cryptographically sign the data. In principle it works like this:
Take the contents of the file.
Add a secret key (any arbitrary string or random characters will do, the longer the better) to this string.
Create a cryptographic checksum (a SHA256 hash or similar).
Append this hash to the file. (Newlines before and after.)
You can do this again every time you append something to the file. Because nobody except you knows your secret key, nobody except you will be able to produce the correct hash codes of the part of the file above the hash code.
This will not prevent tampering but it will be detectable.
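In Node, that key-plus-hash construction is essentially an HMAC. A sketch of the scheme, using an append-only JSON-lines file with hash lines interleaved; the secret, file name and record shape are all illustrative:

const crypto = require('crypto')
const fs = require('fs')

const SECRET = 'replace-with-a-long-random-key' // illustrative; keep it out of the file
const FILE = 'sensor-log.jsonl'                 // illustrative

// Append one record, then an HMAC over everything in the file so far.
// Only someone who knows SECRET can produce a matching hash line.
function appendSigned (record) {
  fs.appendFileSync(FILE, JSON.stringify(record) + '\n')
  const content = fs.readFileSync(FILE)
  const mac = crypto.createHmac('sha256', SECRET).update(content).digest('hex')
  fs.appendFileSync(FILE, '# hmac ' + mac + '\n')
}

appendSigned({ ts: Date.now(), value: 23.5 })

Verification then replays the file from the top, recomputing the HMAC at each hash line and comparing; the first mismatch tells you where tampering began.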
This is relatively easy to do with shell utilities like sha256sum for plain text files. But you have a JSON structure in a file, which is a more complex case, because the position in the file no longer correlates with the age of the data (unlike in a text file that is only ever appended to).
To still achieve what you want, you need age information on the data. Do you have this? If you provide the JSON structure, as @Rohit asked, we might be able to give more detailed advice.

Fastest way to read in very long file starting on arbitrary line X

I have a text file which is written to by a Python program and then read in by another program for display in a web browser. Currently it is read in by JavaScript, but I will probably move this functionality to Python and have the results passed into JavaScript using an AJAX request.
The file is updated irregularly, sometimes appending one line, sometimes as many as ten. I then need to get an updated copy of the file to JavaScript for display in the web browser. The file may grow as large as 100,000 lines. New data is always added to the end of the file.
As it is currently written, JavaScript checks the length of the file once per second, and if the file is longer than it was the last time it was read, it reads it in again, starting from the beginning. This quickly becomes unwieldy for files of 10,000+ lines, doubly so since the program may sometimes need to update the file every single second.
What is the fastest/most efficient way to get the data displayed to the front end in javascript?
I am thinking I could:
Keep track of how many lines the file was before, and only read in from that point in the file next time.
Have one program pass the data directly to the other without it reading an intermediate file (although the file must still be written to as a permanent log for later access)
Are there specific benefits/problems with each of these approaches? How would I best implement them?
For Approach #1, I would rather not call file.next() 15,000 times in a for loop just to get to where I want to start reading the file. Is there a better way?
For Approach #2, since I need to write to the file no matter what, am I saving much processing time by not reading it too?
Perhaps there are other approaches I have not considered?
Summary: the program needs to display, in a web browser, data from Python that is constantly being updated and may grow as long as 100k lines. Since I am checking for updates every second, it needs to be efficient, just in case it has to do a lot of updates in a row.
The function you seek is seek. From the docs:
f.seek(offset, from_what)
The position is computed from adding offset to a reference point; the reference point is selected by the from_what argument. A from_what value of 0 measures from the beginning of the file, 1 uses the current file position, and 2 uses the end of the file as the reference point. from_what can be omitted and defaults to 0, using the beginning of the file as the reference point.
Limitation for Python 3:
In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very file end with seek(0, 2)) and the only valid offset values are those returned from the f.tell(), or zero. Any other offset value produces undefined behaviour.
Note that seeking to a specific line is tricky, since lines can be variable length. Instead, take note of the current position in the file (f.tell()), and seek back to that.
Opening a large file and reading the last part is simple and quick: Open the file, seek to a suitable point near the end, read from there. But you need to know what you want to read. You can easily do it if you know how many bytes you want to read and display, so keeping track of the previous file size will work well without keeping the file open.
If you have recorded the previous size (in bytes), read the new content like this.
fp = open("logfile.txt", "rb")
fp.seek(old_size, 0)
new_content = fp.read() # Read everything past the current point
On Python 3, this will read bytes which must be converted to str. If the file's encoding is latin1, it would go like this:
new_content = str(new_content, encoding="latin1")
print(new_content)
You should then update old_size and save the value in persistent storage for the next round. You don't say how you record context, so I won't suggest a way.
If you can keep the file open continuously in a server process, go ahead and do it the tail -f way, as in the question that @MarcJ linked to.
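And if the watcher ends up on the JavaScript side rather than in Python, the same keep-track-of-the-size idea can be sketched in Node; the file name and polling interval below are illustrative:

const fs = require('fs')

const FILE = 'logfile.txt' // illustrative
let oldSize = fs.existsSync(FILE) ? fs.statSync(FILE).size : 0

// Poll the file's metadata once a second; when it grows,
// read only the newly appended bytes.
fs.watchFile(FILE, { interval: 1000 }, (curr, prev) => {
  if (curr.size <= oldSize) return
  const stream = fs.createReadStream(FILE, {
    start: oldSize,
    end: curr.size - 1, // `end` is an inclusive byte index
    encoding: 'utf8'
  })
  let chunk = ''
  stream.on('data', d => { chunk += d })
  stream.on('end', () => {
    oldSize = curr.size
    process.stdout.write(chunk) // hand this to the front end instead
  })
})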

How can I save a very large in-memory object to file?

I have a very large array with thousands of items
I tried this solution:
Create a file in memory for user to download, not through server
of creating an anchor to a text file.
~~JSON.stringify on the array caused the tab to freeze~~ Correction: trying to log the result caused the tab to freeze; stringify by itself works fine.
The data was originally in string form, but creating an anchor with that data resulted in a no-op. I'm assuming this is also because the data was too big, since using dummy data successfully triggered a file download.
How can I get this item onto my filesystem?
Edit/clarification:
There is a very large array that I can only access via the browser inspector/console. I can't access it via any other language.
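A common workaround when a data: URI anchor silently fails on large payloads is a Blob plus URL.createObjectURL, which avoids URL-length limits. A sketch to paste into the console, where bigArray stands in for the array you can see in the inspector:

// Serialize the in-memory array and trigger a download via an object URL.
const text = JSON.stringify(bigArray) // `bigArray` is the array from the inspector
const blob = new Blob([text], { type: 'application/json' })
const url = URL.createObjectURL(blob)
const a = document.createElement('a')
a.href = url
a.download = 'data.json' // suggested file name
document.body.appendChild(a)
a.click()
URL.revokeObjectURL(url)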
JavaScript in the browser does not let you read or write files on your machine directly, except for cookies, and the amount of data you are using exceeds the size limit for cookies. This is for security reasons.
However, languages such as PHP, Python and Ruby allow reading and writing files. It appears you are using binary data, so use binary file modes and write functions.
As to the choice of language: if you already know one, use that, or whichever you can get help with. Writing a file is a very basic operation, and all three languages are equally good at it. If you don't know any of these languages, you can literally copy and paste the code from their websites.
