How to best parse a text file using .split in JavaScript

I'm having problems using split to parse a text file.
The text file looks like this:
123.0 321.02
342.1 234.03
425.3 326.33
etc. etc.
When I read the file with a FileReader() and a readAsText call, its contents appear in a string like this:
"123.0 321.02\r\n342.1 234.03\r\n ..." (How it appears in Firebug)
Currently I'm trying to split it like this:
var reader = FileReader();
reader.readAsText(f);
alert(reader.result);
var readInStrings = reader.result.split(/|\s|\n|\r|/);
but when I do this, the resulting array has values as shown:
["123.0", "321.02", "", "342.1", "234.03", "" etc....]
Can anyone explain where the "" values in the array are coming from, and how to split such a file correctly so that only the number strings end up as values?
Any help would be greatly appreciated, thanks!
Note*: Currently doing this in javascript

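This is likely because the pattern splits on each newline and carriage-return character individually rather than on each run of such characters. The "\r\n" line ending is two characters, so splitting on both leaves an empty string between them. To prevent this, match the whole run with a quantifier in the regular expression, such as /\s+/ or something similar.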
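A minimal sketch of that suggestion, using the sample data from the question. The function name parseNumberStrings is hypothetical, and the trim() call is an added assumption to avoid a trailing empty string when the file ends with a line break.

```javascript
// Split on runs of whitespace so that "\r\n" counts as a single separator.
function parseNumberStrings(text) {
  const trimmed = text.trim(); // drop leading/trailing whitespace first
  if (trimmed === "") {
    return []; // an empty file yields no values
  }
  return trimmed.split(/\s+/);
}

const sample = "123.0 321.02\r\n342.1 234.03\r\n425.3 326.33\r\n";
const readInStrings = parseNumberStrings(sample);

// For comparison: splitting on single whitespace characters leaves an
// empty string between "\r" and "\n".
const naive = "123.0 321.02\r\n342.1 234.03".split(/\s/);

console.log(readInStrings);
console.log(naive);
```

In a browser, readAsText is asynchronous, so a function like this would be called on reader.result from inside the reader's onload handler.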

Related

Invalid JSON characters after processing JSON file in Vue.js

I am building a web-app where I can upload a JSON file, update it, then download it. The output JSON is not valid because some characters changed through the process. I don't know where I'm wrong because even when I only do upload => download without updates the JSON is still not valid...
This is how I read the uploaded JSON:
readFile: function () {
  var reader = new FileReader();
  reader.onload = function (event) {
    this.json = JSON.parse(event.target.result);
  }.bind(this);
  reader.readAsText(this.file);
}
Then I can edit (or not) the json object. Then I can download it with JSON.stringify(json).
When I try to read or validate the output JSON I get errors signaling invalid characters, for example:
"Invalid characters in string. Control characters must be escaped" for some lines in my editor.
"UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position X: invalid start byte" when I try to load it in Python with open('output.json') as json_file: data = json.load(json_file)
Does using JSON.parse and then JSON.stringify modify the encoding or structure of the JSON? How can I avoid this effect?
UPDATE:
The original file can contain characters like \u2013, \u2014, \u201d, \u00e7, but those characters are transformed into things like this � or invisible characters in the output JSON, which I guess makes it invalid.
Try adding 'UTF-8' as a second parameter to the readAsText function, as follows:
reader.readAsText(this.file,'UTF-8');
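As a quick check on the parse/stringify part of the question, the sketch below round-trips a small JSON string containing two of the characters mentioned in the update. The sample document and variable names are hypothetical. It suggests that JSON.parse followed by JSON.stringify keeps such characters intact as long as the bytes are written and read as UTF-8, so the damage more likely happens where the text is decoded or encoded.

```javascript
// Hypothetical sample: JSON text using \u escapes for "ç" and an en dash.
const source = '{"title":"fa\\u00e7ade \\u2013 draft"}';

const parsed = JSON.parse(source);      // escapes become real characters
const output = JSON.stringify(parsed);  // characters are emitted as-is

// Simulate saving and reloading the output as UTF-8 bytes.
const bytes = new TextEncoder().encode(output);
const reloaded = new TextDecoder("utf-8").decode(bytes);

console.log(parsed.title);
console.log(JSON.parse(reloaded).title === parsed.title);
```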

How to assign large string to a variable without ILLEGAL Token error?

I need to assign a long string (4 pages worth of text) to a variable, so far I've been doing it like this
var myText = "[SOME] Text goes \
.. here ? and 'there' \
is more ( to \
come etc. !)";
Backslashes need to be added at the end of every line of the text, and I can't imagine how long this would take to do manually. Also, I get an ILLEGAL token error on the first line for a reason I don't understand.
Therefore I wanted to find out the best way to handle this situation. I was looking into solutions of passing in a .txt file, but would rather do it as a really long string (this is not a production app). Also string shown in example is random, showing that there can be a lot of various characters in it that need to be accounted for.
You have to concatenate the string:
var t = ""
+"text line 1"
...
+"text line n"
But I would put the text in a text file and read it using XHR (on the client) or file I/O (on the server).
A regular quoted string literal cannot span multiple lines in JavaScript, but you have several options:
save your text in a file and read this file from your program
use the multiline npm module, which offers a hack that uses function comments as multi-line string definitions
use ES6 template strings, which support multi-line text - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/template_strings#Multi-line_strings
Saving the text in a file seems to me the preferred option in your case, since the text is very long and potentially comes from an untrusted source. You do not want the pasted text to close the string and start making inappropriate function calls.
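A minimal sketch of the template-string option, assuming an ES6-capable environment and reusing the sample text from the question. Any backtick or dollar-brace sequence inside the real text would need a backslash escape.

```javascript
// A template literal (backticks) may span several lines without any
// trailing backslashes; the line breaks become part of the string.
const myText = `[SOME] Text goes
.. here ? and 'there'
is more ( to
come etc. !)`;

const lines = myText.split("\n");
console.log(lines.length);
console.log(lines[0]);
```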

Javascript: Split File into an Array

I have a list of english words in a text file. I would like to create an array from this file separated by words:
var dictionaryWords = ["Apple","Orange","Banana","Strawberry"]
How can I do this in JavaScript? Please try to explain in beginner terms, since I'm still new to this! Thanks!
If you have the file contents in a variable, you use the split() method to split it into an array:
var dictionary = file_contents.split("\n");
\n is the newline character.
You can use AJAX to read the file from the server into a JavaScript variable. There are many AJAX tutorials on the web, so I'm not going to try to teach that here.
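A small sketch of the split step, assuming the file has one word per line and has already been loaded into a string. The name toWordList and the sample contents are hypothetical. It also accepts Windows line endings and skips blank lines, which a plain split("\n") would not.

```javascript
// Turn the text of a word-per-line file into an array of words.
function toWordList(fileContents) {
  return fileContents
    .split(/\r?\n/)                       // "\n" or "\r\n" line endings
    .map((line) => line.trim())           // remove stray spaces
    .filter((line) => line.length > 0);   // skip blank lines
}

const fileContents = "Apple\nOrange\r\nBanana\n\nStrawberry\n";
const dictionaryWords = toWordList(fileContents);
console.log(dictionaryWords);
```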

Convert cryptic string to a readable one with JavaScript (UTF-8)

I found out that when I save this distorted string ("Äußerungen üben") as an ANSI text file, then open it with Firefox and choose "Unicode" in the Firefox menu, it turns into readable German ("Äußerungen üben").
The same thing is possible with my text editor (Notepad++).
Is there any way to achieve this with JavaScript? E.g. the following would be nice:
var output = makeReadable("Äußerungen üben");
Unfortunately, I get these distorted strings from an external source which doesn't care about UTF-8 and provides all data as ANSI.
PS: Saving the file as UTF-8 and setting the charset as UTF-8 in the META Tag has no effect.
Edit:
Now I solved it through making a list of all common UTF8/ANSI distortions (more than 1300) and wrote a function replacing all wrong character combinations with the right character. It works fine :-) .
I think the encoding of the "distorted string" in your question got munged further by posting it here. But a quick Google search for "javascript convert from utf-8" returns this blog post as the top hit:
http://ecmanaut.blogspot.com/2006/07/encoding-decoding-utf8-in-javascript.html
So it turns out that encoding and decoding UTF-8 in JavaScript is really easy. This works great for me:
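function roundTrip(original) {
  // Encode: utf8 holds one character per UTF-8 byte and displays as mojibake.
  var utf8 = unescape(encodeURIComponent(original));
  // Decode: reinterpret those byte-valued characters as UTF-8 again.
  var output = decodeURIComponent(escape(utf8));
  return output;
}
var result = roundTrip("Äußerungen üben");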
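Since escape and unescape are deprecated, here is a sketch of the decoding half using TextDecoder instead, which is a different technique from the blog post's. It assumes every character of the distorted string carries exactly one raw byte (code 0 to 255), as happens when UTF-8 bytes are read as Latin-1. Text mis-read as Windows-1252 can contain characters outside that range and would need an extra mapping step, which is not shown. The name makeReadable comes from the question; the sample construction is hypothetical.

```javascript
// Reinterpret a string of byte-valued characters as UTF-8 text.
function makeReadable(distorted) {
  const bytes = new Uint8Array(distorted.length);
  for (let i = 0; i < distorted.length; i++) {
    const code = distorted.charCodeAt(i);
    if (code > 0xff) {
      // Assumption violated: this character is not a single raw byte.
      throw new RangeError("Character at index " + i + " is not a single byte");
    }
    bytes[i] = code;
  }
  return new TextDecoder("utf-8").decode(bytes);
}

// Build a distorted sample: the UTF-8 bytes of the text, one character per byte.
const utf8Bytes = new TextEncoder().encode("Äußerungen üben");
const distorted = String.fromCharCode(...utf8Bytes);

const readable = makeReadable(distorted);
console.log(readable);
```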

Using JavaScript, Can I upload a word file and use .replace then save as new document

Using JavaScript I would like to upload a word document and/or browse for file on local machine and view the contents... I would then like to replace the contents with different text.
Here is a snippet of the text replace I want to use.
<button onclick="myFunction()">Convert</button>
<script>
function myFunction() {
  var str = document.getElementById("source").value;
  var res = str
    .replace(/a/g, "ა")
    .replace(/b/g, "ბ")
    .replace(/g/g, "გ");
  // + more letters for the entire alphabet
  document.getElementById("source").value = res;
}
</script>
What I would like to know is if it's possible to get the contents of a word document file, change all of the letters into Georgian characters (whilst retaining formatting if possible) then to save as a new word document?
For .docx files you could use DOCX.js: https://github.com/stephen-hardy/DOCX.js
If you use a .docx file this should be possible, since a .docx is a ZIP archive of XML files. You might want to use the jQuery XML parser (http://api.jquery.com/jQuery.parseXML/) or get the document's contents as an XML string. With larger documents this might not be the best solution.
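Whichever way the text is extracted, the letter replacement itself can be done in one pass with a lookup table instead of one replace call per letter. This is only a sketch: the names letterMap and transliterate are hypothetical, and the table holds just the three mappings shown in the question.

```javascript
// Lookup table with the three mappings from the question; extend as needed.
const letterMap = new Map([
  ["a", "ა"],
  ["b", "ბ"],
  ["g", "გ"],
]);

// Replace every mapped letter in a single pass; other characters are kept.
function transliterate(text) {
  return text.replace(/[a-z]/g, (ch) => (letterMap.has(ch) ? letterMap.get(ch) : ch));
}

console.log(transliterate("bag"));
```

Applying such a function only to the text content of the XML, not to the tag names or attributes, is what would keep the document's formatting intact.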
