How to read data from a *.CSV file using JavaScript? - javascript

My CSV data looks like this:
heading1,heading2,heading3,heading4,heading5
value1_1,value2_1,value3_1,value4_1,value5_1
value1_2,value2_2,value3_2,value4_2,value5_2
...
How do you read this data and convert to an array like this using JavaScript?:
[
heading1: value1_1,
heading2: value2_1,
heading3: value3_1,
heading4: value4_1
heading5: value5_1
],[
heading1: value1_2,
heading2: value2_2,
heading3: value3_2,
heading4: value4_2,
heading5: value5_2
]
....
I've tried this code but no luck!:
<script type="text/javascript">
var allText =[];
var allTextLines = [];
var Lines = [];
var txtFile = new XMLHttpRequest();
txtFile.open("GET", "file://d:/data.txt", true);
txtFile.onreadystatechange = function()
{
allText = txtFile.responseText;
allTextLines = allText.split(/\r\n|\n/);
};
document.write(allTextLines);
document.write(allText);
document.write(txtFile);
</script>

No need to write your own...
The jQuery-CSV library has a function called $.csv.toObjects(csv) that does the mapping automatically.
Note: The library is designed to handle any CSV data that is RFC 4180 compliant, including all of the nasty edge cases that most 'simple' solutions overlook.
As @Blazemonger already stated, first you need to add line breaks to make the data valid CSV.
Using the following dataset:
heading1,heading2,heading3,heading4,heading5
value1_1,value2_1,value3_1,value4_1,value5_1
value1_2,value2_2,value3_2,value4_2,value5_2
Use the code:
var data = $.csv.toObjects(csv);
The output saved in 'data' will be:
[
  { heading1: "value1_1", heading2: "value2_1", heading3: "value3_1", heading4: "value4_1", heading5: "value5_1" },
  { heading1: "value1_2", heading2: "value2_2", heading3: "value3_2", heading4: "value4_2", heading5: "value5_2" }
]
Note: Technically, the way you wrote the key-value mapping is invalid JavaScript. The objects containing the key-value pairs should be wrapped in curly braces.
If you want to try it out for yourself, I suggest you take a look at the Basic Usage Demonstration under the 'toObjects()' tab.
Disclaimer: I'm the original author of jQuery-CSV.
Update:
Edited to use the dataset that the op provided and included a link to the demo where the data can be tested for validity.
Update2:
Due to the shuttering of Google Code, jquery-csv has moved to GitHub.
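For reference, here is a minimal self-contained sketch (assuming the cdnjs builds of jQuery and jquery-csv that are also used further down this page):
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery-csv/1.0.11/jquery.csv.min.js"></script>
<script>
    var csv = "heading1,heading2,heading3,heading4,heading5\n" +
              "value1_1,value2_1,value3_1,value4_1,value5_1\n" +
              "value1_2,value2_2,value3_2,value4_2,value5_2";
    var data = $.csv.toObjects(csv);
    console.log(data); // [{ heading1: "value1_1", ... }, { heading1: "value1_2", ... }]
</script>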

NOTE: I concocted this solution before I was reminded about all the "special cases" that can occur in a valid CSV file, like escaped quotes. I'm leaving my answer for those who want something quick and dirty, but I recommend Evan's answer for accuracy.
This code will work when your data.txt file is one long string of comma-separated entries, with no newlines:
data.txt:
heading1,heading2,heading3,heading4,heading5,value1_1,...,value5_2
javascript:
$(document).ready(function() {
    $.ajax({
        type: "GET",
        url: "data.txt",
        dataType: "text",
        success: function(data) { processData(data); }
    });
});

function processData(allText) {
    var record_num = 5; // or however many elements there are in each row
    var allTextLines = allText.split(/\r\n|\n/);
    var entries = allTextLines[0].split(',');
    var lines = [];

    var headings = entries.splice(0, record_num);
    while (entries.length > 0) {
        var tarr = [];
        for (var j = 0; j < record_num; j++) {
            tarr.push(headings[j] + ":" + entries.shift());
        }
        lines.push(tarr);
    }
    // alert(lines);
}
The following code will work on a "true" CSV file with linebreaks between each set of records:
data.txt:
heading1,heading2,heading3,heading4,heading5
value1_1,value2_1,value3_1,value4_1,value5_1
value1_2,value2_2,value3_2,value4_2,value5_2
javascript:
$(document).ready(function() {
    $.ajax({
        type: "GET",
        url: "data.txt",
        dataType: "text",
        success: function(data) { processData(data); }
    });
});

function processData(allText) {
    var allTextLines = allText.split(/\r\n|\n/);
    var headers = allTextLines[0].split(',');
    var lines = [];

    for (var i = 1; i < allTextLines.length; i++) {
        var data = allTextLines[i].split(',');
        if (data.length == headers.length) {
            var tarr = [];
            for (var j = 0; j < headers.length; j++) {
                tarr.push(headers[j] + ":" + data[j]);
            }
            lines.push(tarr);
        }
    }
    // alert(lines);
}
http://jsfiddle.net/mblase75/dcqxr/
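If you want actual objects rather than "heading:value" strings, a small variation on the second function (a sketch) would be:
function processDataToObjects(allText) {
    var allTextLines = allText.split(/\r\n|\n/);
    var headers = allTextLines[0].split(',');
    var rows = [];
    for (var i = 1; i < allTextLines.length; i++) {
        var data = allTextLines[i].split(',');
        if (data.length == headers.length) {
            var row = {};
            for (var j = 0; j < headers.length; j++) {
                row[headers[j]] = data[j];
            }
            rows.push(row);
        }
    }
    return rows; // e.g. [{ heading1: "value1_1", ... }, ...]
}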

Don't split on commas -- it won't work for most CSV files, and this question has wayyyy too many views for the asker's kind of input data to apply to everyone. Parsing CSV is kind of scary since there's no truly official standard, and lots of delimited text writers don't consider edge cases.
This question is old, but I believe there's a better solution now that Papa Parse is available. It's a library I wrote, with help from contributors, that parses CSV text or files. It's the only JS library I know of that supports files gigabytes in size. It also handles malformed input gracefully.
A 1 GB file parsed in about 1 minute. (Update: With Papa Parse 4, the same file took only about 30 seconds in Firefox. Papa Parse 4 is now the fastest known CSV parser for the browser.)
Parsing text is very easy:
var data = Papa.parse(csvString);
Parsing files is also easy:
Papa.parse(file, {
    complete: function(results) {
        console.log(results);
    }
});
Streaming files is similar (here's an example that streams a remote file):
Papa.parse("http://example.com/bigfoo.csv", {
download: true,
step: function(row) {
console.log("Row:", row.data);
},
complete: function() {
console.log("All done!");
}
});
If your web page locks up during parsing, Papa can use web workers to keep your site responsive.
Papa can auto-detect delimiters and match values up with header columns, if a header row is present. It can also turn numeric values into actual number types. It appropriately parses line breaks and quotes and other weird situations, and even handles malformed input as robustly as possible. I've drawn on inspiration from existing libraries to make Papa, so props to other JS implementations.
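For example, here is a sketch combining the documented header, dynamicTyping, and worker options:
Papa.parse(file, {
    header: true,        // map each row's values to the heading names
    dynamicTyping: true, // convert numeric fields into actual numbers
    worker: true,        // parse in a web worker so the page stays responsive
    complete: function(results) {
        console.log(results.data);   // array of row objects
        console.log(results.errors); // any rows Papa flagged as malformed
    }
});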

I am using d3.js for parsing CSV files. It's very easy to use.
Here is the docs.
Steps:
npm install d3-request
Using ES6:
import { csv } from 'd3-request';
import url from 'path/to/data.csv';
csv(url, function(err, data) {
    console.log(data);
});
Please see docs for more.
Update: d3-request is deprecated; you can use d3-fetch instead.
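A minimal sketch with d3-fetch (its csv function returns a promise of row objects, with the column names available on data.columns):
import { csv } from 'd3-fetch';

csv('path/to/data.csv').then(data => {
    console.log(data);         // array of row objects
    console.log(data.columns); // column names from the header row
});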

Here's a JavaScript function that parses CSV data, accounting for commas found inside quotes.
// Parse a CSV row, accounting for commas inside quotes
function parse(row) {
    var insideQuote = false,
        entries = [],
        entry = [];
    row.split('').forEach(function(character) {
        if (character === '"') {
            insideQuote = !insideQuote;
        } else {
            if (character == "," && !insideQuote) {
                entries.push(entry.join(''));
                entry = [];
            } else {
                entry.push(character);
            }
        }
    });
    entries.push(entry.join(''));
    return entries;
}
Example use of the function to parse a CSV file that looks like this:
"foo, the column",bar
2,3
"4, the value",5
into arrays:
// csv could contain the content read from a csv file
var csv = '"foo, the column",bar\n2,3\n"4, the value",5',

    // Split the input into lines
    lines = csv.split('\n'),

    // Extract column names from the first line
    columnNamesLine = lines[0],
    columnNames = parse(columnNamesLine),

    // Extract data from subsequent lines
    dataLines = lines.slice(1),
    data = dataLines.map(parse);

// Prints ["foo, the column","bar"]
console.log(JSON.stringify(columnNames));

// Prints [["2","3"],["4, the value","5"]]
console.log(JSON.stringify(data));
Here's how you can transform the data into objects, like D3's csv parser (which is a solid third party solution):
var dataObjects = data.map(function(arr) {
    var dataObject = {};
    columnNames.forEach(function(columnName, i) {
        dataObject[columnName] = arr[i];
    });
    return dataObject;
});

// Prints [{"foo":"2","bar":"3"},{"foo":"4","bar":"5"}]
console.log(JSON.stringify(dataObjects));
Here's a working fiddle of this code.
Enjoy! --Curran
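For comparison, here is roughly the same parse done with D3's parser (a sketch using csvParse from the d3-dsv module):
import { csvParse } from 'd3-dsv';

const parsed = csvParse('"foo, the column",bar\n2,3\n"4, the value",5');
console.log(parsed.columns); // ["foo, the column", "bar"]
console.log(parsed);         // [{ "foo, the column": "2", bar: "3" }, ...]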

You can use PapaParse to help.
https://www.papaparse.com/
Here is a CodePen.
https://codepen.io/sandro-wiggers/pen/VxrxNJ
Papa.parse(e, {
    header: true,
    before: function(file, inputElem) { console.log('Attempting to Parse...') },
    error: function(err, file, inputElem, reason) { console.log(err); },
    complete: function(results, file) { $.PAYLOAD = results; }
});
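For context, here is a sketch of how you might wire that up to a file input (the element id is illustrative):
document.querySelector('#csv-file-input').addEventListener('change', function(evt) {
    var file = evt.target.files[0];
    if (!file) { return; }
    Papa.parse(file, {
        header: true,
        complete: function(results) {
            console.log(results.data); // array of row objects
        }
    });
});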

If you want to solve this without using Ajax, use the FileReader() Web API.
Example implementation:
Select .csv file
See output
function readSingleFile(e) {
    var file = e.target.files[0];
    if (!file) {
        return;
    }
    var reader = new FileReader();
    reader.onload = function(e) {
        var contents = e.target.result;
        displayContents(contents);
        displayParsed(contents);
    };
    reader.readAsText(file);
}

function displayContents(contents) {
    var element = document.getElementById('file-content');
    element.textContent = contents;
}

function displayParsed(contents) {
    const element = document.getElementById('file-parsed');
    const json = contents.split(',');
    element.textContent = JSON.stringify(json);
}

document.getElementById('file-input').addEventListener('change', readSingleFile, false);
<input type="file" id="file-input" />
<h3>Raw contents of the file:</h3>
<pre id="file-content">No data yet.</pre>
<h3>Parsed file contents:</h3>
<pre id="file-parsed">No data yet.</pre>

function CSVParse(csvFile)
{
    this.rows = [];
    // Backslashes must be doubled when building a RegExp from a string literal
    var fieldRegEx = new RegExp('(?:\\s*"((?:""|[^"])*)"\\s*|\\s*((?:""|[^",\\r\\n])*(?:""|[^"\\s,\\r\\n]))?\\s*)(,|[\\r\\n]+|$)', "g");
    var row = [];
    var currMatch = null;
    while (currMatch = fieldRegEx.exec(csvFile))
    {
        row.push([currMatch[1], currMatch[2]].join('')); // concatenate with potential nulls
        if (currMatch[3] != ',')
        {
            this.rows.push(row);
            row = [];
        }
        if (currMatch[3].length == 0)
            break;
    }
}
I like to have the regex do as much as possible. This regex treats each item as either quoted or unquoted, followed by either a column delimiter, a row delimiter, or the end of text.
That's why the last condition is there: without it, this would be an infinite loop, since the pattern can match a zero-length field (totally valid in CSV). But since $ is a zero-length assertion, it won't progress to a non-match and end the loop.
And FYI, I had to make the second alternative exclude quotes surrounding the value; it seemed to execute before the first alternative on my JavaScript engine and treat the quotes as part of the unquoted value. I won't ask -- just got it to work.
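A quick usage sketch of the constructor above:
// Parse two rows, one containing a quoted comma
var parsed = new CSVParse('a,"b,c"\r\n1,2');
console.log(parsed.rows); // [["a", "b,c"], ["1", "2"]]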

Per the accepted answer,
I got this to work by changing the 1 to a 0 here:
for (var i=1; i<allTextLines.length; i++) {
changed to
for (var i=0; i<allTextLines.length; i++) {
It will treat a file with one continuous line as having an allTextLines.length of 1. So if the loop starts at 1 and only runs while i is less than 1, it never runs. Hence the blank alert box.

$(function() {
    $("#upload").bind("click", function() {
        var regex = /^([a-zA-Z0-9\s_\\.\-:])+(.csv|.xlsx)$/;
        if (regex.test($("#fileUpload").val().toLowerCase())) {
            if (typeof (FileReader) != "undefined") {
                var reader = new FileReader();
                reader.onload = function(e) {
                    var customers = new Array();
                    var rows = e.target.result.split("\r\n");
                    for (var i = 0; i < rows.length - 1; i++) {
                        var cells = rows[i].split(",");
                        if (cells[0] == "" || cells[0] == undefined) {
                            var s = customers[customers.length - 1];
                            s.Ord.push(cells[2]);
                        } else {
                            var dt = customers.find(x => x.Number === cells[0]);
                            if (dt == undefined) {
                                if (cells.length > 1) {
                                    var customer = {};
                                    customer.Number = cells[0];
                                    customer.Name = cells[1];
                                    customer.Ord = new Array();
                                    customer.Ord.push(cells[2]);
                                    customer.Point_ID = cells[3];
                                    customer.Point_Name = cells[4];
                                    customer.Point_Type = cells[5];
                                    customer.Set_ORD = cells[6];
                                    customers.push(customer);
                                }
                            } else {
                                dt.Ord.push(cells[2]);
                            }
                        }
                    }
                    console.log(customers); // the parsed result
                };
                // kick off reading the selected file
                reader.readAsText($("#fileUpload")[0].files[0]);
            } else {
                alert("This browser does not support HTML5.");
            }
        } else {
            alert("Please upload a valid CSV file.");
        }
    });
});

Actually you can use a light-weight library called any-text.
Install the dependency:
npm i -D any-text
Then read files like this:
var reader = require('any-text');
reader.getText(`path-to-file`).then(function (data) {
console.log(data);
});
Or use async/await:
var reader = require('any-text');
const chai = require('chai');
const expect = chai.expect;
describe('file reader checks', () => {
    it('check csv file content', async () => {
        expect(
            await reader.getText(`${process.cwd()}/test/files/dummy.csv`)
        ).to.contains('Lorem ipsum');
    });
});

This is an old question, and in 2022 there are many ways to achieve this. First, I think D3 is one of the best alternatives for data manipulation. It's open source and free to use, and it's also modular, so we can import just the fetch module.
Here is a basic example. We will use the legacy mode, so I will import the entire D3 library. Now let's call the d3.csv function and it's done. This function internally calls the fetch method, so it can open data URLs, urls, files, blobs, and so on.
const fileInput = document.getElementById('csv')
const outElement = document.getElementById('out')

const previewCSVData = async dataUrl => {
    const d = await d3.csv(dataUrl)
    console.log({ d })
    outElement.textContent = d.columns
}

const readFile = e => {
    const file = fileInput.files[0]
    const reader = new FileReader()
    reader.onload = () => {
        const dataUrl = reader.result;
        previewCSVData(dataUrl)
    }
    reader.readAsDataURL(file)
}

fileInput.onchange = readFile
<script type="text/javascript" src="https://unpkg.com/d3@7.6.1/dist/d3.min.js"></script>
<div>
<p>Select local CSV File:</p>
<input id="csv" type="file" accept=".csv">
</div>
<pre id="out"><p>File headers will appear here</p></pre>
If we don't want to use any library and just want plain JavaScript (Vanilla JS), and we have managed to get the text content of a file as data, we can implement a simple parser instead of d3. It splits the data into an array of lines, extracts the first line and splits it into a headers array, and treats the remaining lines as the rows to process. Then we map each line, extract its values, and build a row object by mapping each header to its corresponding value from values[index].
NOTE:
We are also going to use a little trick: arrays in JavaScript are objects, so they can carry extra properties. We will define a rows.headers property and assign the headers to it.
const data = `heading_1,heading_2,heading_3,heading_4,heading_5
value_1_1,value_2_1,value_3_1,value_4_1,value_5_1
value_1_2,value_2_2,value_3_2,value_4_2,value_5_2
value_1_3,value_2_3,value_3_3,value_4_3,value_5_3`

const csvParser = data => {
    const text = data.split(/\r\n|\n/)
    const [first, ...lines] = text
    const headers = first.split(',')
    const rows = []
    rows.headers = headers
    lines.forEach(line => {
        const values = line.split(',')
        const row = Object.fromEntries(headers.map((header, i) => [header, values[i]]))
        rows.push(row)
    })
    return rows
}

const d = csvParser(data)

// Accessing the headers property
const headers = d.headers
console.log({ headers })
console.log({ d })
Finally, let's implement a vanilla JS file loader that uses fetch and parses the CSV file.
const fetchFile = async dataURL => {
    return await fetch(dataURL).then(response => response.text())
}

const csvParser = data => {
    const text = data.split(/\r\n|\n/)
    const [first, ...lines] = text
    const headers = first.split(',')
    const rows = []
    rows.headers = headers
    lines.forEach(line => {
        const values = line.split(',')
        const row = Object.fromEntries(headers.map((header, i) => [header, values[i]]))
        rows.push(row)
    })
    return rows
}

const fileInput = document.getElementById('csv')
const outElement = document.getElementById('out')

const previewCSVData = async dataURL => {
    const data = await fetchFile(dataURL)
    const d = csvParser(data)
    console.log({ d })
    outElement.textContent = d.headers
}

const readFile = e => {
    const file = fileInput.files[0]
    const reader = new FileReader()
    reader.onload = () => {
        const dataURL = reader.result;
        previewCSVData(dataURL)
    }
    reader.readAsDataURL(file)
}

fileInput.onchange = readFile
<script type="text/javascript" src="https://unpkg.com/d3@7.6.1/dist/d3.min.js"></script>
<div>
<p>Select local CSV File:</p>
<input id="csv" type="file" accept=".csv">
</div>
<pre id="out"><p>File contents will appear here</p></pre>
I used this file to test it

Here is another way to read an external CSV into Javascript (using jQuery).
It's a little more long-winded, but I feel that by reading the data into arrays you can follow the process exactly, and it makes for easy troubleshooting.
Might help someone else.
The data file example:
Time,data1,data2,data3
08/11/2015 07:30:16,602,0.009,321
And here is the code:
$(document).ready(function() {
    // AJAX in the data file
    $.ajax({
        type: "GET",
        url: "data.csv",
        dataType: "text",
        success: function(data) { processData(data); }
    });

    // Let's process the data from the data file
    function processData(data) {
        var lines = data.split(/\r\n|\n/);

        // Set up the data arrays
        var time = [];
        var data1 = [];
        var data2 = [];
        var data3 = [];

        var headings = lines[0].split(','); // Split up the first row to get the headings

        for (var j = 1; j < lines.length; j++) {
            var values = lines[j].split(','); // Split up the comma-separated values

            // We read the time and the 1st, 2nd and 3rd data columns
            time.push(values[0]); // Read in as string
            // Recommended to read in as float, since we'll be doing some operations on this later.
            data1.push(parseFloat(values[1]));
            data2.push(parseFloat(values[2]));
            data3.push(parseFloat(values[3]));
        }

        // For display
        var x = 0;
        console.log(headings[0] + " : " + time[x] +
                    headings[1] + " : " + data1[x] +
                    headings[2] + " : " + data2[x] +
                    headings[3] + " : " + data3[x]);
    }
});
Hope this helps someone in the future!

A bit late but I hope it helps someone.
Some time ago I faced a problem where the string data contained \n in between, and while reading the file it was read as separate lines.
Eg.
"Harry\nPotter","21","Gryffindor"
While-Reading:
Harry
Potter,21,Gryffindor
I used the csvtojson library in my Angular project to solve this problem.
You can read the CSV file as a string using the following code and then pass that string to the csvtojson library, and it will give you a list of JSON objects.
Sample Code:
const csv = require('csvtojson');

if (files && files.length > 0) {
    const file: File = files.item(0);
    const reader: FileReader = new FileReader();
    reader.readAsText(file);
    reader.onload = (e) => {
        const csvs: string = reader.result as string;
        csv({
            output: "json",
            noheader: false
        }).fromString(csvs)
            .preFileLine((fileLine, idx) => {
                // Convert the csv header row to lowercase before parsing to json
                if (idx === 0) { return fileLine.toLowerCase() }
                return fileLine;
            })
            .then((result) => {
                // list of json objects in result
            });
    };
}

I use jquery-csv to do this, and I provide two examples below.
async function ReadFile(file) {
    return await file.text()
}

function removeExtraSpace(stringData) {
    stringData = stringData.replace(/,( *)/gm, ",")  // remove extra spaces after commas
    stringData = stringData.replace(/^ *| *$/gm, "") // remove spaces at the beginning and end of each line
    return stringData
}

function simpleTest() {
    let data = `Name, Age, msg
foo, 25, hello world
bar, 18, "!! 🐬 !!"
`
    data = removeExtraSpace(data)
    console.log(data)
    const options = {
        separator: ",", // default ",". (You may want tab "\t" or something else.)
        delimiter: '"', // default "
        headers: true   // default true
    }
    // const myObj = $.csv.toObjects(data, options)
    const myObj = $.csv.toObjects(data) // If you want the default options, you can omit them.
    console.log(myObj)
}

window.onload = () => {
    const inputFile = document.getElementById("uploadFile")
    inputFile.onchange = () => {
        const inputValue = inputFile.value
        if (inputValue === "") {
            return
        }
        const selectedFile = document.getElementById('uploadFile').files[0]
        // ReadFile is async, so wait for the file reading to finish.
        ReadFile(selectedFile).then(fileContent => {
            console.log(fileContent)
            fileContent = removeExtraSpace(fileContent)
            const myObj = $.csv.toObjects(fileContent)
            console.log(myObj)
        })
    }
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery-csv/1.0.11/jquery.csv.min.js"></script>
<label for="uploadFile">Demo 1</label>
<input type="file" id="uploadFile" accept=".csv"/>
<button onclick="simpleTest()">Demo 2</button>

With this csvToObjs function you can transform CSV data entries into an array of objects.
function csvToObjs(data) {
    const lines = data.split(/\r\n|\n/);
    let [headings, ...entries] = lines;
    headings = headings.split(',');
    const objs = [];
    entries.forEach(entry => {
        const values = entry.split(',');
        objs.push(Object.fromEntries(headings.map((head, i) => [head, values[i]])));
    });
    return objs;
}

const data = `heading1,heading2,heading3,heading4,heading5
value1_1,value2_1,value3_1,value4_1,value5_1
value1_2,value2_2,value3_2,value4_2,value5_2`

console.log(csvToObjs(data));

Related

Read a csv or excel (xlsx) file with just javascript and html?

Is it possible to read an Excel xlsx or csv, preferably xlsx, using just JavaScript and HTML? All the solutions I have found (SheetJS, d3 (d3 uses the Fetch API)) require a web server. I understand I can get a simple web server using Web Server for Chrome, or Python, or node.js. Furthermore, I understand I can run Chrome with certain flags, but I would like not to do this because of security concerns. I am building a demo for someone who is not web savvy and would like to avoid doing this.
My file structure is very simple:
TestFolder
| index.html
| js/
| test.js
| data/
| test.xlsx
| css/
| test.css
I simply need to read the xlsx and then display that data in html page.
I've added a simple example that accepts Excel or CSV files (the current example accepts a single file), uses the SheetJS library to parse the Excel file type, converts the data to JSON, and logs the contents to the console.
This should be more than enough to complete your demo. Hope this helps!
var file = document.getElementById('docpicker')
var viewer = document.getElementById('dataviewer')
file.addEventListener('change', importFile);

function importFile(evt) {
    var f = evt.target.files[0];
    if (f) {
        var r = new FileReader();
        r.onload = e => {
            var contents = processExcel(e.target.result);
            console.log(contents)
        }
        r.readAsBinaryString(f);
    } else {
        console.log("Failed to load file");
    }
}

function processExcel(data) {
    var workbook = XLSX.read(data, {
        type: 'binary'
    });
    return to_json(workbook);
}

function to_json(workbook) {
    var result = {};
    workbook.SheetNames.forEach(function(sheetName) {
        var roa = XLSX.utils.sheet_to_json(workbook.Sheets[sheetName], {
            header: 1
        });
        if (roa.length) result[sheetName] = roa;
    });
    return JSON.stringify(result, null, 2);
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/xlsx/0.14.3/xlsx.full.min.js"></script>
<label for="avatar">Choose an Excel or CSV file:</label>
<input type="file" id="docpicker" accept=".csv,application/vnd.ms-excel,.xlt,application/vnd.ms-excel,.xla,application/vnd.ms-excel,.xlsx,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,.xltx,application/vnd.openxmlformats-officedocument.spreadsheetml.template,.xlsm,application/vnd.ms-excel.sheet.macroEnabled.12,.xltm,application/vnd.ms-excel.template.macroEnabled.12,.xlam,application/vnd.ms-excel.addin.macroEnabled.12,.xlsb,application/vnd.ms-excel.sheet.binary.macroEnabled.12">
<div id="dataviewer">
You could try using the Fetch API to download the file and process it with JavaScript.
fetch('data/test.xlsx').then(function(resp) {
// Process the data here...
console.log('Data Response: ', resp);
});
It would be much easier to work with if your data file was in JSON format, but this might work for your needs.
Update - Example when the data is in JSON format
fetch('data/test.xlsx').then(function(resp) {
    return resp.json(); // Assuming that we receive a JSON array.
}).then(function(records) {
    console.log('Records: ', records.length);
    records.forEach(function(record) {
        console.log('Record Name: ', record.name); // Assuming each record has a name property
    });
});
Here is how I ended up doing it:
I got an error with readAsBinaryString, so I went with the approach below. I noted that sheet_to_json didn't work with CSV, so I ran it first, checked the results, and parsed with sheet_to_csv if sheet_to_json returned 0 rows.
HTML:
<!-- SheetsJS CSV & XLSX -->
<script src="xlsx/xlsx.full.min.js"></script>
<!-- SheetsJS CSV & XLSX -->
<!-- CSV/XLSX -->
<div class="ms-font-xl ms-settings__content__subtitle">CSV/XLSX Upload:</div>
<input type="file" id="csv-xlsx-file" accept=".csv,application/vnd.ms-excel,.xlt,application/vnd.ms-excel,.xla,application/vnd.ms-excel,.xlsx,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,.xltx,application/vnd.openxmlformats-officedocument.spreadsheetml.template,.xlsm,application/vnd.ms-excel.sheet.macroEnabled.12,.xltm,application/vnd.ms-excel.template.macroEnabled.12,.xlam,application/vnd.ms-excel.addin.macroEnabled.12,.xlsb,application/vnd.ms-excel.sheet.binary.macroEnabled.12">
<!-- CSV/XLSX -->
JS:
var csv_file_elm = document.getElementById("csv-xlsx-file")
csv_file_elm.addEventListener('change', CSV_XLSX_File_Selected_Event)

async function CSV_XLSX_File_Selected_Event() {
    var id = this.id
    var inputElement = document.getElementById(id)
    let ext = inputElement.value
    ext = ext.split(".")
    ext = ext[ext.length - 1]
    var files = inputElement.files || [];
    if (!files.length) return;
    var file = files[0];
    var reader = new FileReader();
    reader.onloadend = async function (event) {
        var arrayBuffer = reader.result;
        var options = { type: 'array' };
        var workbook = XLSX.read(arrayBuffer, options);
        //console.timeEnd();
        var sheetName = workbook.SheetNames[0] // use the first sheet
        var sheet = workbook.Sheets[sheetName]
        var sheet_to_html = XLSX.utils.sheet_to_html(sheet)
        var sheet_to_json = XLSX.utils.sheet_to_json(sheet)
        var results
        if (sheet_to_json.length === 0) {
            results = [XLSX.utils.sheet_to_csv(sheet)]
        }
        if (sheet_to_json.length > 0) {
            results = sheet_to_json
        }
        let Parsed_File_Obj = {
            "sheet_to_html": sheet_to_html,
            "results": results,
            "ext": ext,
        }
        console.log('Parsed_File_Obj')
        console.log(Parsed_File_Obj)
    };
    reader.readAsArrayBuffer(file);
}

Export pdf UTF-8 encoding

I have some trouble with exporting data to PDF with Arabic characters; the encoding is incorrect.
I tried jsPDF with the jsPDF-AutoTable plugin, and I tried pdfMake, but the problem still exists.
In addition, I am using ASP.NET Boilerplate v3.2.4 as a backend and AngularJS v1.7.5.
I want to export ui-grid data to PDF.
Here, look at my AngularJS code:
vm.exportPdf = function() {
    var columns = [];
    var rows = [];

    // copy ui-grid's titles to pdf's table definition:
    var allColumnDefs = vm.gridOptions.columnDefs;
    for (var columnIdx in allColumnDefs) {
        var columnDef = allColumnDefs[columnIdx];
        if (columnDef.name !== 'actions') {
            var newColumnDef = {
                title: columnDef.displayName,
                dataKey: columnDef.name
            };
            columns.push(newColumnDef);
        }
    }

    // copy ui-grid's actual data to pdf's table:
    var allRecords = vm.gridOptions.data;
    for (var recordIdx in allRecords) {
        var record = allRecords[recordIdx];
        var newRow = {};
        for (var columnIdx1 in allColumnDefs) {
            var columnDef1 = allColumnDefs[columnIdx1];
            var value = record[columnDef1.name];
            if (value !== null) {
                newRow[columnDef1.name] = value;
            }
        }
        rows.push(newRow);
    }

    var docName = 'myFile.pdf';
    var doc = new jsPDF('p', 'pt');
    doc.autoTable(columns, rows, { styles: { fontSize: 8.5 } });
    doc.save(docName);
};
OUTPUT: Arabic characters look like this:
ΓΎΓΏ41C) 'DG1E DD-H'D'
So my question is whether anyone has experienced this problem and what the solution for it is, or whether there are other libraries or plugins for exporting to PDF that support UTF-8 and Arabic characters.
Thanks.

How do I read a large csv file using exceljs

I'm using exceljs with CSV and Excel files in my project. Recently the files have become really large and I am having trouble reading them as normal.
The following code works well for small files, but it can't read a million rows.
I realized I need to use a stream, but I don't understand how it works. I would like to use it for the same case, just to be able to read large files. Can someone help please?
var Excel = require('exceljs');
var workbook = new Excel.Workbook();

readSecurityCSV(workbook);

var header = {}
var adSecurities = []

function readSecurityCSV(workbook) {
    workbook.csv.readFile('./csv/clientsOrders.csv')
        .then(worksheet => {
            worksheet.eachRow({ includeEmpty: true }, function(row, rowNumber) {
                if (rowNumber == 1) {
                    header = {}
                    row.eachCell({ includeEmpty: true }, function(cell, cellNumber) {
                        header[cellNumber] = cell
                    });
                } else {
                    var currentSecurity = {}
                    row.eachCell({ includeEmpty: true }, function(cell, cellNumber) {
                        currentSecurity[header[cellNumber].value] = cell.value
                    });
                    currentSecurity.rowNumber = rowNumber
                    adSecurities.push(currentSecurity)
                }
            });
            console.log(color.blue("adSecurity"), adSecurities[0]);
        })
}
You can split the csv file and then loop through the chunks.
The splitting can be done like this:
var chunkSize = 1024 * 1024;
var fileSize = file.size;
var chunks = Math.ceil(fileSize / chunkSize);
var chunk = 0;

while (chunk < chunks) {
    var offset = chunk * chunkSize;
    console.log(file.slice(offset, offset + chunkSize));
    chunk++;
}
If you want to work with multiple promises in parallel, check this link [1]
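If you are in Node, another option (a sketch, independent of exceljs) is to stream the CSV line by line with the built-in readline module; note the naive comma split does not handle quoted fields:
const fs = require('fs');
const readline = require('readline');

async function streamCsv(path) {
    const rl = readline.createInterface({
        input: fs.createReadStream(path),
        crlfDelay: Infinity // treat \r\n as a single line break
    });
    let header = null;
    const rows = [];
    for await (const line of rl) {
        const cells = line.split(','); // naive split; no quoted-comma handling
        if (!header) { header = cells; continue; }
        const row = {};
        header.forEach((h, i) => { row[h] = cells[i]; });
        rows.push(row);
    }
    return rows;
}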

Getting byte array through input type = file

var profileImage = fileInputInByteArray;
$.ajax({
url: 'abc.com/',
type: 'POST',
dataType: 'json',
data: {
// Other data
ProfileImage: profileimage
// Other data
},
success: {
}
})
// Code in WebAPI
[HttpPost]
public HttpResponseMessage UpdateProfile([FromUri]UpdateProfileModel response) {
//...
return response;
}
public class UpdateProfileModel {
// ...
public byte[] ProfileImage {get ;set; }
// ...
}
<input type="file" id="inputFile" />
I am using an ajax call to post the byte[] value of an input type = file input to a Web API which receives it in byte[] format. However, I am having difficulty getting the byte array. I am expecting that we can get the byte array through the File API.
Note: I need to store the byte array in a variable first before passing it through the ajax call.
[Edit]
As noted in the comments above, even though it is still available in some UA implementations, the readAsBinaryString method never made its way into the specs and should not be used in production.
Instead, use readAsArrayBuffer and loop through its buffer to get back the binary string:
document.querySelector('input').addEventListener('change', function() {
    var reader = new FileReader();
    reader.onload = function() {
        var arrayBuffer = this.result,
            array = new Uint8Array(arrayBuffer),
            binaryString = String.fromCharCode.apply(null, array);
        console.log(binaryString);
    }
    reader.readAsArrayBuffer(this.files[0]);
}, false);
<input type="file" />
<div id="result"></div>
For a more robust way to convert your arrayBuffer to a binary string, you can refer to this answer.
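The usual trick (a sketch) is to convert in chunks, since String.fromCharCode.apply can exceed the engine's argument limit on large buffers:
function arrayBufferToBinaryString(buffer) {
    var bytes = new Uint8Array(buffer);
    var chunkSize = 0x8000; // stay under the engine's maximum argument count
    var pieces = [];
    for (var i = 0; i < bytes.length; i += chunkSize) {
        pieces.push(String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize)));
    }
    return pieces.join('');
}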
[old answer] (modified)
Yes, the File API does provide a way to convert your File, in the <input type="file"/>, to a binary string, thanks to the FileReader object and its method readAsBinaryString.
[But don't use it in production!]
document.querySelector('input').addEventListener('change', function() {
    var reader = new FileReader();
    reader.onload = function() {
        var binaryString = this.result;
        document.querySelector('#result').innerHTML = binaryString;
    }
    reader.readAsBinaryString(this.files[0]);
}, false);
<input type="file"/>
<div id="result"></div>
If you want an array buffer, then you can use the readAsArrayBuffer() method :
document.querySelector('input').addEventListener('change', function() {
    var reader = new FileReader();
    reader.onload = function() {
        var arrayBuffer = this.result;
        console.log(arrayBuffer);
        document.querySelector('#result').innerHTML = arrayBuffer + ' ' + arrayBuffer.byteLength;
    }
    reader.readAsArrayBuffer(this.files[0]);
}, false);
<input type="file"/>
<div id="result"></div>
$(document).ready(function() {
    (function(document) {
        var input = document.getElementById("files"),
            output = document.getElementById("result"),
            fileData; // We need fileData to be visible to getBuffer.

        // Event handler for file input.
        function openfile(evt) {
            var files = input.files;
            // Pass the file to the blob, not the input[0].
            fileData = new Blob([files[0]]);
            // Pass getBuffer to promise.
            var promise = new Promise(getBuffer);
            // Wait for promise to be resolved, or log error.
            promise.then(function(data) {
                // Here you can pass the bytes to another function.
                output.innerHTML = data.toString();
                console.log(data);
            }).catch(function(err) {
                console.log('Error: ', err);
            });
        }

        /*
          Create a function which will be passed to the promise
          and resolve it when FileReader has finished loading the file.
        */
        function getBuffer(resolve) {
            var reader = new FileReader();
            reader.readAsArrayBuffer(fileData);
            reader.onload = function() {
                var arrayBuffer = reader.result
                var bytes = new Uint8Array(arrayBuffer);
                resolve(bytes);
            }
        }

        // Event listener for file input.
        input.addEventListener('change', openfile, false);
    }(document));
});
<!DOCTYPE html>
<html>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
</head>
<body>
<input type="file" id="files"/>
<div id="result"></div>
</body>
</html>
Modern browsers now have the arrayBuffer method on Blobs:
document.querySelector('input').addEventListener('change', async (event) => {
    const buffer = await event.target.files[0].arrayBuffer()
    console.log(buffer)
}, false)
🎉 🎉
This is a long post, but I was tired of all these examples that weren't working for me because they used Promise objects or an errant this that has a different meaning when you are using React. My implementation used a DropZone with React, and I got the bytes using a framework similar to what is posted at the following site, when nothing else above would work: https://www.mokuji.me/article/drop-upload-tutorial-1. There were 2 keys, for me:
You have to get the bytes from the event object, using and during a FileReader's onload function.
I tried various combinations, but in the end, what worked was:
const bytes = e.target.result.split('base64,')[1];
Where e is the event. (I used const for React; you could use var in plain JavaScript.) That gave me the base64 encoded byte string.
So I'm just going to include the applicable lines for integrating this as if you were using React, because that's how I was building it, but try to also generalize this, and add comments where necessary, to make it applicable to a vanilla Javascript implementation - caveated that I did not use it like that in such a construct to test it.
These would be your bindings at the top, in your constructor, in a React framework (not relevant to a vanilla Javascript implementation):
this.uploadFile = this.uploadFile.bind(this);
this.processFile = this.processFile.bind(this);
this.errorHandler = this.errorHandler.bind(this);
this.progressHandler = this.progressHandler.bind(this);
And you'd have onDrop={this.uploadFile} in your DropZone element. If you were doing this without React, this is the equivalent of adding the onclick event handler you want to run when you click the "Upload File" button.
<button onclick="uploadFile(event);" value="Upload File" />
Then the function (applicable lines... I'll leave out my resetting my upload progress indicator, etc.):
uploadFile(event) {
    // This is for React, only
    this.setState({
        files: event,
    });
    console.log('File count: ' + this.state.files.length);

    // You might check that the "event" has a file & assign it like this
    // in vanilla Javascript:
    // var files = event.target.files;
    // if (!files && files.length > 0)
    //     files = (event.dataTransfer ? event.dataTransfer.files :
    //         event.originalEvent.dataTransfer.files);

    // You cannot use "files" as a variable in React, however:
    const in_files = this.state.files;

    // iterate, if files length > 0
    if (in_files.length > 0) {
        for (let i = 0; i < in_files.length; i++) {
            // use this, instead, for vanilla JS:
            // for (var i = 0; i < files.length; i++) {
            const a = i + 1;
            console.log('in loop, pass: ' + a);
            const f = in_files[i]; // or just files[i] in vanilla JS
            const reader = new FileReader();
            reader.onerror = this.errorHandler;
            reader.onprogress = this.progressHandler;
            reader.onload = this.processFile(f);
            reader.readAsDataURL(f);
        }
    }
}
There was this question on that syntax, for vanilla JS, on how to get that file object:
JavaScript/HTML5/jQuery Drag-And-Drop Upload - "Uncaught TypeError: Cannot read property 'files' of undefined"
Note that React's DropZone will already put the File object into this.state.files for you, as long as you add files: [], to your this.state = { .... } in your constructor. I added syntax from an answer on that post on how to get your File object. It should work, or there are other posts there that can help. But all that Q/A told me was how to get the File object, not the blob data itself. And even if I did fileData = new Blob([files[0]]); like in sebu's answer, which didn't include var with it for some reason, it didn't tell me how to read that blob's contents, or how to do it without a Promise object. So that's where the FileReader came in, though I actually tried their readAsArrayBuffer and couldn't get it to work.
You will have to have the other functions that go along with this construct - one to handle onerror, one for onprogress (both shown farther below), and then the main one, onload, that actually does the work once a method on reader is invoked in that last line. Basically you are passing your event.dataTransfer.files[0] straight into that onload function, from what I can tell.
So the onload method calls my processFile() function (applicable lines, only):
processFile(theFile) {
    return function(e) {
        const bytes = e.target.result.split('base64,')[1];
    }
}
And bytes should have the base64 bytes.
Additional functions:
errorHandler(e) {
    switch (e.target.error.code) {
        case e.target.error.NOT_FOUND_ERR:
            alert('File not found.');
            break;
        case e.target.error.NOT_READABLE_ERR:
            alert('File is not readable.');
            break;
        case e.target.error.ABORT_ERR:
            break; // no operation
        default:
            alert('An error occurred reading this file.');
            break;
    }
}

progressHandler(e) {
    if (e.lengthComputable) {
        const loaded = Math.round((e.loaded / e.total) * 100);
        let zeros = '';

        // Percent loaded in string
        if (loaded >= 0 && loaded < 10) {
            zeros = '00';
        }
        else if (loaded < 100) {
            zeros = '0';
        }

        // Display progress in 3-digits and increase bar length
        document.getElementById("progress").textContent = zeros + loaded.toString();
        document.getElementById("progressBar").style.width = loaded + '%';
    }
}
And applicable progress indicator markup:
<table id="tblProgress">
<tbody>
<tr>
<td><b><span id="progress">000</span>%</b> <span className="progressBar"><span id="progressBar" /></span></td>
</tr>
</tbody>
</table>
And CSS:
.progressBar {
    background-color: rgba(255, 255, 255, .1);
    width: 100%;
    height: 26px;
}

#progressBar {
    background-color: rgba(87, 184, 208, .5);
    content: '';
    width: 0;
    height: 26px;
}
EPILOGUE:
Inside processFile(), for some reason, I couldn't add bytes to a variable I carved out in this.state. So, instead, I set it directly to the attachments variable in my JSON object, RequestForm - the same object my this.state was using. attachments is an array, so I could push multiple files. It went like this:
const fileArray = [];

// Collect any existing attachments
if (RequestForm.state.attachments.length > 0) {
    for (let i = 0; i < RequestForm.state.attachments.length; i++) {
        fileArray.push(RequestForm.state.attachments[i]);
    }
}

// Add the new one to this.state
fileArray.push(bytes);

// Update the state
RequestForm.setState({
    attachments: fileArray,
});
Then, because this.state already contained RequestForm:
this.stores = [
RequestForm,
]
I could reference it as this.state.attachments from there on out. React feature that isn't applicable in vanilla JS. You could build a similar construct in plain JavaScript with a global variable, and push, accordingly, however, much easier:
var fileArray = new Array(); // place at the top, before any functions

// Within your processFile():
var newFileArray = [];
if (fileArray.length > 0) {
    for (var i = 0; i < fileArray.length; i++) {
        newFileArray.push(fileArray[i]);
    }
}
// Add the new one
newFileArray.push(bytes);
// Now update the global variable
fileArray = newFileArray;
Then you always just reference fileArray, enumerate it for any file byte strings, e.g. var myBytes = fileArray[0]; for the first file.
This is a simple way to convert files to Base64 and avoid "Maximum call stack size exceeded at FileReader.reader.onload" when the file is big.
document.querySelector('#fileInput').addEventListener('change', function() {
    var reader = new FileReader();
    var selectedFile = this.files[0];
    reader.onload = function() {
        var comma = this.result.indexOf(',');
        var base64 = this.result.substr(comma + 1);
        console.log(base64);
    }
    reader.readAsDataURL(selectedFile);
}, false);
<input id="fileInput" type="file" />
document.querySelector('input').addEventListener('change', function() {
    var reader = new FileReader();
    reader.onload = function() {
        var arrayBuffer = this.result,
            array = new Uint8Array(arrayBuffer),
            binaryString = String.fromCharCode.apply(null, array);
        console.log(binaryString);
        console.log(arrayBuffer);
        document.querySelector('#result').innerHTML = arrayBuffer + ' ' + arrayBuffer.byteLength;
    }
    reader.readAsArrayBuffer(this.files[0]);
}, false);
<input type="file"/>
<div id="result"></div>
Here is one answer to get the actual final byte array, using just FileReader and ArrayBuffer:
const test_function = async () => {
    ... ... ...
    const get_file_array = (file) => {
        return new Promise((resolve, reject) => {
            const reader = new FileReader();
            reader.onload = (event) => { resolve(event.target.result) };
            reader.onerror = (err) => { reject(err) };
            reader.readAsArrayBuffer(file);
        });
    }
    const temp = await get_file_array(files[0])
    console.log('here we finally have the file as an ArrayBuffer: ', temp);
    const fileb = new Uint8Array(temp)
    ... ... ...
}
where file is the File object you want to read; this has to be done in an async function...

Get docx file contents using javascript/jquery

I want to open / read a docx file using client-side technologies (HTML/JS).
I have found a JavaScript library named docx.js, but personally cannot seem to locate any documentation for it.
(http://blog.innovatejs.com/?p=184)
The goal is to make a browser-based search tool for docx and txt files.
With docxtemplater, you can easily get the full text of a Word document (works with docx only) by using the doc.getFullText() method.
HTML code:
<body>
    <button onclick="gettext()">Get document text</button>
</body>
<script src="https://cdnjs.cloudflare.com/ajax/libs/docxtemplater/3.26.2/docxtemplater.js"></script>
<script src="https://unpkg.com/pizzip@3.1.1/dist/pizzip.js"></script>
<script src="https://unpkg.com/pizzip@3.1.1/dist/pizzip-utils.js"></script>
<script>
    function loadFile(url, callback) {
        PizZipUtils.getBinaryContent(url, callback);
    }
    function gettext() {
        loadFile(
            "https://docxtemplater.com/tag-example.docx",
            function(error, content) {
                if (error) {
                    throw error;
                }
                var zip = new PizZip(content);
                var doc = new window.docxtemplater(zip);
                var text = doc.getFullText();
                console.log(text);
                alert("Text is " + text);
            }
        );
    }
</script>
I know this is an old post, but docxtemplater has moved on and the accepted answer no longer works. This worked for me:
function loadDocx(filename) {
    // Read document.xml from the docx document (a docx is a zip archive)
    const AdmZip = require("adm-zip");
    const zip = new AdmZip(filename);
    const xml = zip.readAsText("word/document.xml");

    // Load the xml DOM
    const cheerio = require('cheerio');
    const $ = cheerio.load(xml, {
        normalizeWhitespace: true,
        xmlMode: true
    })

    // Extract the text
    let out = new Array()
    $('w\\:t').each((i, el) => {
        out.push($(el).text())
    })
    return out
}
You can try docxyz.
let {Document} = require('docxyz');
let fileName = 'yourfile.docx';
let document = new Document(fileName);
let text = document.text;
console.log(text);
Note that document.text does not include tables.
let {Document} = require('docxyz');
let fileName = 'yourfile.docx';
let document = new Document(fileName);
let a = [];
for (let paragraph of document.paragraphs) {
    a.push(paragraph.text);
}
let text = a.join('\n');
console.log(text);
This solution will give you an array of strings, one element for each paragraph in the docx:
const PizZip = require("pizzip");
const { DOMParser, XMLSerializer } = require("@xmldom/xmldom");
const fs = require("fs");
const path = require("path");

function str2xml(str) {
    if (str.charCodeAt(0) === 65279) {
        // BOM sequence
        str = str.substr(1);
    }
    return new DOMParser().parseFromString(str, "text/xml");
}

function getParagraphs(content) {
    const zip = new PizZip(content);
    const xml = str2xml(zip.files["word/document.xml"].asText());
    const paragraphsXml = xml.getElementsByTagName("w:p");
    const paragraphs = [];

    for (let i = 0, len = paragraphsXml.length; i < len; i++) {
        let fullText = "";
        const textsXml = paragraphsXml[i].getElementsByTagName("w:t");
        for (let j = 0, len2 = textsXml.length; j < len2; j++) {
            const textXml = textsXml[j];
            if (textXml.childNodes) {
                fullText += textXml.childNodes[0].nodeValue;
            }
        }
        paragraphs.push(fullText);
    }
    return paragraphs;
}

// Load the docx file as binary content
const content = fs.readFileSync(
    path.resolve(__dirname, "examples/cond-image.docx"),
    "binary"
);

// Will print ['Hello John', 'how are you ?'] if the document has two paragraphs.
console.log(getParagraphs(content));
Source : https://docxtemplater.com/faq/#how-can-i-retrieve-the-docx-content-as-text
If you want to be able to display the docx files in a web browser, you might be interested in Native Documents' recently released commercial Word File Editor; try it at https://nativedocuments.com/test_drive.html
You'll get much better layout fidelity if you do it this way, than if you try to convert to (X)HTML and view it that way.
It is designed specifically for embedding in a webapp, so there is an API for loading documents, and it will sit happily within the security context of your webapp.
Disclosure: I have a commercial interest in Native Documents
