Loading csv files content into dictionary of objects


I'm trying to load the content of several CSV files into a new array. The CSV files have a typical structure: a header row with labels, followed by comma-separated values (both strings and real numbers). This part of the code is responsible for loading the data for later use with the Google Maps API (not a problem for now, since I'm stuck on just loading the data). I would like a structure in which I can look up an element by its name, which is why var nodedata = {}; is created.
What I don't understand is why part of the code seems not to be executed at all: console.log(nodedata); prints an empty object, at least in my Firefox console.
Here is my attempt; links to the CSV files are in the code.
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.js"></script>
<script src="http://jquery-csv.googlecode.com/files/jquery.csv-0.71.js"></script>
<script type="text/javascript">
var nodes = {};
var generation = {};
var nodedata = {};
$.get('https://dl.dropboxusercontent.com/u/25575808/energy/nodes.csv', function (response) {
nodes = $.csv.toObjects(response);
console.log(nodes);
});
$.get('https://dl.dropboxusercontent.com/u/25575808/energy/generation.csv', function (response) {
generation = $.csv.toObjects(response);
console.log(generation);
});
function getGeneration (nodename){
gen = 0;
for (var i = 0; i < generation.length; i++) {
if (generation[i].datetime == "2013-01-01 01:00"){
if (generation[i].node == nodename){
gen = gen + Number(generation[i]["output (MW)"])
}
}
}
return gen;
}
for (var i = 0; i < nodes.length; i++) {
nodedata[nodes[i].Node] = {
center: new google.maps.LatLng(nodes[i].Latitude,nodes[i].Longitude),
nodegen : getGeneration(nodes[i].Node)
}
}
console.log(nodedata);

I believe the problem you're having is unrelated to the CSV data; rather, it is that the data is being loaded asynchronously.
You are executing two $.get() requests to load the files, and the downloads take some time. The browser does not wait for them to finish before continuing through the rest of the code.
Therefore, it is possible for console.log(nodedata) to be executed before any data exists inside the nodes array.
An easy way to handle this is to nest your callback functions: once the first GET request completes, run the second GET request, and once that completes, run the processing code.
Check out this reorganization of the code: http://jsfiddle.net/Vr7sw/
(I removed the Google Maps line since I don't have the library loaded)
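As an alternative to nesting, the same ordering can be sketched with promises. This is only an illustration under stated assumptions: loadNodes and loadGeneration are hypothetical stand-ins for the two $.get plus $.csv.toObjects calls (they resolve with made-up rows after a delay), and buildNodeData is a hypothetical helper that stores plain lat/lng numbers instead of a google.maps.LatLng.

```javascript
// Hypothetical stand-ins for the two $.get + $.csv.toObjects calls:
// each returns a promise that resolves later with already-parsed rows.
function loadNodes() {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve([{ Node: "A", Latitude: "52.1", Longitude: "21.0" }]);
    }, 20);
  });
}

function loadGeneration() {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve([
        { node: "A", datetime: "2013-01-01 01:00", "output (MW)": "10.5" },
        { node: "A", datetime: "2013-01-01 01:00", "output (MW)": "4.5" },
        { node: "A", datetime: "2013-01-01 02:00", "output (MW)": "99" }
      ]);
    }, 10);
  });
}

// Pure processing step: only safe to call once both arrays exist.
function buildNodeData(nodes, generation, datetime) {
  var nodedata = {};
  nodes.forEach(function (n) {
    var total = 0;
    generation.forEach(function (g) {
      if (g.node === n.Node && g.datetime === datetime) {
        total += Number(g["output (MW)"]);
      }
    });
    nodedata[n.Node] = {
      lat: Number(n.Latitude),
      lng: Number(n.Longitude),
      nodegen: total
    };
  });
  return nodedata;
}

// Run the processing only after BOTH loads have finished.
Promise.all([loadNodes(), loadGeneration()]).then(function (results) {
  console.log(buildNodeData(results[0], results[1], "2013-01-01 01:00"));
});
```

Keeping the processing in a separate function that takes both arrays as arguments makes the dependency explicit: it cannot be called before the data exists.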

The problem is that the $.get requests are asynchronous (see the jQuery documentation). Try calling a function from inside your callback body, like this:
function nodesToJson(nodes) {
for (var i = 0; i < nodes.length; i++) {
nodedata[nodes[i].Node] = {
center: new google.maps.LatLng(nodes[i].Latitude,nodes[i].Longitude),
nodegen : getGeneration(nodes[i].Node)
}
}
console.log(nodedata);
}
$.get('https://dl.dropboxusercontent.com/u/25575808/energy/nodes.csv', function (response) {
nodes = $.csv.toObjects(response);
// when the request has completed, process the nodes
nodesToJson(nodes);
});

Related

PDFJS stopAtErrors does not stop execution when encountering PDF parsing errors

I am using PDFJS to get textual data from PDF files, but occasionally encountering the following error:
Error: Invalid XRef table: unexpected first object.
I would prefer that my code just skip over problem files and continue on to the next file in the list. According to PDFJS documentation, setting stopAtErrors to true for the DocumentInitParameters in PDFJS should result in rejection of getTextContent when the associated PDF data cannot be successfully parsed. I am not finding such to be the case: even after setting stopAtErrors to true, I continue to get the above error and the code seems to be "spinning" on the problem file rather than just moving on to the next in the list. It is possible that I haven't properly set stopAtErrors to true as I think I have. A snippet of my code is below to illustrate what I think I've done (code based on this example):
// set up the variables to pass to getDocument, including the pdf file's url:
var obj = {};
obj.url = 'http://www.whatever.com/thefile.pdf'; // the specific url linked to desired pdf file goes here
obj.stopAtErrors = true;
// now have PDF JS read in the file:
PDFJS.getDocument(obj).then(function(pdf) {
var pdfDocument = pdf;
var pagesPromises = [];
for (var i = 0; i < pdf.pdfInfo.numPages; i++) {
(function (pageNumber) {
pagesPromises.push(getPageText(pageNumber, pdfDocument));
}) (i+1);
}
Promise.all(pagesPromises).then(function(pagesText) {
// display text of all the pages in the console
console.log(pagesText);
});
}, function (reason) {
console.log('Error! '+reason);
});
function getPageText(pageNum, PDFDocumentInstance) {
return new Promise(function (resolve, reject) {
PDFDocumentInstance.getPage(pageNum).then(function(pdfPage) {
pdfPage.getTextContent().then(function(textContent) { // should stopAtErrors somehow be passed here to getTextContent instead of to getDocument??
var textItems = textContent.items;
var finalString = '';
for (var i = 0; i < textItems.length; i++) {
var item = textItems[i];
finalString += item.str + " ";
}
resolve(finalString);
});
});
}).catch(function(err) {
console.log('Error! '+err);
});
}
One thing I am wondering is if the stopAtErrors parameter should somehow instead be passed to getTextContent? I have not found any examples illustrating the use of stopAtErrors and the PDFJS documentation does not show a working example, either. Given that I am still at the stage of needing examples to get PDFJS to function, I am at a loss as to how to make PDFJS stop trying to parse a problem PDF file and just move on to the next one.

Looping alternative in Javascript/jQuery with AJAX

I'm worried about the performance of the following piece of code. I'm not sure it's a good idea to call $.ajax in a loop like that. Is there a more efficient way to send an array of items through jQuery AJAX?
What this code is supposed to do:
This code takes a bunch of URLs from a text area. If the URLs are separated by line breaks, each URL becomes part of the urls_ary array. Otherwise, if there is no line break and the text area value is a URL, the value is stored in single_url.
Now, I need to send these URLs (or the single URL) to my server-side script (PHP), which processes those links. However, if urls_ary is the one being sent, I'd need to send each URL individually, which means running the $.ajax call inside a for loop, and I think that is inefficient.
var char_start = 10;
var index = 0;
var urls = $('textarea.remote-area');
var val_ary = [];
var urls_ary = [];
var single_url = '';
urls.keyup(function(){
if (urls.val().length >= char_start)
{
var has_lbrs = /\r|\n/i.test(urls.val());
if (has_lbrs) {
val_ary = urls.val().split('\n');
for (var i = 0; i < val_ary.length; i++)
{
if (!validate_url(val_ary[i]))
{
continue;
}
urls_ary[i] = val_ary[i];
}
}
else
{
if (validate_url(urls.val()))
{
single_url = urls.val();
}
}
if (urls_ary.length > 0)
{
for (var i = 0; i < urls_ary.length; i++)
{
$.ajax({
// do AJAX here.
});
}
}
else
{
$.ajax({
// do AJAX here.
});
}
}
});
function validate_url(url)
{
if(/^([a-z]([a-z]|\d|\+|-|\.)*):(\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?((\[(|(v[\da-f]{1,}\.(([a-z]|\d|-|\.|_|~)|[!\$&'\(\)\*\+,;=]|:)+))\])|((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=])*)(:\d*)?)(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*|(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)|((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)|((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)){0})(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/i.test(url)){
return true;
}
return false;
}

Doing the $.ajax calls in a loop isn't the inefficient part. The AJAX requests will queue up, waiting for an available connection (browsers allow only a limited number of simultaneous connections per host). What's inefficient is the fact that you're doing multiple AJAX calls. Ideally, you could add the ability on the server to process multiple URLs at a time, then post an array of URLs in your client code instead of doing multiple requests.
So basically, the only way to be more efficient is to change the server-side code; after that, rewriting the client code should be straightforward.
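A minimal sketch of the client side of that idea, assuming the server has been changed to accept a JSON body with a urls array. The endpoint /process-urls, the helper collectUrls and the simplified validator looksLikeUrl are all hypothetical names for illustration; the real validate_url from the question could be passed in instead.

```javascript
// Split the textarea text into lines and keep only those that pass the validator.
function collectUrls(text, isValid) {
  return text
    .split(/\r?\n/)
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line !== "" && isValid(line); });
}

// Deliberately simplistic validator, for illustration only.
function looksLikeUrl(s) {
  return /^https?:\/\/\S+$/i.test(s);
}

var urls = collectUrls(
  "http://a.example/x\nnot a url\n\nhttps://b.example/y",
  looksLikeUrl
);
var payload = JSON.stringify({ urls: urls });
console.log(payload);

// In the browser, a single request then carries every URL.
// The endpoint is an assumption; the call is skipped when jQuery is not loaded.
if (typeof $ !== "undefined" && $.ajax) {
  $.ajax({
    url: "/process-urls",
    type: "POST",
    contentType: "application/json",
    data: payload
  });
}
```

Because a single line is just an array of length one, this shape also removes the need for a separate single_url branch.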

How to use JavaScript to read files into an array in order

I have a bunch of text files on server side with file names 0.txt, 1.txt, 2.txt, 3.txt and so forth. I want to read the content of all files and store them in an array A, such that A[0] has 0.txt's content, A[1] has 1.txt's, ...
How can I do it in Javascript / jquery?
Originally, I used $.ajax({}) in jQuery to load those text files. But it didn't work, because of the asynchronous nature of ajax. I tried setting $.ajax({... async: false ...}), but it was very slow: I have ~1000 10KB files to read in total.

From your question, it sounds like you want to load text files from the server into a local array:
var done = 0, resultArr = [], numberOfFiles = 1000;
function getHandler(idx) {
return function(data) {
resultArr[idx] = data;
done++;
if (done === numberOfFiles) {
// tell your other part all files are loaded
}
}
}
for (var i = 0; i < numberOfFiles; i++) {
$.ajax(i + ".txt").done(getHandler(i));
}
jsFiddle: http://jsfiddle.net/LtQYF/1/

What you're looking for is File API introduced in HTML5 (working draft).
The examples in this article will point you in the right direction. Remember that the end user will have to initiate the action and manually select the files - otherwise it would have been a terrible idea privacy- and security-wise.
Update:
I found (yet again) the mozilla docos to be more readable! Quick html mockup:
<input type="file" id="files" name="files[]" onchange="loadTextFile();" multiple/>
<button id="test" onclick="test();">What have we read?</button>
...and the JavaScript:
var testArray = []; //your array
function loadTextFile() {
//this would be tidier with jQuery, but whatever
var _filesContainer = document.getElementById("files");
//check how many files have been selected and iterate over them
var _filesCount = _filesContainer.files.length;
for (var i = 0; i < _filesCount; i++) {
//create new FileReader instance; I have to read more into it
//but I was unable to just recycle one
var oFReader = new FileReader();
//when the file has been "read" by the FileReader locally
//log its contents and push them into an array
oFReader.onload = function(oFREvent) {
console.log(oFREvent.target.result);
testArray.push(oFREvent.target.result);
};
//actually initiate the read
oFReader.readAsText(_filesContainer.files[i]);
}
}
//sanity check
function test() {
for (var i = 0; i < testArray.length; i++) {
console.warn(testArray[i]);
}
}
Fiddled

You don't give much information to give a specific answer. However, it is my opinion that "it doesn't work because of the asynchronous nature of ajax" is not correct. You should be able to allocate an array of the correct size and use a callback for each file. You might try other options such as bundling the files on the server and unbundling them on the client, etc. The designs, that address the problem well, depend on specifics that you have not provided.
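A small sketch of the "array of the correct size plus one callback per file" idea, showing that the result order does not depend on the order in which responses arrive. The helper makeCollector is a hypothetical name, and the responses are simulated by calling the handlers by hand rather than by real ajax requests.

```javascript
// Build per-index handlers that write into a pre-sized array and
// report once every slot has been filled.
function makeCollector(count, onAllDone) {
  var results = new Array(count);
  var done = 0;
  return function handlerFor(idx) {
    return function (data) {
      results[idx] = data; // slot is fixed by the index, not by arrival time
      done += 1;
      if (done === count) {
        onAllDone(results);
      }
    };
  };
}

var finalResult = null;
var handlerFor = makeCollector(3, function (all) {
  finalResult = all;
});

var h0 = handlerFor(0);
var h1 = handlerFor(1);
var h2 = handlerFor(2);

// Simulate the responses arriving out of order.
h2("contents of 2.txt");
h0("contents of 0.txt");
h1("contents of 1.txt");

console.log(finalResult);
```

With real requests, each handler would be passed as the success callback of the request for the file with the matching index.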

Why is this javascript object property undefined?

I am using an approach described in detail at Dictionary Lookups in Javascript (see the section "A Client-Side Solution") to create an object that contains a property for each word in the scrabble dictionary.
var dict = {};
//ajax call to read dictionary.txt file
$.get("dictionary.txt", parseResults);
function parseResults(txt) {
var words = txt.split( "\n");
for (var i=0; i < words.length; i++){
dict[ words[i] ] = true;
}
console.log(dict.AAH);
console.log(dict);
if (dict.AAH == true) {
console.log('dict.AAH is true!');
}
}
(updated code to use an earlier answer from Phil)
I can't figure out why dict.AAH is returning undefined, but the dict object looks fine in the console. Screenshots from Firebug below.
Console:
Drilled down into "Object { }"
How can I check a given word ("AAH", in this case) and have it return true if it is a property in the dict object defined as true?
Live example
Code on Github

The problem isn't your code. You have invisible characters in your words, which you fail to clean up.
You can verify this by using this as your results parser
function parseResults(txt) {
// clean the words when we split the txt
var words = txt.split("\n")
.map($.trim)
.splice(0,3); // Keep only 3 first ones
if(btoa(words[2]) !== btoa('AAH')){ // Compare in Base64
console.log('YOU HAVE HIDDEN CHARS!');
}
}
And you can fix it by whitelisting your characters.
function parseResults(txt) {
// clean the words when we split the txt
var words = txt.split("\n").map(function(el){
return (el.match(/[a-zA-Z0-9]/g) || []).join(''); // match() returns null for lines with no allowed characters
});
for (var i=0; i < words.length; i++){
dict[ words[i] ] = true;
}
console.log(dict.AAH);
console.log(dict);
if (dict.AAH == true) {
console.log('dict.AAH is true!');
}
}
I would recommend cleaning it up on the server side since running regex on every element in an array as large as seen in your live site might cause performance issues.
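To illustrate how an invisible character can hide a key, here is a sketch that assumes the file uses Windows line endings (\r\n). That is only one plausible source of hidden characters, and the sample text is made up. Splitting on "\n" alone leaves a trailing "\r" on every word.

```javascript
// Assumed sample: three words separated by Windows line endings.
var txt = "AA\r\nAAH\r\nAAHED\r\n";

// Naive split: every key keeps its trailing "\r".
var naive = {};
txt.split("\n").forEach(function (w) {
  naive[w] = true;
});

// Split on either line ending, trim, and skip empty lines.
var cleaned = {};
txt.split(/\r?\n/).forEach(function (w) {
  w = w.trim();
  if (w) {
    cleaned[w] = true;
  }
});

console.log(naive.AAH);      // undefined: the stored key is actually "AAH\r"
console.log(naive["AAH\r"]); // true
console.log(cleaned.AAH);    // true
```

In a console the two keys look identical, which matches the symptom of an object that "looks fine" while the lookup still returns undefined.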

It's probably a race condition. You're loading the dictionary in a GET and then immediately (while the request is being made) those console.log commands are being called (and the one comes back undefined). Then the data is actually loaded by the time you debug. Everything should be done in a callback or deferred. It's an understandable quirk of debuggers that's caught me up before.

Get ajax requests are asynchronous. This means that while the whole operation that occurs in the ajax request is going, javascript keeps reading the next lines.
The problem then is you are logging values that the ajax request did not manage to retrieve early enough.
To get around the issue you can include the log calls inside your ajax request callback as below
var dict = {};
//ajax call to read dictionary.txt file
$.get("dictionary.txt", function( txt ){
var words = txt.split( "\n");
for (var i=0; i < words.length; i++){
dict[ words[i] ] = true;
}
//Now inside these console.log will run once you DO have the data
console.log(dict.AAH);
console.log(dict);
});
//Stuff out here will run whether or not asynchronous request has finished
I would recommend using jQuery's $.when method even more for this type of scenario.
Here is what I think would be most suitable for complex projects:
var dict = {};
//ajax call to read dictionary.txt file
function getDictionary(){
return $.ajax("dictionary.txt");
}
/*I recommend this technique because this will allow you to easily extend your
code to wait for more than one ajax request in the future. You can stack
as many asynchronous operations as you want inside the when statement*/
$.when(getDictionary()).then(function(txt){//Added txt here...forgot callback param before
var words = txt.split( "\n");
for (var i=0; i < words.length; i++){
dict[ words[i] ] = true;
}
//Now inside these console.log will run once you DO have the data
console.log(dict.AAH);
console.log(dict);
});

You're trying to output dict before it has been populated by the $.get success handler.
Try this:
// If the browser doesn't have String.trim() available, add it...
if (!String.prototype.trim) {
String.prototype.trim=function(){return this.replace(/^\s\s*/, '').replace(/\s\s*$/, '');};
String.prototype.ltrim=function(){return this.replace(/^\s+/,'');};
String.prototype.rtrim=function(){return this.replace(/\s+$/,'');};
String.prototype.fulltrim=function(){return this.replace(/(?:(?:^|\n)\s+|\s+(?:$|\n))/g,'').replace(/\s+/g,' ');};
}
/**
* Parses the response returned by the AJAX call
*
* Response parsing logic must be executed only after the
* response has been received. To do so, we have to encapsulate
* it in a function and use it as a onSuccess callback when we
* place our AJAX call.
**/
function parseResults(txt) {
// clean the words when we split the txt
var words = txt.split("\n").map($.trim);
for (var i=0; i < words.length; i++){
dict[ words[i] ] = true;
}
console.log(dict.AAH);
console.log(dict);
if (dict.AAH == true) {
console.log('dict.AAH is true!');
}
}
// global object containing retrieved words.
var dict = {};
//ajax call to read dictionary.txt file
$.get("dictionary.txt", parseResults);
As another user commented, jQuery's $.when lets you chain such code.
By the way, if all you want to do is know if a word is in the results you can do:
function parseResults(txt) {
// clean the words when we split the txt
var words = txt.split("\n").map($.trim);
if ($.inArray('AAH', words) !== -1) { // $.inArray returns the index, or -1 when the value is absent
console.log('AAH is in the result set');
}
}

I think the problem lies in the fact that you have dict defined as an object but use it as an array.
Replace var dict = {} by var dict = new Array() and your code should work (tried with your live example on Google Chrome).

Returning a value from a jQuery Ajax method

I'm trying to use Javascript in an OO style, and one method needs to make a remote call to get some data so a webpage can work with it. I've created a Javascript class to encapsulate the data retrieval so I can re-use the logic elsewhere, like so:
AddressRetriever = function() {
AddressRetriever.prototype.find = function(zip) {
var addressList = [];
$.ajax({
/* setup stuff */
success: function(response) {
var data = $.parseJSON(response.value);
for (var i = 0; i < data.length; i++) {
var city = data[i].City; // "City" column of DataTable
var state = data[i].State; // "State" column of DataTable
var address = new PostalAddress(zip, city, state); // This is a custom JavaScript class with simple getters, a DTO basically.
addressList.push(address);
}
}
});
return addressList;
}
}
The webpage itself calls this like follows:
$('#txtZip').blur(function() {
var retriever = new AddressRetriever();
var addresses = retriever.find($(this).val());
if (addresses.length > 0) {
$('#txtCity').val(addresses[0].getCity());
$('#txtState').val(addresses[0].getState());
}
});
The problem is that sometimes addresses is inexplicably empty (i.e. length = 0). In Firebug the XHR tab shows a response coming back with the expected data, and if I set an alert inside of the success method the length is correct, but outside of that method when I try to return the value, it's sometimes (but not always) empty and my textbox doesn't get populated. Sometimes it shows up as empty but the textbox gets populated properly anyways.
I know I could do this by getting rid of the separate class and stuffing the whole ajax call into the event handler, but I'm looking for a way to do this correctly so the function can be reused if necessary. Any thoughts?

In a nutshell, you can't do it the way you're trying to do it with asynchronous ajax calls.
Ajax methods usually run asynchronously. Therefore, when the ajax function call itself returns (where you have return addressList in your code), the actual ajax networking has not yet completed and the results are not yet known.
Instead, you need to rework how the flow of your code works and deal with the results of the ajax call ONLY in the success handler or in functions you call from the success handler. Only when the success handler is called has the ajax networking completed and provided a result.
In short, you can't write normal procedural code when using asynchronous ajax calls. You have to change the way your code is structured and flows. It does complicate things, but the user experience benefits of using asynchronous ajax calls are huge (the browser doesn't lock up during a networking operation).
Here's how you could restructure your code while still keeping the AddressRetriever.find() method fairly generic using a callback function:
AddressRetriever = function() {
AddressRetriever.prototype.find = function(zip, callback) {
$.ajax({
/* setup stuff */
success: function(response) {
var addressList = [];
var data = $.parseJSON(response.value);
for (var i = 0; i < data.length; i++) {
var city = data[i].City; // "City" column of DataTable
var state = data[i].State; // "State" column of DataTable
var address = new PostalAddress(zip, city, state); // This is a custom JavaScript class with simple getters, a DTO basically.
addressList.push(address);
}
callback(addressList);
}
});
}
}
$('#txtZip').blur(function() {
var retriever = new AddressRetriever();
retriever.find($(this).val(), function(addresses) {
if (addresses.length > 0) {
$('#txtCity').val(addresses[0].getCity());
$('#txtState').val(addresses[0].getState());
}
});
});

AddressRetriever = function() {
AddressRetriever.prototype.find = function(zip) {
var addressList = [];
$.ajax({
/* setup stuff */
success: function(response) {
var data = $.parseJSON(response.value);
for (var i = 0; i < data.length; i++) {
var city = data[i].City; // "City" column of DataTable
var state = data[i].State; // "State" column of DataTable
var address = new PostalAddress(zip, city, state); // This is a custom JavaScript class with simple getters, a DTO basically.
addressList.push(address);
}
// process the list once, after every address has been added
processAddresses(addressList);
}
});
}
}
function processAddresses(addresses){
if (addresses.length > 0) {
$('#txtCity').val(addresses[0].getCity());
$('#txtState').val(addresses[0].getState());
}
}
Or, if you don't want to make another function call, make the ajax call synchronous. Right now, the function returns the array before the data is pushed into it.

Not inexplicable at all, the list won't be filled until an indeterminate amount of time in the future.
The canonical approach is to do the work in your success handler, perhaps by passing in your own callback. You may also use jQuery's .when.

AJAX calls are asynchronous, which means they don't run with the regular flow of the program. When you execute
if (addresses.length > 0) {
addresses is, in fact, empty, as the program did not wait for the AJAX call to complete.
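The timing can be reproduced without a network. In this sketch, fakeAjax is a hypothetical stand-in for $.ajax that delivers made-up rows on a later tick via setTimeout; findBroken mirrors the shape of the original find, and findWithCallback mirrors the callback-based fix.

```javascript
// Hypothetical stand-in for $.ajax: delivers its result on a later tick.
function fakeAjax(zip, success) {
  setTimeout(function () {
    success([{ City: "Springfield", State: "IL" }]);
  }, 10);
}

// Broken shape: returns before the success handler has run.
function findBroken(zip) {
  var list = [];
  fakeAjax(zip, function (rows) {
    rows.forEach(function (r) {
      list.push(r);
    });
  });
  return list;
}

// Working shape: hand the result to a callback once it exists.
function findWithCallback(zip, callback) {
  fakeAjax(zip, callback);
}

var immediate = findBroken("62701");
console.log(immediate.length); // 0 at this point: nothing has arrived yet

findWithCallback("62701", function (rows) {
  console.log(rows.length); // 1, logged once the simulated response arrives
});
```

Note that the array returned by findBroken is filled in later, since the handler still holds a reference to it. That would explain why the original code sometimes appeared to work when inspected after the response had arrived.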
