d3 reading csv/tsv files if column names are numbers? - javascript

I'm doing a project using data from the World Bank. Their CSV files use individual years as column names. Short of renaming the columns, how would I access them using d3.csv and map them into a custom array, since I won't use all the values?
For example, the file I'm using is GDP per country. Each row is formatted like so:
"Country Name","Country Code","Indicator Name","Indicator Code", "1960","1961","1962","1963","1964","1965","1966","1967","1968","1969","1970","1971","1972","1973","1974","1975","1976","1977","1978","1979","1980","1981","1982","1983","1984","1985","1986","1987","1988","1989","1990","1991","1992","1993","1994","1995","1996","1997","1998","1999","2000","2001","2002","2003","2004","2005","2006","2007","2008","2009","2010","2011","2012","2013",
If I wanted the GDP values for countries like the United Kingdom, Brazil, China, Russia, the US, etc., for the years 2004-2012, how would I go about it?
Would the code look something like this?
d3.csv(URL, function(d) {
  return {
    AttributeName: d.ColumnName
    // Continues for all columns I need
  };
});
Again, d.ColumnName won't work if the column name is an actual integer. How would I also account for spaces in the column names, as shown above?
How would I also go about displaying the elements properly, say in the document itself or in the console?
I apologize for so many questions. Feel free to direct me to solutions. Thanks.
Thanks to Lars for the answer. I'm going to append the next step of what I want to do. If I now wanted to make a line graph using this data, how would I access the resulting array of objects? Again, I only want specific countries out of the 200+ element array.

You can solve all of these issues with bracket notation, the alternative syntax for accessing properties. Instead of
d.foo
you can write
d["foo"]
This also works with spaces, numbers, etc:
d["11111"]
d["string with spaces"]
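Applied to the World Bank file from the question, bracket notation could be used in a row conversion function like the sketch below. The names pickGdp, selectCountries, WANTED_COUNTRIES and YEARS are made up for illustration, and the country list is only an example. The usage comment assumes the callback-style d3.csv shown in the question.

```javascript
// Countries and years of interest; both lists are illustrative assumptions.
const WANTED_COUNTRIES = ["United Kingdom", "Brazil", "China"];
const YEARS = ["2004", "2005", "2006", "2007", "2008", "2009", "2010", "2011", "2012"];

// Row conversion function: keeps the country name and one numeric value per year.
function pickGdp(d) {
  const row = { country: d["Country Name"] }; // column name containing a space
  for (const year of YEARS) {
    // Numeric column name; unary + converts the CSV string to a number.
    // Note that an empty cell converts to 0, so treat missing values separately if that matters.
    row[year] = +d[year];
  }
  return row;
}

// Keeps only the rows for the wanted countries.
function selectCountries(rows) {
  return rows.filter((r) => WANTED_COUNTRIES.includes(r.country));
}

// Hypothetical usage with the callback-style d3.csv from the question:
// d3.csv(URL, pickGdp, function (error, rows) {
//   console.log(selectCountries(rows));
// });
```

The filtered array is what a line chart would then consume, for example one line per element with the year keys as x values.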

Related

d3/js data manipulation dropping columns

I have a dataset in csv and it looks like below.
country,col1,col2,col3
Germany,19979188,11233906,43.7719591
UK,3839766,1884423,50.92349378
France,1363608,796271,41.60557873
Italy,957516,557967,41.72765781
I'd like to drop col1 and col2 while keeping country and col3. If possible, I'd like to wrap this in a function to which I can pass the list of columns to drop or keep.
Using pandas, which I'm familiar with, I can easily do it, e.g. data.drop(['col1', 'col2'], axis = 1). But the d3 way, or the JS way in general, seems to work row by row, so I couldn't come up with a way to drop columns.
I was thinking of using d3.map() to take only the desired columns, but I got stuck building a general function that accepts the column list.
Does anyone have any thoughts?
D3 fetch methods, like d3.csv, retrieve the whole CSV and create an array of objects based on it. Because of that, filtering out some columns after loading saves no download time. Actually, it's worse than useless: you'll spend time and resources on an unnecessary operation.
Therefore, the only solution that makes loading faster, if you own that CSV and can edit it, is to create a new CSV without those columns. That way you'll have a smaller file that is faster to load. Otherwise, if you cannot change the CSV itself, don't bother: just load the whole thing and use the columns you want (which will be properties of the objects), ignoring the others.
Finally, if you do a lot of data manipulation, it might be worth reducing the size of the objects in the data array. If that's your case, use a row function to return only the properties you want. For instance:
d3.csv(url, function (d) {
  return {country: d.country, col3: d.col3}
}).then(etc...)
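To get the general function the question asks for, the row function can be generated from a column list. This is a minimal sketch; keepColumns and dropColumns are hypothetical helper names, not part of D3.

```javascript
// Returns a row function that keeps only the listed columns.
function keepColumns(columns) {
  return function (d) {
    const row = {};
    for (const name of columns) {
      row[name] = d[name];
    }
    return row;
  };
}

// Returns a row function that drops the listed columns and keeps the rest.
function dropColumns(columns) {
  return function (d) {
    const row = {};
    for (const name of Object.keys(d)) {
      if (!columns.includes(name)) row[name] = d[name];
    }
    return row;
  };
}

// Hypothetical usage, passing the generated function as the d3.csv row function:
// d3.csv(url, dropColumns(["col1", "col2"])).then(...)
```

Both helpers leave the values as strings, exactly as d3.csv parsed them, so any numeric conversion would still have to be done separately.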

Excel, is it possible to read from a named range similar to a java array? List[0] for example

I am struggling to find anything on the internet related to this one.
You can easily name a range in Excel, and it's treated as an array.
An example would be the average formula. You can feed a list from a named range into the formula.
Named Range "list" contains the values:
1,2,3,4,5,6,7,8,9,10
Named Range "list2" contains the values:
2,3,4,5,6,7,8,9,10,11
Currently in Excel, this is possible:
=Average(Number1,[Number2], ...)
=Average(list,list2)
Would it be possible to do this?
=Average(list[1],list[6])
I want this to make my formulas simpler.
I have a large list of people and instead of doing
B1!H4, B1!h5
I would love to do N[1], N[2], N[3]
Thank you all!
You can use INDEX():
=AVERAGE(INDEX(list,1),INDEX(list,6))
Where list is a named range.

JSON in localforage, efficient way to update property values (Binarysearch, etc.)?

I would like to come straight to the point and show you my sample data, which averages around 180,000 lines from a .csv file, so a lot of lines. I am reading in the .csv with papaparse. Then I am saving the data as an array of objects, which looks like this:
I just used this picture so that you can also see all the properties my objects have or should have. The data is from Media Transparency Data, which is open source and shows the payments between institutions.
The array of objects is saved using localforage, which is basically IndexedDB or WebSQL with a localStorage-like API. So I never save the data on a server, only in the client!
The Question:
My question: the user can add the sourceHash and/or targetHash attributes in a client interface. For example, assume the user loaded the "Energie Steiermark Kunden GmbH" object and now adds the sourceHash "company" to it, which is basically a tag. This is already reflected and shown in the client, but I also need to get it into localforage, which means rewriting the initial array of objects. So I would need to search my huge 180,000-line array for every object with the name "Energie Steiermark Kunden GmbH" (there can be several), set its sourceHash property to "company", and then save the array to localforage again.
The first question is how to do this most efficiently. I can get the data out of localforage and set it again using the following methods.
Get:
localforage.getItem('data').then((value) => {
...
});
Set:
localforage.setItem('data', dataObject);
However, the question is how to do this most efficiently. If the sourceNode starts with "E", for example, we don't need to search all sourceNodes. The same goes for the targetNode, of course.
Thank you in advance!
UPDATE:
Thanks for the answers already! How would you do this most efficiently in JavaScript? Is it possible to do it in a few lines? Assume, for example, that I have the current sourceHash "company" and want to assign it to every node starting with "Energie Steiermark Kunden GmbH" across all timeNodes. These could be 20151, 20152, 20153, 20154 and so on...
Localforage is only a localStorage/sessionStorage-like wrapper over the actual storage engine, and so it only offers you the key-value capabilities of localStorage. In short, there's no more efficient way to do this for arbitrary queries.
This sounds more like a case for using IndexedDB directly, as you can define indexes over the data, for instance on sourceNode, and query more efficiently that way.
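If you stay with localforage, the update itself is a single linear pass over the array, which only takes a few lines. The sketch below assumes the property names sourceNode and sourceHash from the question. The helpers tagSourceNodes and tagAndSave are made-up names, and the store parameter stands for localforage or anything else with promise-returning getItem/setItem methods.

```javascript
// Sets sourceHash on every object whose sourceNode equals `name`.
// Mutates the array in place and returns the number of objects changed.
function tagSourceNodes(data, name, tag) {
  let changed = 0;
  for (const item of data) {
    if (item.sourceNode === name) {
      item.sourceHash = tag;
      changed += 1;
    }
  }
  return changed;
}

// Hypothetical round trip: load the whole array, tag it, write it back.
// `store` is assumed to expose localforage-style getItem/setItem.
async function tagAndSave(store, name, tag) {
  const data = await store.getItem("data");
  tagSourceNodes(data, name, tag);
  await store.setItem("data", data);
}

// Hypothetical usage:
// tagAndSave(localforage, "Energie Steiermark Kunden GmbH", "company");
```

The pass touches every element once, so it does not avoid the full scan. It only keeps the read-modify-write cycle short.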

Multiple Category Options per Marker Using GeoJSON in Leaflet Map

I'm new to Leaflet and GeoJSON, and haven't been able to find an example of what I want. The map I'm making has multiple markers for various services in the area. I need to add a demographic category to filter by youth, adult, seniors, men, women, etc. Some markers need to be able to have more than one demographic (such as adult and men). The closest example I could find was adding a property for each type of demographic with a true/false option for each marker. Is this the way I should go about it? They may want to add other categories (besides demographic) in the future, which wouldn't really be organized if every option is set as a property. Please help point me in the right direction!
You should first decide how you want to structure your data and this, at first, has nothing to do with Leaflet (which comes later for visualization-mapping).
You can work with either GeoJSON or simple JSON.
For example:
myJSON = [
  {
    lat: 10,
    lon: 10,
    demographic: ['youth']
  },
  {
    lat: 6,
    lon: 10,
    demographic: ['adults', 'men']
  },
  {
    lat: 10,
    lon: 12,
    demographic: ['adults']
  },
  {
    lat: 7,
    lon: 8,
    demographic: ['adults', 'seniors', 'women']
  }
]
Then, you just have to think of the logic you want to translate into code.
For example, you can have some checkboxes for the user to filter the category. Each time the user checks/unchecks, a loop goes through the array and inserts as markers ONLY the objects that match criteria.
This is an example:
https://jsfiddle.net/xf4fwwme/44/
The code is a bit messy but what I would like to show you is that there can be different approaches and not just one direction. You can achieve the same result with cleaner code, GeoJSON instead of JSON etc...
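The filtering logic described above can be sketched in a few lines, independently of Leaflet. The function name filterByDemographic is made up, and matching on any selected category (rather than all of them) is an assumption. The commented Leaflet step further assumes an existing layer group called layer.

```javascript
// Returns the points that carry at least one of the selected demographics.
function filterByDemographic(points, selected) {
  return points.filter((p) => p.demographic.some((d) => selected.includes(d)));
}

const points = [
  { lat: 10, lon: 10, demographic: ["youth"] },
  { lat: 6, lon: 10, demographic: ["adults", "men"] },
  { lat: 10, lon: 12, demographic: ["adults"] },
  { lat: 7, lon: 8, demographic: ["adults", "seniors", "women"] },
];

// Categories the user has checked (illustrative values).
const visible = filterByDemographic(points, ["men", "women"]);
console.log(visible.length); // 2

// Hypothetical Leaflet step, run whenever a checkbox changes:
// layer.clearLayers();
// visible.forEach((p) => L.marker([p.lat, p.lon]).addTo(layer));
```

Because each category type is just an array property, another category added later would only need a second array and a second filter of the same shape.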
Hope I helped.

Filtering with search-index

I want to implement a full-text-search for *.epub-Files. Therefore I forked the epub-full-text-search module (https://github.com/friedolinfoerder/epub-full-text-search).
I will have many ebooks to search through, so I want a way to search within one specific ebook at a time.
How could I do this with search-index? I coded a solution that searches in the fields filename (the unique filename of the epub) and body (the content of the chapters), but this doesn't feel like the right way to do it, and the performance is also not ideal.
Here is an example of how I do the search with search-index:
searchIndex.search({
  query: [{
    AND: [
      {body: ['epub']},
      {filename: ['accessible_epub_3']}
    ]
  }]
});
Is there a better way to do this, maybe with buckets, categories and filters?
Thanks for your help!
Search-index, which epub-full-text-search is based on, gives one search result back for each document/item that has a match for any given query. My guess is that you would like to know where in the epub-file you get a hit. If a certain paragraph is a good enough search result item, I would index paragraphs. Each paragraph would have a unique book-key as a filter, and maybe a reference to where it is in the epub-file (page/percentage/etc).
Disclaimer: I'm working on the search-index project.
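As a rough illustration of the paragraph-level indexing suggested above, the documents could be prepared like this before being handed to the indexer. The sketch deliberately uses no search-index calls. The function name toParagraphDocs, the field names id and paragraph, and the blank-line paragraph separator are all assumptions.

```javascript
// Splits one chapter into paragraph-level documents, one per non-empty paragraph.
// Each document keeps the book's filename so results can be restricted to one book.
function toParagraphDocs(filename, chapterText) {
  return chapterText
    .split(/\n\s*\n/) // assumption: blank lines separate paragraphs
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map((body, i) => ({
      id: filename + "#" + i, // hypothetical unique key per paragraph
      filename: filename,     // book key usable as a filter
      paragraph: i,           // position within the chapter
      body: body,
    }));
}

const docs = toParagraphDocs("accessible_epub_3", "First paragraph.\n\nSecond one about epub.\n");
console.log(docs.length); // 2
```

Each hit then points at one paragraph of one book instead of at a whole file.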
