Parsing html with cheerio - javascript

I used cheerio for the first time today.
This is a simplified version of the HTML source I want to parse:
<div id="country-table">
<!-- div duplicate cause style -->
<div>
<div>
<table>
<tbody>
<tr>
<td>1</td>
<td>USA</td>
<td>1.6</td>
<td>75.8</td>
<td>132,000</td>
</tr>
<tr>
<td>2</td>
<td>INDIA</td>
<td>12123</td>
<td>1322</td>
<td>123213</td>
</tr>
<tr>
<td>3</td>
<td>BRAZIL</td>
<td>3123</td>
<td>213123</td>
<td>134</td>
</tr>
<tr>
<!-- and more... -->
</tbody>
</table>
</div>
</div>
</div>
And this is what I tried:
const axios = require("axios").default;
const cheerio = require("cheerio").default;
axios.get("https://coronaboard.kr").then((html) => {
const arr = [];
const $ = cheerio.load(html.data, { xml: true, xmlMode: true });
const data = $("#country-table>div>div>table>tbody").each((index, item) => {
arr.push(item);
});
console.log(arr);
});
I want to collect the td values of each tr into one object,
e.g. {number: x, name: "USA", confirmed: x, and more...}
If anyone knows how to do it, please answer me!

If you want to extract the data from the table, this will help. The comments explain how it works.
var $ = cheerio.load(html.data);
// targets the specific table with a selector
var html_table = $('#country-table>div>div>table');
// gets table cell values; loops through all tr rows
var table_data = html_table.find('tr').map(function() {
// gets the cells value for the row; loops through each cell and returns an array of values
var cells = $(this).find('td').map(function() {return $(this).text().trim();}).toArray();
// returns an array of the cell data collected
return [cells];
}).toArray();
// output table data
console.log('table_data', table_data);
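The question asks for one object per row rather than an array of cell values. A minimal sketch of that last step, assuming table_data has the shape produced above (an array of string arrays). The helper name rowsToObjects and the key names are hypothetical, chosen for illustration, since the question only names the first three columns:

```javascript
// Hypothetical column names; the question only names the first three.
const keys = ["number", "name", "confirmed", "col4", "col5"];

// Turns [["1", "USA", ...], ...] into [{ number: "1", name: "USA", ... }, ...].
function rowsToObjects(rows, keys) {
  return rows
    .filter((cells) => cells.length > 0) // skip rows without td cells
    .map((cells) =>
      Object.fromEntries(keys.map((key, i) => [key, cells[i]]))
    );
}

// Sample data copied from the question's table, plus one empty row.
const sample = [
  ["1", "USA", "1.6", "75.8", "132,000"],
  ["2", "INDIA", "12123", "1322", "123213"],
  [],
];
console.log(rowsToObjects(sample, keys));
```

All values stay strings here; converting them to numbers would be a separate step, because some cells contain thousands separators.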

Related

Automatically add rows to a table when new data comes from a WebSocket with JavaScript

I am new to JavaScript, so I am not sure if this is a very basic question. I am trying to create a Bitcoin price dashboard using data fetched from an external WebSocket. I managed to get the data from the WebSocket. The price updates every second, but I am not sure how to push the row data into an HTML table dynamically. I tried to iterate over the array, but I still could not make progress.
I have provided the code snippets below, as well as the external WebSocket from which I am pulling the data.
Please let me know how I should insert rows dynamically into an HTML table. Thank you so much in advance.
<body>
<table>
<thead>
<tr>
<th scope="col">Price</th>
</tr>
</thead>
<tbody id="pricetable">
</tbody>
</table>
<script>
var binanceSocket = new WebSocket("wss://stream.binance.com:9443/ws/btcusdt@trade");
binanceSocket.onmessage = function (event) {
var messageObject = JSON.parse(event.data)
console.log(messageObject.p);
var table = document.getElementById('pricetable')
}
</script>
</body>
Assuming your HTML table is set up as below, create a new row for each message: format the price and assign it to the price cell's textContent before appending the row to the table body.
function insertRow(price){
var tr = document.createElement("tr"),
tdCoin = document.createElement("td"),
tdPrice = document.createElement("td"),
docFrag = new DocumentFragment();
tdCoin.textContent = "BTC";
tdPrice.textContent = `${Number(price.slice(0,-6)).toLocaleString("en-US",{style: 'currency', currency: 'USD'})}`;
tr.appendChild(tdCoin);
tr.appendChild(tdPrice);
docFrag.appendChild(tr);
return docFrag;
}
var binanceSocket = new WebSocket("wss://stream.binance.com:9443/ws/btcusdt@trade"),
table = document.getElementById("pricetable");
binanceSocket.onmessage = function(event) {
var messageObject = JSON.parse(event.data);
table.appendChild(insertRow(messageObject.p));
}
<body>
<table>
<thead>
<tr>
<th>Coin</th>
<th scope="col">Price</th>
</tr>
</thead>
<tbody id="pricetable">
</tbody>
</table>
</body>
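The formatting step inside insertRow can be tried on its own. The sketch below assumes the price arrives as a decimal string such as "43210.12000000" (a made-up sample value, not real market data) and uses a hypothetical helper name, formatPrice. Instead of slicing off the last six characters, it lets toLocaleString round to two decimals, so it does not depend on how many digits the string has:

```javascript
// Hypothetical helper: formats a decimal price string as US dollars.
function formatPrice(price) {
  const value = Number(price);
  if (!Number.isFinite(value)) {
    return ""; // leave the cell empty when the payload is not numeric
  }
  return value.toLocaleString("en-US", { style: "currency", currency: "USD" });
}

console.log(formatPrice("43210.12000000")); // sample value, not real market data
```

Note that this rounds the value, whereas the slice in insertRow truncates it.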
Add an id to your table, so you can properly access it.
<table id="priceTable">
Then add the new data like so (since I don't know the shape of messageObject.p, I am assuming it is a string):
var binanceSocket = new WebSocket("wss://stream.binance.com:9443/ws/btcusdt@trade");
binanceSocket.onmessage = function (event) {
  var messageObject = JSON.parse(event.data);
  console.log(messageObject.p);
  var tbody = document.getElementById('priceTable').getElementsByTagName('tbody')[0];
  // append a row at the end of the body; a row may only contain cells, so add a <td>
  var newRow = tbody.insertRow(-1);
  newRow.insertCell(0).textContent = messageObject.p;
}
I have flagged this post as a duplicate of this one. However, the OP needed a little more help on how to apply the answer to their situation, so I posted an answer.

How can I get clickable elements from a table using Puppeteer?

I'm trying to scrape a website and there is a table with clickable elements and text. I managed to use this to grab the innerText of the table elements:
const result = await page.$$eval('tableselector tr', rows => {
return Array.from(rows, row => {
const columns = row.querySelectorAll('td');
return Array.from(columns, column => column.innerText);
});
});
I've tried just returning columns and using result[row][column].getProperty('innerText').jsonValue() to try and grab the innerText but it doesn't work. Could someone explain where I'm going wrong?
EDIT:
Here is an HTML Segment that represents the structure of the table I am trying to scrape.
<table id = "table_id">
<tbody>
<!-- input button is the clickable element I want to grab -->
<tr class = "GridRowStyle">
<td>input button</td><td>text2</td><td>text3</td><td>text4</td><td>text5</td><td>text6</td><td>text7</td>
</tr>
<tr class = "GridAlternatingStyle">
<td>input button</td><td>text2</td><td>text3</td><td>text4</td><td>text5</td><td>text6</td><td>text7</td>
</tr>
<tr class = "GridRowStyle">
<td>input button</td><td>text2</td><td>text3</td><td>text4</td><td>text5</td><td>text6</td><td>text7</td>
</tr>
</tbody>
</table>

Merging duplicated rows in JSP table

I'm trying to generate an HTML table from an input Excel sheet using Apache POI in a JSP page. I have managed to code the part where the data is fetched from Excel and displayed as an HTML table, but the problem is that some primary IDs are duplicated across several rows that differ in their other columns. Example (two Johns with different last names):
<table>
<tr>
<td>Jill</td>
<td>Smith</td>
<td>50</td>
</tr>
<tr>
<td>John</td>
<td>Jackson</td>
<td>94</td>
</tr>
<tr>
<td>John</td>
<td>Doe</td>
<td>80</td>
</tr>
</table>
The code to generate the table :
out.println("<table>");
while (rowIter.hasNext())
{
row =(HSSFRow)rowIter.next();
input_fname = row.getCell(0);
input_lname = row.getCell(1);
input_age = row.getCell(2);
fname = input_fname.getRichStringCellValue().getString();
lname = input_lname.getRichStringCellValue().getString();
age = input_age.getRichStringCellValue().getString();
out.println("<tr>");
out.println("<td>"+fname+"</td>");
out.println("<td>"+lname+"</td>");
out.println("<td>"+age+"</td>");
out.println("</tr>");
}
}
out.println("</table>");
Could anyone advise me on how to merge the duplicated rows by the primary ID (first name), as below:
<table>
<tr>
<td>Jill</td>
<td>Smith</td>
<td>50</td>
</tr>
<tr>
<td rowspan="2">John</td>
<td>Jackson</td>
<td>94</td>
</tr>
<tr>
<td>Doe</td>
<td>80</td>
</tr>
</table>
I have tried searching for similar questions, but I couldn't find a solution to my problem, and I'm quite a beginner in JavaScript and jQuery (maybe that's the problem). Any suggestions are much appreciated. Thanks in advance!
You are asking the wrong question. Wouldn't it be much easier to write the HTML correctly in the first place rather than try to do some merge on the HTML?
Thus, loop over the entries and put them in a suitable data structure, e.g. a Map keyed by fname with a list as the value. The Person class is a simple bean that holds the data.
Map<String, List<Person>> people = new TreeMap<String, List<Person>> ();
while (rowIter.hasNext())
{
row =(HSSFRow)rowIter.next();
input_fname = row.getCell(0);
input_lname = row.getCell(1);
input_age = row.getCell(2);
fname = input_fname.getRichStringCellValue().getString();
lname = input_lname.getRichStringCellValue().getString();
age = input_age.getRichStringCellValue().getString();
Person person = new Person(fname, lname, age);
if(! people.containsKey(person.fname)){
people.put(person.fname, new ArrayList<Person>());
}
people.get(person.fname).add(person);
}
}
Then loop over this data structure and write the HTML:
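for (String key : people.keySet()) {
    List<Person> persons = people.get(key);
    int rowSpan = persons.size();
    // write the HTML accordingly: the first row of each group gets the
    // fname cell with rowspan=rowSpan, the remaining rows omit that cell
}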
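The grouping and rowspan logic does not depend on JSP. As a rough illustration in JavaScript (not the answer's Java code), assuming each person is a plain object with the hypothetical fields fname, lname and age, and using a hypothetical function name, renderMergedRows:

```javascript
// Groups people by first name (keeping first-seen order) and renders table
// rows in which the first-name cell spans all rows of its group.
function renderMergedRows(people) {
  const groups = new Map();
  for (const person of people) {
    if (!groups.has(person.fname)) {
      groups.set(person.fname, []);
    }
    groups.get(person.fname).push(person);
  }

  const lines = ["<table>"];
  for (const [fname, members] of groups) {
    members.forEach((person, index) => {
      lines.push("<tr>");
      if (index === 0) {
        // Only the first row of a group carries the first-name cell.
        const span = members.length > 1 ? ` rowspan="${members.length}"` : "";
        lines.push(`<td${span}>${fname}</td>`);
      }
      lines.push(`<td>${person.lname}</td>`);
      lines.push(`<td>${person.age}</td>`);
      lines.push("</tr>");
    });
  }
  lines.push("</table>");
  return lines.join("\n");
}

// Sample rows from the question.
const people = [
  { fname: "Jill", lname: "Smith", age: "50" },
  { fname: "John", lname: "Jackson", age: "94" },
  { fname: "John", lname: "Doe", age: "80" },
];
console.log(renderMergedRows(people));
```

A Map keeps groups in first-seen order, whereas the TreeMap in the Java code sorts them by name. Real data would also need HTML escaping before being written into the page.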
Alternatively, you can add a class when printing the name in your back-end code, e.g. out.println("<td class=\"fname\">"+fname+"</td>");
and then merge the cells with jQuery:
var last_selected_name = "";
/* Get all the first names cells */
jQuery('td.fname').each(function(i,obj){
var current_name = jQuery(obj).text();
/* check for repeated names */
if (current_name == last_selected_name)
{
jQuery("td.fname:contains('"+current_name+"')").each(function(index,object){
if (index == 0)
{
/* check if it already has a rowspan attribute */
var row_span = jQuery(object).attr('rowspan')?jQuery(object).attr('rowspan'):1;
/* add one */
row_span++;
/* include the new rowspan number */
jQuery(object).attr('rowspan',row_span)
}
else
{
/* delete the other first name cells */
jQuery(object).remove();
}
})
}
last_selected_name = current_name;
})

Convert table HTML to JSON

I have this:
<table>
<tr>
<th>Name:</th>
<td>Carlos</td>
</tr>
<tr>
<th>Age:</th>
<td>22</td>
</tr>
</table>
And I need a JSON format.
{"Name":"Carlos","Age": 22}
I've tried with https://github.com/lightswitch05/table-to-json but it doesn't work for the headings in every row :(
EDIT: http://jsfiddle.net/Crw2C/773/
You can convert the table in the OP to the required format by first converting it to an Object, then using JSON.stringify to get the required string:
<table id="t0">
<tr>
<th>Name:</th>
<td>Carlos</td>
</tr>
<tr>
<th>Age:</th>
<td>22</td>
</tr>
</table>
<script>
function tableToJSON(table) {
var obj = {};
var row, rows = table.rows;
for (var i=0, iLen=rows.length; i<iLen; i++) {
row = rows[i];
obj[row.cells[0].textContent] = row.cells[1].textContent
}
return JSON.stringify(obj);
}
console.log(tableToJSON(document.getElementById('t0'))); // {"Name:":"Carlos","Age:":"22"}
</script>
However, that is an ad hoc solution, so will need some work to be adapted to a more general case. It shows the concept though.
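Note that older JavaScript engines did not guarantee that object properties are returned in the same order as they appear in the table, so you might get {"Age:":"22","Name:":"Carlos"}. Current engines serialize ordinary string keys in insertion order, but keys that look like non-negative integers are moved to the front in ascending order.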
Assuming all you need is to get the first/second cells of each row as key/value pairs, you can use .reduce() to iterate over the rows and grab the text content of .cells[0] and .cells[1] to use as each key/value pair (the .slice(0,-1) drops the trailing colon from the key):
var t = document.querySelector("table");
var j = [].reduce.call(t.rows, function(res, row) {
res[row.cells[0].textContent.slice(0,-1)] = row.cells[1].textContent;
return res
}, {});
document.querySelector("pre").textContent = JSON.stringify(j, null, 2);
<table>
<tr>
<th>Name:</th>
<td>Carlos</td>
</tr>
<tr>
<th>Age:</th>
<td>22</td>
</tr>
</table>
<pre></pre>
The Array.prototype.reduce method takes a collection and uses an accumulator to reduce it down to whatever state you want. Here we just reduce it to an object, so we pass one in after the callback.
For every row, we use the first cell's content as the object key, and the second cell's content as the value. We then return the object from the callback so that it's given back to us in the next iteration.
Finally, .reduce() returns the last thing we returned (which of course is the object we started with), and that's your result.
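To see the accumulator at work without a DOM, the same reduction can be run on plain data. This sketch assumes the rows have already been read into [header, value] string pairs (the pairs array is a stand-in for table.rows), and the helper name pairsToObject is hypothetical. It strips the trailing colon only when one is present, rather than always dropping the last character:

```javascript
// Stand-in for the table rows: [th text, td text] pairs.
const pairs = [
  ["Name:", "Carlos"],
  ["Age:", "22"],
];

// Hypothetical helper: folds the pairs into a single object.
function pairsToObject(pairs) {
  return pairs.reduce((result, [label, value]) => {
    const key = label.trim().replace(/:$/, ""); // "Name:" -> "Name"
    result[key] = value.trim();
    return result; // handed back as `result` on the next iteration
  }, {}); // {} is the initial accumulator
}

console.log(JSON.stringify(pairsToObject(pairs))); // {"Name":"Carlos","Age":"22"}
```

The age stays a string here; producing the number 22 from the question's target output would need an explicit conversion such as Number(value).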
var t = document.querySelector("table");
var j = [].reduce.call(t.rows, function(res, row) {
res[row.cells[0].textContent.slice(0,-1)] = row.cells[1].textContent;
return res
}, {});
document.querySelector("pre").textContent = JSON.stringify(j);
<table>
<tr>
<th>Name:</th>
<td>Carlos</td>
</tr>
<tr>
<th>Age:</th>
<td>22</td>
</tr>
</table>
<pre></pre>
The Table-to-JSON library that you are using is expecting a different format in your table.
It is expecting a table with all of your headers in the first row, followed by the data in subsequent rows.
In other words, it's expecting your table to be structured like this
<table>
<tr>
<th>Name</th>
<th>Age</th>
</tr>
<tr>
<td>Carlos</td>
<td>22</td>
</tr>
</table>
Here's a forked version of your JSFiddle in which this is working.

JSON Object into Mustache.js Table

I'm trying to create a table with a JSON Object using Mustache.js.
I want it to show two rows, but it only shows the second one.
I suspect that the first row is being overwritten by the second when it's being bound again in the loop.
How do I work my way around it? Or is there a better structure I should follow?
Javascript:
var text = '[{"Fullname":"John", "WorkEmail":"john@gmail.com"},{"Fullname":"Mary", "WorkEmail":"mary@gmail.com"}]'
var obj = JSON.parse(text);
$(document).ready(function() {
var template = $('#user-template').html();
for(var i in obj)
{
var info = Mustache.render(template, obj[i]);
$('#ModuleUserTable').html(info);
}
});
Template :
<script id="user-template" type="text/template">
<td>{{FullName}}</td>
<td>{{WorkEmail}}</td>
</script>
table:
<table border="1">
<tr>
<th>FullName</th>
<th>WorkEmail</th>
</tr>
<tr id = "ModuleUserTable">
</tr>
</table>
In addition to your own solution, you should consider using Mustache to repeat the row for you:
<script id="user-template" type="text/template">
{{#people}}
<tr>
<td>{{FullName}}</td>
<td>{{WorkEmail}}</td>
</tr>
{{/people}}
</script>
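// keys must match the template tags exactly: Mustache names are case-sensitive
var text = '[{"FullName":"John", "WorkEmail":"john@gmail.com"},{"FullName":"Mary", "WorkEmail":"mary@gmail.com"}]';
var obj = {people: JSON.parse(text)};
$(document).ready(function() {
  var template = $('#user-template').html();
  var info = Mustache.render(template, obj);
  // assumes the id is on the <table>; append keeps the header row already in it
  $('#ModuleUserTable').append(info);
});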
I figured out that instead of
$('#ModuleUserTable').html(info);
it should be :
$('#ModuleUserTable').append(info);
Template should be :
<script id="user-template" type="text/template">
<tr>
<td>{{FullName}}</td>
<td>{{WorkEmail}}</td>
</tr>
</script>
and ID should not be on the table row tag. Instead it should be on the table itself:
<table border="1" id="ModuleUserTable">
<tr>
<th>FullName</th>
<th>WorkEmail</th>
</tr>
</table>
Each append adds a new row to the table with the JSON data.
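The difference between the two calls can be reproduced without jQuery or Mustache. In this sketch, renderRow is a deliberately tiny stand-in for Mustache.render (it only substitutes {{key}} tags and is not the library's implementation), and plain strings stand in for the table's contents. Both are assumptions made for illustration, as are the example.com addresses:

```javascript
// Tiny stand-in for Mustache.render: replaces {{key}} with data[key].
function renderRow(template, data) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    key in data ? String(data[key]) : ""
  );
}

const template = "<tr><td>{{FullName}}</td><td>{{WorkEmail}}</td></tr>";
const users = [
  { FullName: "John", WorkEmail: "john@example.com" },
  { FullName: "Mary", WorkEmail: "mary@example.com" },
];

let overwritten = ""; // behaves like .html(info): each pass replaces the content
let appended = "";    // behaves like .append(info): each pass adds to the content
for (const user of users) {
  const info = renderRow(template, user);
  overwritten = info;
  appended += info;
}

console.log(overwritten); // only Mary's row
console.log(appended);    // John's row followed by Mary's row
```

A key spelled Fullname would not match the {{FullName}} tag in this stand-in, and Mustache treats names as case-sensitive too, so the data keys and template tags need the same spelling.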
