Remove Improperly Nested Spans from MS Excel Generated HTML Using JavaScript

Below is some mostly cleaned HTML output from Microsoft Excel, but it contains a number of improperly nested <span> elements. What would be the best way to remove these elements using JavaScript (ideally without relying on jQuery)?
I am already doing a lot of cleaning to get to this point, but removing the <span> elements is proving to be a challenge. Thanks for any advice you can offer!
<table>
<tbody data-key="10020">
<tr data-key="10009">
<span data-key="10002"><span data-offset-key="10002-0">
</span></span>
<td data-key="10004" style="text-align: left;"><span data-key="10003"><span data-offset-key="10003-0">Done</span></span></td>
<span data-key="10005"><span data-offset-key="10005-0">
</span></span>
<td data-key="10007" style="text-align: left;"><span data-key="10006"><span data-offset-key="10006-0">Yes</span></span></td>
<span data-key="10008"><span data-offset-key="10008-0">
</span></span>
</tr>
<tr data-key="10018">
<span data-key="10011"><span data-offset-key="10011-0">
</span></span>
<td data-key="10013" style="text-align: left;"><span data-key="10012"><span data-offset-key="10012-0">Done</span></span></td>
<span data-key="10014"><span data-offset-key="10014-0">
</span></span>
<td data-key="10016" style="text-align: left;"><span data-key="10015"><span data-offset-key="10015-0">Yes</span></span></td>
<span data-key="10017"><span data-offset-key="10017-0">
</span></span>
</tr>
</tbody>
</table>

I know you prefer a solution that does not rely on jQuery, but here is one as a fallback in case no one offers a pure JavaScript solution.
$(document).ready(function(){
$('tr > span').remove();
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>

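Assuming you are talking about client-side JS running in a browser that has already parsed the invalid HTML … you can't. The browser will have already performed error recovery at that point (for <span> elements placed directly inside a <tr>, that typically means they have been moved out of the table).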
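If the markup is still available as a string at that point (for example from a paste handler or a server response, which is an assumption about the setup), the stray tags can be stripped before the browser ever parses it. The sketch below uses a hypothetical helper, stripSpanTags, and a deliberately simple regular expression. It assumes no attribute value contains a ">" character; a real HTML tokenizer would be more robust.

```javascript
// Hypothetical helper: removes every <span ...> and </span> tag from an HTML
// string while keeping whatever text the spans wrapped.
// Assumption: no attribute value contains a literal ">" character.
function stripSpanTags(html) {
  return html.replace(/<\/?span\b[^>]*>/gi, '');
}

var dirty =
  '<tr><span data-key="1"><span data-offset-key="1-0">\n</span></span>' +
  '<td><span data-key="2"><span data-offset-key="2-0">Done</span></span></td></tr>';

console.log(stripSpanTags(dirty));
// <tr>
// <td>Done</td></tr>
```

Note that this unwraps the spans inside the cells as well, which matches what the question asks for. The whitespace left between the cells is harmless in table markup.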

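With pure ES5 JavaScript you can remove those elements, after the browser has rendered the page, with the code below. The loop has to run in reverse because the returned HTMLCollection holds the elements in document order and is live: it updates itself to stay in sync with the DOM tree, so removing an element while iterating forward shifts the remaining items and some are skipped. Note that assigning innerText replaces everything else inside the parent, so this suits cells that hold nothing but the span.
var spans = document.getElementsByTagName('span'); // live HTMLCollection
// Walk backwards so that a removal never shifts the items still to be visited.
for (var i = spans.length - 1; i >= 0; i--) {
  var elm = spans[i];
  var parent = elm.parentNode;
  var txt = elm.innerText;
  parent.removeChild(elm);
  // Keep any text the span carried by writing it back to its parent.
  if (txt) {
    parent.innerText = txt;
  }
}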
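With ES6 you can do it as below. The problem described above does not apply, because Array.from copies the live collection into a static array before anything is removed.
var spans = Array.from(document.getElementsByTagName('span'));
spans.forEach(function (elm) {
  var txt = elm.innerText;
  // Write the span's text back to its parent before dropping the span.
  if (txt) {
    elm.parentNode.innerText = txt;
  }
  elm.remove();
});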
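To see why the iteration direction matters, here is a small sketch that runs without a DOM. A plain array stands in for the live HTMLCollection (an assumption made purely for illustration): removing an item shrinks it immediately, much as removing an element shrinks the collection. The function names are hypothetical.

```javascript
// A plain array stands in for a live HTMLCollection: splice() shrinks it at
// once, much like removing an element from the DOM shrinks the collection.

// Forward loop over the "live" list: each removal shifts the remaining items
// left, so every second item is skipped.
function removeForward(live) {
  for (var i = 0; i < live.length; i++) {
    live.splice(i, 1);
  }
  return live;
}

// Backward loop: removals only affect positions that were already visited.
function removeBackward(live) {
  for (var i = live.length - 1; i >= 0; i--) {
    live.splice(i, 1);
  }
  return live;
}

// Snapshot first (what Array.from does in the ES6 version): the copy keeps its
// length, so a forward loop over it reaches every item.
function removeViaSnapshot(live) {
  Array.from(live).forEach(function (item) {
    live.splice(live.indexOf(item), 1);
  });
  return live;
}

console.log(removeForward(['a', 'b', 'c', 'd']));     // [ 'b', 'd' ]
console.log(removeBackward(['a', 'b', 'c', 'd']));    // []
console.log(removeViaSnapshot(['a', 'b', 'c', 'd'])); // []
```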

Related

Getting cells value using JQuery

Getting cell value using JQuery.
I have tried using the below code:
$("#table tr").each(function(){
var result = $(this).find("td:first").html();
alert(result);
});
But it returns a string of all the first rows.
<table class="table table-bordered">
<thead>
<tr>
<td style="white-space: nowrap" class="form-label">
<span id="lblAppMonth1HeaderYr1" class="form-label-bold"></span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth2HeaderYr1" class="form-label-bold"></span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth3HeaderYr1" class="form-label-bold">Jun-17</span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth4HeaderYr1" class="form-label-bold">Jul-17</span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth5HeaderYr1" class="form-label-bold">Aug-17</span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth6HeaderYr1" class="form-label-bold">Sep-17</span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth7HeaderYr1" class="form-label-bold">Oct-17</span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth8HeaderYr1" class="form-label-bold">Nov-17</span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth9HeaderYr1" class="form-label-bold">Dec-17</span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth10HeaderYr1" class="form-label-bold">Jan-18</span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth11HeaderYr1" class="form-label-bold">Feb-18</span>
</td>
<td style="white-space: nowrap">
<span id="lblAppMonth12HeaderYr1" class="form-label-bold">Mar-18</span>
</td>
</tr>
</thead>
<tbody>
I expect the values "Jun-17", "Jul-17", ... in that order, but the actual output is a string of the rows.
Get the values with .text(), and select the table with .table (its class) rather than #table (the table has no id):
$(".table td").each(function() {
var result = $(this).text().trim();
if (result) console.log(result);
});
<script src="https://code.jquery.com/jquery-3.3.1.js"></script>
If you want to collect all the values, use an array:
var rows = [...$(".table td")].map(e => $(e).text().trim()).filter(e => e);
console.log(rows);
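The same collection step can be sketched without jQuery. The helper name collectCellTexts is hypothetical; it accepts any array-like of nodes, so it should work with document.querySelectorAll('.table td') in a browser and with plain stand-in objects elsewhere.

```javascript
// Hypothetical helper: returns the trimmed, non-empty text of each cell.
// `cells` can be any array-like of objects exposing `textContent`,
// e.g. the NodeList from document.querySelectorAll('.table td').
function collectCellTexts(cells) {
  return Array.from(cells)
    .map(function (cell) { return cell.textContent.trim(); })
    .filter(function (text) { return text !== ''; });
}

// Stand-in objects so the sketch runs outside a browser.
var fakeCells = [
  { textContent: '\n  \n' },
  { textContent: '\n Jun-17 \n' },
  { textContent: '\n Jul-17 \n' }
];

console.log(collectCellTexts(fakeCells)); // [ 'Jun-17', 'Jul-17' ]
```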
You can use $(".table td") as the selector to loop through the cells, and text() instead of html() to get their text:
$(".table td").each(function() {
console.log($(this).text().trim());
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
You should add an id to the table if you want to select it with #table. You can also get the span text with the text() function.
$(document).ready(function () {
  // Visit each span on its own; calling .text() on a set of spans concatenates their text.
  $("table tr span").each(function () {
    var result = $(this).text();
    // Some span elements are empty, so skip those.
    if (result !== '') {
      alert(result);
    }
  });
});

Access table cell with javascript in chrome console

I have this table and I want to access the prod_sku value and define it as a variable:
<tr class="border first last" id="order-item-row-386470">
<td><h3 class="product-name">prod_name</h3>
</td>
<td data-rwd-label="SKU">prod_sku</td>
<td class="a-right" data-rwd-label="Preço">
<span class="price-excl-tax">
<span class="cart-price">
<span class="price">prod_price</span>
</span>
</span>
<br>
</td>
</tr>
I get this table with this line:
data = document.getElementsByClassName("odd")[0].innerHTML;
From this tbody
<tbody class="odd">
<tr class="border first last" id="order-item-row-386470">
<td><h3 class="product-name">prod_name</h3>
</td>
<td data-rwd-label="SKU">prod_sku</td>
<td class="a-right" data-rwd-label="Preço">
<span class="price-excl-tax">
<span class="cart-price">
<span class="price">prod_price</span>
</span>
</span>
<br>
</td>
<td class="a-right" data-rwd-label="Qtd">
<span class="nobr">
Solicitado: <strong>1</strong><br>
</span>
</td>
<td class="a-right last" data-rwd-label="Subtotal">
<span class="price-excl-tax">
<span class="cart-price">
<span class="price">prod_price</span>
</span>
</span>
<br>
</td>
</tr>
</tbody>
How can I access prod_sku?
You can use a query selector to match the data-rwd-label attribute.
document.querySelector('td[data-rwd-label=SKU]').innerHTML
If you don't have to support stone-age browsers, you can use querySelector to grab your td based on its data attribute. If you have to support a myriad of outdated browsers and you need to do a lot of these lookups, jQuery is quite good at it.
var data = document.querySelector('td[data-rwd-label="SKU"]').innerHTML;
That said, it would be better to improve the HTML structure of your table.
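A slightly more defensive variant of the lookup above, as a sketch: the helper name getSku is hypothetical, and it reads textContent rather than innerHTML and returns null when the cell is missing instead of throwing.

```javascript
// Hypothetical helper: looks up the SKU cell inside `root` (a Document or
// Element) and returns its trimmed text, or null when no such cell exists.
function getSku(root) {
  var cell = root.querySelector('td[data-rwd-label="SKU"]');
  return cell ? cell.textContent.trim() : null;
}

// In the browser console: var sku = getSku(document);

// Stand-in root so the sketch also runs outside a browser.
var fakeRoot = {
  querySelector: function (selector) {
    return selector === 'td[data-rwd-label="SKU"]'
      ? { textContent: ' prod_sku ' }
      : null;
  }
};

console.log(getSku(fakeRoot)); // prod_sku
```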

Greasemonkey script to add textbox to each row on existing site

I'm trying to figure out how to add a textbox to the last column of each row on a site. My JavaScript/jQuery experience is quite limited, and I can't really wrap my head around this one.
I've pasted the code for one of the tables below.
I've tried getElementsByClassName, but I'm not sure what to call.
Any help would be much appreciated!
<tbody>
<tr id="g_1_hKWlkSGU" class="tr-first stage-scheduled" style="cursor: pointer;" title="">
<td class="cell_ib icons left">
<span class="icons left">
<span class="tomyga icon0">
</span>
</span>
<div data-context="g:g_1_hKWlkSGU" class="mg_dropdown">
<div class="mg_dropdown_wrapper">
<span class="mg_dropdown_selected">-</span><span class="down_arrow">
</span>
</div>
</div>
</td>
<td class="cell_ad time">19:45</td>
<td class="cell_aa timer">
<span> </span>
</td>
<td class="cell_ab team-home"><span class="padr">Barcelona</span>
</td>
<td class="cell_sa score">-</td>
<td class="cell_ac team-away">
<span class="padl">Celtic</span>
</td>
<td class="cell_sb part-top">
<span class="icons">
<span class="live-centre">
</span>
</span>
</td>
<td class="cell_ia icons">
<span class="icons">
<span class="tv icon1">
</span>
</span>
</td>
<td class="cell_oq comparison" title="">
<span class="icons" title="">
<span class="slive icon0 xxx" title="This match will be available for LIVE betting!">
</span>
</span>
</td>
</tr>
</tbody>
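One possible approach, as a minimal sketch rather than a tested userscript: append a new cell holding a text input to every row. The helper name addTextboxToRows and the 'tbody tr' selector are assumptions; if the site adds its rows after the page has loaded, the call would have to run once they exist.

```javascript
// Hypothetical helper: appends a new <td> containing a text <input> to each
// row matched by `rowSelector`, and returns the number of rows it touched.
// `doc` is the page's Document; `rowSelector` is an assumption about which
// rows to target.
function addTextboxToRows(doc, rowSelector) {
  var rows = doc.querySelectorAll(rowSelector);
  Array.prototype.forEach.call(rows, function (row) {
    var cell = doc.createElement('td');
    var input = doc.createElement('input');
    input.type = 'text';
    cell.appendChild(input);
    row.appendChild(cell); // the new cell becomes the last column of the row
  });
  return rows.length;
}

// In a userscript: addTextboxToRows(document, 'tbody tr');
```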

Simple jQuery selector isn't working

I'm trying to parse this HTML:
<tr id="a">
<td class="classA">
<span class="classB">Toronto</span>
</td>
<td class="classC">
<span class="classD">Winnipeg</span>
</td>
</tr>
<tr id="b">
<td class="classA">
<span class="classB">Montreal</span>
</td>
<td class="classC">
<span class="classD">Calgary</span>
</td>
</tr>
I have a variable team. I want to find the <span> that contains team. Then I want to navigate up to the <tr> and pull the id from it.
I tried:
var team = "Toronto";
var id = $("span:contains(" + team + ")").parent().parent().attr('id');
But it comes back undefined. I know the selector is right, because $("span:contains(" + team + ")").attr('class') comes back with classB. So I can't figure out what's wrong with my query. Can anyone help?
Edit: Here's the JSFiddle.
Your HTML is invalid, but your selector is correct. You need to put the tr elements inside a table tag to make the HTML valid. It is also better to use closest("tr") instead of .parent().parent().
<table>
<tr id="a">
<td class="classA"> <span class="classB">Toronto</span>
</td>
<td class="classC"> <span class="classD">Winnipeg</span>
</td>
</tr>
<tr id="b">
<td class="classA"> <span class="classB">Montreal</span>
</td>
<td class="classC"> <span class="classD">Calgary</span>
</td>
</tr>
</table>
It's not working because the browser is automatically fixing your HTML. You can't have a TR without a table, so the browser just throws it away. All that's actually part of the DOM by the time your JavaScript runs is the spans.
Wrap it in a <table> and your code will work. Even better, wrap it in <table><tbody>: with just a <table>, the browser still creates a tbody for you, which might cause confusion later if you look at the parent of the TR.
Currently your HTML markup is invalid, you need to wrap <tr> element inside <table>:
<table>
<tr id="a">
<td class="classA"> <span class="classB">Toronto</span>
</td>
<td class="classC"> <span class="classD">Winnipeg</span>
</td>
</tr>
<tr id="b">
<td class="classA"> <span class="classB">Montreal</span>
</td>
<td class="classC"> <span class="classD">Calgary</span>
</td>
</tr>
</table>
Also, it's better to use .closest() as well as .prop() instead of .parent() and .attr()
var id = $("span:contains(" + team + ")").closest('tr').prop('id');
Try:
var id = $("span:contains(" + team + ")").parent('td').parent('tr').attr('id');
Your code works fine. Just wrap your markup in <table></table>.
HTML
<table>
<tr id="a">
<td class="classA">
<span class="classB">Toronto</span>
</td>
<td class="classC">
<span class="classD">Winnipeg</span>
</td>
</tr>
<tr id="b">
<td class="classA">
<span class="classB">Montreal</span>
</td>
<td class="classC">
<span class="classD">Calgary</span>
</td>
</tr>
</table>
Script
var team = "Toronto";
var id = $("span:contains(" + team + ")").closest('tr').prop('id');
console.log(id)
http://jsfiddle.net/7Eh7L/
Better to use closest():
$("span:contains(" + team + ")").closest('tr').prop('id');
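For comparison, the same lookup can be sketched without jQuery. The helper name findRowIdByTeam is hypothetical; like the jQuery version, it only finds a row once the <tr> elements sit inside a <table>, because otherwise the browser drops them during parsing.

```javascript
// Hypothetical helper: finds the first span whose text contains `team` and
// returns the id of the <tr> that encloses it, or null when nothing matches.
// `spans` is any array-like of elements, e.g. document.querySelectorAll('span').
function findRowIdByTeam(spans, team) {
  var match = Array.from(spans).find(function (span) {
    return span.textContent.indexOf(team) !== -1;
  });
  var row = match ? match.closest('tr') : null;
  return row ? row.id : null;
}

// In the browser: findRowIdByTeam(document.querySelectorAll('span'), 'Toronto');
```

Passing the team name as a plain argument also avoids building a selector string by concatenation.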

Nested containerless foreach in tbody failing for IE

I'm having a bit of trouble getting some nested, containerless foreach bindings to work. It works in grownup browsers, but not IE (8 OR 9).
The closest I could find was this question, but the root of that problem seems to be a lack of a tbody tag, which I have. The error IE is giving is
Cannot find closing comment tag to match: ko foreach: seniors
Sorry for the wall of text, but below is my markup.
<tbody data-bind="foreach: superGroups">
<tr>
<td style="font-weight: bold;" data-bind="text: superName() || 'No Super Set'" colspan="8">
</tr>
<!-- ko foreach: seniors -->
<tr>
<td></td>
<td style="font-weight: bold;" data-bind="text: seniorName() || 'No Senior Set'" colspan="7"></td>
</tr>
<!-- ko foreach: items -->
<tr>
<td>
<span data-bindX="text:superName"></span>
</td>
<td>
<span data-bindX="text:seniorName"></span>
</td>
<td>
<span data-bind="text:clientName"></span>
<i class="icon-tags" data-bind="attr:{title: labels}, visible: labels"></i>
</td>
<td>
<span data-bind="text:description"></span>
</td>
<td>
<span data-bind="visible:superPayAmount">$<span data-bind="text:superPayAmount"></span></span>
<span data-bind="visible:superPayAmount.length == 0">-</span>
</td>
<td>
<span data-bind="shortDate: superStartDate"></span> - <span data-bind="shortDate: superEndDate"></span>
</td>
<td>
<span data-bind="visible:seniorPayAmount">$<span data-bind="text:seniorPayAmount"></span></span>
<span data-bind="visible:!seniorPayAmount.length == 0">-</span>
</td>
<td>
<span data-bind="shortDate: seniorStartDate"></span> - <span data-bind="shortDate: seniorEndDate"></span>
</td>
</tr>
<!-- /ko -->
<!-- /ko -->
</tbody>
You are missing the closing </td> tag in the first tr:
<tr>
<td style="font-weight: bold;" data-bind="text: superName() || 'No Super Set'" colspan="8"></td>
</tr>
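A rough way to spot this kind of mistake in a template string is to compare the number of opening and closing tags. The sketch below uses a hypothetical helper, countTags, and assumes no attribute value contains a ">" character. Since HTML allows </td> to be omitted in some positions, a mismatch is only a hint, not proof of an error.

```javascript
// Hypothetical helper: counts opening and closing tags of one element name in
// an HTML string. Assumption: no attribute value contains a ">" character.
function countTags(html, tagName) {
  var open = new RegExp('<' + tagName + '\\b[^>]*>', 'gi');
  var close = new RegExp('</' + tagName + '\\s*>', 'gi');
  return {
    open: (html.match(open) || []).length,
    close: (html.match(close) || []).length
  };
}

var broken = '<tr><td colspan="8"></tr>';
var fixed = '<tr><td colspan="8"></td></tr>';

console.log(countTags(broken, 'td')); // { open: 1, close: 0 }
console.log(countTags(fixed, 'td'));  // { open: 1, close: 1 }
```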
