Regex script to write data from a table to a spreadsheet - javascript

I am trying to write a script that turns a series of basic HTML tables, each describing the variations of a word across different countries, into a working spreadsheet for use in a database. Each table covers the translations of a single word across countries. In HTML, each table has the following format:
<h5><a name="akas"> equivalent names in different countries </a> </h5>
<table border="0" cellpadding="2">
<tr>
<td>character string </td>
<td> country name / country name / country name</td>
</tr>
<tr>
<td>character string </td>
<td>country name</td>
</tr>
.................. this format continues until the table ends
</table>
Country names repeat across tables and should become the column headings of the spreadsheet, with each row holding the equivalent words from one table. I am totally new to regex (which I'm finding really bewildering to get into) and also a beginner in JavaScript. I am looking for help on how to rearrange this type of data into a working spreadsheet for use in a larger database. Any help would be really appreciated.

You should look at DOM parsing and XPath. XPath lets you query the HTML file for the content of whichever node you need.
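A minimal sketch of the DOM-based approach, assuming each table row has exactly two cells (the word, then a "/"-separated list of countries) as in the question. The function names rowsToRecord, readTable and toCsv are hypothetical, and CSV is only one assumed target format that a spreadsheet can import. readTable needs a browser DOM; the other two functions are plain JavaScript.

```javascript
// Turn one table's rows into a { country: word } record.
// `rows` is an array of [word, "Country A / Country B"] pairs.
function rowsToRecord(rows) {
  const record = {};
  for (const [word, countries] of rows) {
    for (const country of countries.split("/")) {
      const name = country.trim();
      if (name) record[name] = word.trim();
    }
  }
  return record;
}

// Read the first two cells of every <tr> in a table element (browser DOM only).
function readTable(tableElement) {
  return Array.from(tableElement.querySelectorAll("tr"))
    .map((tr) => tr.querySelectorAll("td"))
    .filter((cells) => cells.length >= 2)
    .map((cells) => [cells[0].textContent, cells[1].textContent]);
}

// Build CSV text: one column per country, one line per table (word).
function toCsv(records) {
  const countries = [...new Set(records.flatMap((r) => Object.keys(r)))].sort();
  const quote = (value) => `"${String(value ?? "").replace(/"/g, '""')}"`;
  const lines = [countries.map(quote).join(",")];
  for (const record of records) {
    lines.push(countries.map((c) => quote(record[c])).join(","));
  }
  return lines.join("\n");
}
```

In a browser, something like toCsv(Array.from(document.querySelectorAll("table"), (t) => rowsToRecord(readTable(t)))) would produce text that can be saved as a .csv file. Countries missing from a table simply yield empty cells.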

You can copy paste an HTML table into a spreadsheet.

Related

JS Performance with lots of dynamic data. innerHTML, outerHTML, or textContent

After reading a lot of posts here, I've come to the conclusion that textContent is faster than innerHTML and outerHTML, but what happens when I have a lot of data that needs to be replaced?
I have a table with dynamic second columns such as follows. Keep in mind a JS function will show/hide metric or imperial <span> when clicked.
When a new variant is selected, data in the second cells will change.
<div class="spec-table" id="SpecTable">
<table class="table">
<tr>
<td>dimension_1_title</td>
<td><span class="spec-table_metric">dimension_1_metric</span><span class="spec-table_imperial">dimension_1_imperial</span></td>
</tr>
<tr>
<td>dimension_2_title</td>
<td><span class="spec-table_metric">dimension_2_metric</span><span class="spec-table_imperial">dimension_2_imperial</span></td>
</tr>
<tr>
<td>Other information</td>
<td><span class="spec-table_metric">Other information metric</span><span class="spec-table_imperial">Other information imperial</span></td>
</tr>
</table>
</div>
Would it make more sense to put the entire HTML table into one JS variable per variant and replace it with getElementById("SpecTable").innerHTML, or to use 6 functions that change each <span> individually with getElementById(spanId).textContent?
I'm confused because I may have 5 variants, which means 30 JS variables to collect and match when updating individual spans, versus only 5 JS variables if each one contains the whole table.
In addition, is it better or worse if I get rid of the <div> and use outerHTML to change the table?
I'm a beginner at JS, and if you have other recommendations, I appreciate your input.
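One possible middle ground, sketched under assumptions: keep each variant as plain data (one object per table row, in row order) and write it into the existing spans with textContent in a single loop, so there is neither one variable per span nor a full HTML string per variant. The names specVariants and applyVariant and the sample values are hypothetical; only the class names come from the markup above. No performance claim is made here.

```javascript
// Hypothetical variant data: one entry per table row, in row order.
const specVariants = {
  small: [
    { metric: "10 cm", imperial: "3.9 in" },
    { metric: "20 cm", imperial: "7.9 in" },
    { metric: "1 kg", imperial: "2.2 lb" },
  ],
};

// Write one variant's values into the existing spans.
// `tableRoot` is the element that contains the table, e.g. #SpecTable.
function applyVariant(tableRoot, rows) {
  const metricSpans = tableRoot.querySelectorAll(".spec-table_metric");
  const imperialSpans = tableRoot.querySelectorAll(".spec-table_imperial");
  rows.forEach((row, i) => {
    metricSpans[i].textContent = row.metric;
    imperialSpans[i].textContent = row.imperial;
  });
}
```

In a browser this might be called as applyVariant(document.getElementById("SpecTable"), specVariants.small) when a variant is selected. The show/hide logic for metric and imperial is untouched because the spans themselves are never replaced.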

How would I write a regular expression that captures an HTML Table with a particular class?

I'm trying to write a regular expression that will capture an HTML table (and all its table data) that has a particular class.
For example, the table has a recapLinks class, is comprised of numerous table rows and table data, and is terminated with </table>. See below:
<table width="100%" class="recapLinks" cellspacing="0">
[numerous table rows and data in the table.]
</td></tr></tbody></table>
I'm using javascript.
The regex to capture this is pretty simple, if you can guarantee that there are never nested tables. Nested tables become much trickier to deal with.
/<table[^>]*class=("|')?.*?\bCLASSNAMEHERE\b.*?\1[^>]*>([\s\S]*?)<\/table>/im
For instance, if an attribute before class had a closing > in it, which isn't likely but is possible, the regex would fall flat on its face. Complex regexes can try to prepare for that, but it's really not worth the effort.
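A sketch of how such a pattern might be applied to an HTML string. This is a slightly tightened variant rather than the exact regex above: it assumes the class attribute is always quoted and keeps the class-name match inside the attribute value. The function name captureTableByClass is hypothetical, and the same nested-table caveat applies.

```javascript
// Return the first <table> whose class attribute contains `className`,
// or null if none is found. Does not handle nested tables.
function captureTableByClass(html, className) {
  // Escape regex metacharacters so the class name is matched literally.
  const escaped = className.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    String.raw`<table\b[^>]*\bclass=(["'])[^"'>]*\b${escaped}\b[^"'>]*\1[^>]*>([\s\S]*?)</table>`,
    "i"
  );
  const match = pattern.exec(html);
  // match[0] is the whole table, match[2] is everything between the tags.
  return match ? { outer: match[0], inner: match[2] } : null;
}
```

Note that \b treats a hyphen as a word boundary, so searching for "recap" would also match a class such as "recap-links".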
However, jQuery all by itself can make this a breeze, if these elements are within the DOM. Regex can be easily fooled or tripped, deliberately or accidentally, which is why we have parsers. jQuery doesn't care what's nested or not within the element. It doesn't care about quote style, multiline, any of that.
$(document).ready(function () {
console.log($("table.myClassHere").prop("outerHTML"))
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<table class="myClassHere">
<tr>
<td>Book Series</td>
</tr>
<tr>
<td>Pern</td>
</tr>
<tr>
<td>Hobbit</td>
</tr>
</table>
<table class="otherClassHere">
<tr>
<td>Movies</td>
</tr>
<tr>
<td>Avengers</td>
</tr>
<tr>
<td>Matrix</td>
</tr>
</table>

How to add a note with an oddly shaped Javascript table

On my website I have a schedule optimizer for semesters, classes, times, and locations. After the optimization runs, several tables of choices show up. I want to show an empty semester in the table. Here is an example of what I mean:
I know my table looks ugly, but I can't put blanks in the table to make the columns complete, because I'm running complex calculations on the table data that would be disrupted by blanks (it would try to do look-ups on them). I can't tell it to ignore a blank box either (just go with me here). So, is there a way to programmatically add a note in that area that says "No classes for this semester"? The results are often different sizes, so I can't hardcode a location on my website for the note; it needs to just know where to go. I didn't think this was possible, but I wanted to pose the idea to you guys. Ideas?
This would be the end goal:
--tons of results in form of tables ---
one example result:
If it is even possible to close in the table so it's a complete box, that would be great. I need a JavaScript / jQuery solution.
UPDATED: Based on the replies so far, I tried this:
if(classes.length === 0){
var $noClasses = $('<td></td>').html('No Classes available');
$noClasses.colSpan = "3";
$table.append($noClasses);
}
and this gave me
Use rowspan and colspan to accommodate the 'awkwardness' of your table structure. The table structure is still standard, you're just wanting to span your cells across rows and/or columns:
HTML
<!DOCTYPE html>
<html>
<head>
<style>
td,th{
border-style: groove;
}
</style>
<title></title>
</head>
<body>
<table>
<tr>
<th>Title1</th>
<th>Title2</th>
<th>Title3</th>
<th>Title4</th>
</tr>
<tr>
<td rowspan="3">Semester1</td>
<td>Class1</td>
<td>Time1</td>
<td>Loc1</td>
</tr>
<tr>
<td>Class2</td>
<td>Time2</td>
<td>Loc2</td>
</tr>
<tr>
<td>Class3</td>
<td>Time3</td>
<td>Loc3</td>
</tr>
<tr>
<td>Semester2</td>
<td colspan="3">No Classes available</td>
</tr>
<tr>
<td>Semester3</td>
<td>Class1</td>
<td>Time1</td>
<td>Loc1</td>
</tr>
</table>
</body>
</html>
Result
So here is the code that ended up working:
if(classes.length === 0){
var $noClasses = $('<td colspan="3"></td>').addClass('noClass').html("No Classes ");
}
However, I had to take some of the HTML/CSS out of the JavaScript because it was getting too confusing to implement this part. I made a template with ICanHaz, converted some of the code, and then this worked.
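The rowspan/colspan idea from the answer can be sketched as a small helper that builds the row markup for one semester. This is an illustration, not the code from the thread: the function name buildSemesterRows and the name/time/location fields are hypothetical, the three-column layout follows the HTML example above, and the values are assumed to be trusted text (no HTML escaping is done).

```javascript
// Build the <tr> markup for one semester.
// `classes` is an array of { name, time, location } objects.
function buildSemesterRows(semesterName, classes) {
  if (classes.length === 0) {
    // One cell spanning the three class columns closes the box.
    return `<tr><td>${semesterName}</td><td colspan="3">No classes for this semester</td></tr>`;
  }
  return classes
    .map((c, i) => {
      // Only the first row carries the semester cell, spanning all of its rows.
      const semesterCell =
        i === 0 ? `<td rowspan="${classes.length}">${semesterName}</td>` : "";
      return `<tr>${semesterCell}<td>${c.name}</td><td>${c.time}</td><td>${c.location}</td></tr>`;
    })
    .join("\n");
}
```

With jQuery the result could then be added with something like $table.append(buildSemesterRows("Semester2", [])). Because the note lives in its own spanning cell, no blank cells have to be added for the calculations to trip over.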

Copy and paste text from a third party web page

I was wondering if there is a way to copy and paste some part of the text from a third party web page. My boss asked me to enter a group of text (50, 100, 200) one-by-one into this website: http://fbatoolkit.com/chart_details?category=T2ZmaWNlIFByb2R1Y3Rz&rank=500 and copy/paste the information "3 (30 days avg)" into another file. The "rank=500" part is the query string in the url. And I also know where the info, in the html source code, is. It is here:
<div style="margin: 20px">
Estimate sales per day for the rank
<input type="text" name="rank" value="500" />
in this category.
<input type="submit" value="Estimate" />
<table width="200">
<tr>
<td>
3 (30 Days Avg)
</td>
</tr>
<tr>
<td>
More than 2 (Last Day)
</td>
</tr>
</table>
</div>
</form>
I was wondering if there is a way to recursively access the website and copy/paste that part of text into another file. I know it is probably not the smartest way to do things but please help, the almighty stack overflow! I really appreciate that.
So I don't write Python, but I'll give it a shot. These types of tasks are usually very easy to accomplish with Python, so I'll give you the general language constructs that I would use, complete with links, to accomplish this.
General Steps
1. Set up an array of categories.
2. Set up an array of ranks to use.
3. Loop through each category, with a nested loop through each rank.
4. Within this inner loop, query the web page like this (see This Answer for more options for opening and reading URLs):
import urllib.request
# Decode the response bytes to text; UTF-8 is an assumption about the page
page = urllib.request.urlopen("URL HERE").read().decode("utf-8")
5. Then use a regex to find the text you're interested in, by doing something like this (note: the regex below was created assuming "(30 Days Avg)" is a static string, which it seemed to be from the page you supplied; you can re-append this text to the end of the grouped item if you'd like):
import re
# The parentheses are escaped so they match literally; group 1 is the value in front
match = re.search(r"(\w+) \(30 Days Avg\)", page)
extractedText = match.group(1)
6. Append the text to the file of your choice, per This Answer.
7. Close out your loops.
Sorry this wasn't more cut-and-paste code.
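The same steps can also be sketched in JavaScript, assuming a runtime with a global fetch (for example a recent Node.js release) and assuming the page still has the structure shown in the question. The function names extractThirtyDayAverage and collectAverages are hypothetical; the URL shape is taken from the question.

```javascript
// Pull the value in front of "(30 Days Avg)" out of the page HTML.
function extractThirtyDayAverage(html) {
  const match = /(\w+)\s*\(30 Days Avg\)/i.exec(html);
  return match ? match[1] : null;
}

// Fetch one page per rank and collect "rank,value" lines (makes network calls).
async function collectAverages(category, ranks) {
  const lines = [];
  for (const rank of ranks) {
    const url =
      "http://fbatoolkit.com/chart_details?category=" +
      encodeURIComponent(category) +
      "&rank=" +
      encodeURIComponent(rank);
    const response = await fetch(url);
    const html = await response.text();
    lines.push(`${rank},${extractThirtyDayAverage(html)}`);
  }
  return lines.join("\n");
}
```

A call such as collectAverages("T2ZmaWNlIFByb2R1Y3Rz", [50, 100, 200]) would resolve to text that can be written to a file. Pages without the expected string produce null in the value column rather than an exception.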

Add selected row from a table to another table with Jquery and MVC3

I am using MVC 3, EF Model First on my project.
In my View I have 4 tables that look like this:
<div class="questionsForSubjectType">
<table>
<thead>
<tr>
<th>
Title
</th>
</tr>
</thead>
<tbody>
<tr>
<td>
test
</td>
</tr>
</tbody>
</table>
</div>
Users should be able to select rows and add them to another table. Let's say that table is the following:
<table id="CustomPickedQuestions">
<!-- Questions that users chose from the other tables -->
</table>
What I am looking for is this: when a user clicks on a row, the row is removed from its table and added to CustomPickedQuestions. Once the row is in that table, the user should also be able to remove it from CustomPickedQuestions, at which point the row goes back to the table it came from.
How can I accomplish this with client-side jQuery scripting?
You've got far too much irrelevant complexity in your code (for the specific question you ask). The title is good, but not the code. Rather than posting your complex project code, create the simplest possible reproduction of the problem using the least amount of code/methods/properties (with common names, that is, ProductID, ProductName, etc.). Part 3 of my tutorial shows how to do this. See http://www.asp.net/mvc/tutorials/javascript/working-with-the-dropdownlist-box-and-jquery/adding-a-new-category-to-the-dropdownlist-using-jquery-ui
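For the row-moving behaviour itself, here is a minimal sketch using plain DOM calls rather than jQuery (an assumption made to keep it dependency-free; the same idea carries over to jQuery's on and appendTo). It relies on appendChild moving an existing node instead of copying it. The names originOf, toggleRow and wireTables are hypothetical; the id and class come from the markup in the question.

```javascript
// Remember which parent (<tbody>) each picked row came from.
const originOf = new Map();

// Move a row into the picked table, or back to where it came from.
function toggleRow(row, pickedTable) {
  if (row.parentNode === pickedTable) {
    originOf.get(row).appendChild(row); // appendChild moves the node back
    originOf.delete(row);
  } else {
    originOf.set(row, row.parentNode);
    pickedTable.appendChild(row);
  }
}

// Browser wiring: one delegated click handler for all tables.
function wireTables(doc) {
  const pickedTable = doc.getElementById("CustomPickedQuestions");
  doc.addEventListener("click", (event) => {
    const row = event.target.closest("tr");
    if (!row || row.closest("thead")) return; // ignore header rows
    const inSource = row.closest(".questionsForSubjectType");
    if (inSource || row.parentNode === pickedTable) {
      toggleRow(row, pickedTable);
    }
  });
}
```

Calling wireTables(document) once after the page loads would be enough, since the delegated handler also covers rows that have already been moved.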
