How to do page numbering in header/footer htmls with wkhtmltopdf? - javascript

I'm developing an electronic invoicing system, and one of our features is generating PDFs of the invoices and mailing them. We have multiple invoice templates and will create more later, so we decided to use HTML templates: generate an HTML document, then convert it to PDF. But we're facing a problem with wkhtmltopdf: as far as I know (I've been Googling for days to find a solution), we cannot both use HTML for the header/footer and show page numbers in them.
In a bug report (or similar) ( http://code.google.com/p/wkhtmltopdf/issues/detail?id=140 ) I read that this combination is achievable with JavaScript. But no further information on how to do it can be found on that page, or elsewhere.
Of course, JavaScript is not a requirement: if some CSS magic works with wkhtmltopdf, that would be just as awesome, as would any other hackish solution.
Thanks!

Actually it's much simpler than with the code snippet. You can add the following argument on the command line: --footer-center [page]/[topage].
Like richard mentioned, further variables are in the Footers and Headers section of the documentation.
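If the conversion is launched from a Node.js script, the same option can be passed as an argument array. A minimal sketch, assuming wkhtmltopdf is installed and on the PATH; the helper name footerArgs and the file names are hypothetical:

```javascript
// Hypothetical helper: builds the argument list for a wkhtmltopdf call
// that prints "current/total" in the centre of the footer.
function footerArgs(input, output) {
  // [page] and [topage] are expanded by wkhtmltopdf itself, not by the shell.
  return ['--footer-center', '[page]/[topage]', input, output];
}

// Usage (assumes the wkhtmltopdf binary is installed and on the PATH):
// require('child_process').execFileSync('wkhtmltopdf', footerArgs('invoice.html', 'invoice.pdf'));
console.log(footerArgs('invoice.html', 'invoice.pdf').join(' '));
```

Passing the arguments as an array avoids shell quoting problems with the square brackets.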

Among a few other parameters, the page number and total page number are passed to the footer HTML as query params, as outlined in the official docs:
... the [page number] arguments are sent to the header/footer html documents in GET fashion.
Source: http://wkhtmltopdf.org/usage/wkhtmltopdf.txt
So the solution is to retrieve these parameters using a bit of JS and render them into the HTML template. Here is a complete working example of a footer HTML:
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<script>
function substitutePdfVariables() {
function getParameterByName(name) {
var match = RegExp('[?&]' + name + '=([^&]*)').exec(window.location.search);
return match && decodeURIComponent(match[1].replace(/\+/g, ' '));
}
function substitute(name) {
var value = getParameterByName(name);
var elements = document.getElementsByClassName(name);
for (var i = 0; elements && i < elements.length; i++) {
elements[i].textContent = value;
}
}
['frompage', 'topage', 'page', 'webpage', 'section', 'subsection', 'subsubsection']
.forEach(function(param) {
substitute(param);
});
}
</script>
</head>
<body onload="substitutePdfVariables()">
<p>Page <span class="page"></span> of <span class="topage"></span></p>
</body>
</html>
substitutePdfVariables() is called in body onload. We then get each supported variable from the query string and replace the content in all elements with a matching class name.
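To see what the script works with, the lookup can be tried outside wkhtmltopdf. A sketch, assuming a query string of the kind wkhtmltopdf appends to the footer URL; the sample values are made up and getParameter is a hypothetical variant of getParameterByName that takes the query string as an argument:

```javascript
// Same lookup as getParameterByName above, but taking the query string
// as an argument so it can be tried without a browser.
function getParameter(search, name) {
  var match = RegExp('[?&]' + name + '=([^&]*)').exec(search);
  return match && decodeURIComponent(match[1].replace(/\+/g, ' '));
}

var sample = '?page=3&topage=12&section=Line%20items'; // made-up sample values
console.log(getParameter(sample, 'page'));    // "3"
console.log(getParameter(sample, 'topage'));  // "12"
console.log(getParameter(sample, 'section')); // "Line items"
console.log(getParameter(sample, 'webpage')); // null (parameter absent)
```

Because the pattern requires ? or & directly before the name, looking up page does not accidentally match topage.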

To show the page number and total pages you can use this javascript snippet in your footer or header code:
var pdfInfo = {};
var x = document.location.search.substring(1).split('&');
for (var i in x) { var z = x[i].split('=',2); pdfInfo[z[0]] = unescape(z[1]); }
function getPdfInfo() {
var page = pdfInfo.page || 1;
var pageCount = pdfInfo.topage || 1;
document.getElementById('pdfkit_page_current').textContent = page;
document.getElementById('pdfkit_page_count').textContent = pageCount;
}
Then call getPdfInfo from the page's onload handler.
pdfkit_page_current and pdfkit_page_count are the ids of the two elements that show the numbers.
Snippet taken from here

From the wkhtmltopdf documentation (http://madalgo.au.dk/~jakobt/wkhtmltoxdoc/wkhtmltopdf-0.9.9-doc.html) under the heading "Footers and Headers" there is a code snippet to achieve page numbering:
<html><head><script>
function subst() {
var vars={};
var x=document.location.search.substring(1).split('&');
for(var i in x) {var z=x[i].split('=',2);vars[z[0]] = unescape(z[1]);}
var x=['frompage','topage','page','webpage','section','subsection','subsubsection'];
for(var i in x) {
var y = document.getElementsByClassName(x[i]);
for(var j=0; j<y.length; ++j) y[j].textContent = vars[x[i]];
}
}
</script></head><body style="border:0; margin: 0;" onload="subst()">
<table style="border-bottom: 1px solid black; width: 100%">
<tr>
<td class="section"></td>
<td style="text-align:right">
Page <span class="page"></span> of <span class="topage"></span>
</td>
</tr>
</table>
</body></html>
There are also more available variables which can be substituted other than page numbers for use in Headers/Footers.

A safe approach, even if you are using XHTML (for example, with Thymeleaf). The only difference from the other solutions is the use of the /*<![CDATA[*/ and /*]]>*/ markers.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8"/>
<script>
/*<![CDATA[*/
function subst() {
var vars = {};
var query_strings_from_url = document.location.search.substring(1).split('&');
for (var query_string in query_strings_from_url) {
if (query_strings_from_url.hasOwnProperty(query_string)) {
var temp_var = query_strings_from_url[query_string].split('=', 2);
vars[temp_var[0]] = decodeURI(temp_var[1]);
}
}
var css_selector_classes = ['page', 'topage'];
for (var css_class in css_selector_classes) {
if (css_selector_classes.hasOwnProperty(css_class)) {
var element = document.getElementsByClassName(css_selector_classes[css_class]);
for (var j = 0; j < element.length; ++j) {
element[j].textContent = vars[css_selector_classes[css_class]];
}
}
}
}
/*]]>*/
</script>
</head>
<body onload="subst()">
<div class="page-counter">Page <span class="page"></span> of <span class="topage"></span></div>
</body>
</html>
Last note: if using thymeleaf, replace <script> with <script th:inline="javascript">.

My example shows how to hide some text on particular pages; in this case the text is shown from page 2 onwards.
<span id='pageNumber'>{#pageNum}</span>
<span id='pageNumber2' style="float:right; font-size: 10pt; font-family: 'Myriad ProM', MyriadPro;"><strong>${siniestro.numeroReclamo}</strong></span>
<script>
var elem = document.getElementById('pageNumber');
document.getElementById("pageNumber").style.display = "none";
if (parseInt(elem.innerHTML) <= 1) {
elem.style.display = 'none';
document.getElementById("pageNumber2").style.display = "none";
}
</script>
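The snippet above relies on the template engine filling in the page number. With a wkhtmltopdf HTML header, the same decision could instead be taken from the page query parameter. A minimal sketch, assuming that setup; the function name is hypothetical:

```javascript
// Hypothetical helper: returns true when the text should be shown,
// i.e. on every page after the first.
function showFromSecondPage(search) {
  var match = /[?&]page=(\d+)/.exec(search);
  var page = match ? parseInt(match[1], 10) : 1; // treat a missing value as page 1
  return page > 1;
}

// In the header file this would drive the element's visibility, e.g.
// elem.style.display = showFromSecondPage(window.location.search) ? '' : 'none';
console.log(showFromSecondPage('?page=1&topage=4')); // false
console.log(showFromSecondPage('?page=2&topage=4')); // true
```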

Right From the wkhtmltopdf Docs
Updated for 0.12.6.
Footers And Headers:
Headers and footers can be added to the document by the --header-* and --footer-* arguments respectively. In header and footer text strings supplied to e.g. --header-left, the following variables will be substituted.
[page] Replaced by the number of the pages currently being printed
[frompage] Replaced by the number of the first page to be printed
[topage] Replaced by the number of the last page to be printed
[webpage] Replaced by the URL of the page being printed
[section] Replaced by the name of the current section
[subsection] Replaced by the name of the current subsection
[date] Replaced by the current date in system local format
[isodate] Replaced by the current date in ISO 8601 extended format
[time] Replaced by the current time in system local format
[title] Replaced by the title of the current page object
[doctitle] Replaced by the title of the output document
[sitepage] Replaced by the number of the page in the current site being converted
[sitepages] Replaced by the number of pages in the current site being converted
As an example, specifying --header-right "Page [page] of [topage]" will result in the text "Page x of y", where x is the number of the current page and y is the number of the last page, appearing in the upper right corner of the document.
Headers and footers can also be supplied with HTML documents. As an
example one could specify --header-html header.html, and use the
following content in header.html:
<!DOCTYPE html>
<html>
<head><script>
function subst() {
var vars = {};
var query_strings_from_url = document.location.search.substring(1).split('&');
for (var query_string in query_strings_from_url) {
if (query_strings_from_url.hasOwnProperty(query_string)) {
var temp_var = query_strings_from_url[query_string].split('=', 2);
vars[temp_var[0]] = decodeURI(temp_var[1]);
}
}
var css_selector_classes = ['page', 'frompage', 'topage', 'webpage', 'section', 'subsection', 'date', 'isodate', 'time', 'title', 'doctitle', 'sitepage', 'sitepages'];
for (var css_class in css_selector_classes) {
if (css_selector_classes.hasOwnProperty(css_class)) {
var element = document.getElementsByClassName(css_selector_classes[css_class]);
for (var j = 0; j < element.length; ++j) {
element[j].textContent = vars[css_selector_classes[css_class]];
}
}
}
}
</script></head>
<body style="border:0; margin: 0;" onload="subst()">
<table style="border-bottom: 1px solid black; width: 100%">
<tr>
<td class="section"></td>
<td style="text-align:right">
Page <span class="page"></span> of <span class="topage"></span>
</td>
</tr>
</table>
</body>
</html>
ProTip
If you are not using certain information, such as webpage, section, subsection, or subsubsection, you should remove those variables. We are generating fairly large PDFs and were running into a segmentation fault at ~1,000 pages.
After a thorough investigation, the fix came down to removing those unused variables. Now we can generate 7,000+ page PDFs without seeing the segmentation fault.

I did not understand the command-line approach, and I finally found a solution that puts this information directly in the controller, without any JS or command line.
In my controller, where I call format.pdf, I just added the footer option:
format.pdf do
render :pdf => "show",
page_size: 'A4',
layouts: "pdf.html",
encoding: "UTF-8",
footer: {
right: "[page]/[topage]",
center: "Qmaker",
},
margin: { top:15,
bottom: 15,
left: 10,
right: 10}
end

The way it SHOULD be done (that is, if wkhtmltopdf supported it) would be using proper CSS Paged Media: http://www.w3.org/TR/css3-gcpm/
I'm looking into what it will take now.

Related

Return Sheets data based on URL Parameter in Google Apps Script / Web App

This is a piece of a larger project. Essentially, I'm going to have a link with a parameter ("formid" in this case) that I need to use to retrieve the correct row number from a Google Sheets table. The code below works the way I want it to, except that the parameter is not used and the retrieved row is hard-coded. I'd like to change this so the getBody row corresponds to the formid number (i.e., if formid=4 then the 4th row would be displayed). Column positions can be hard-coded; I only need the one variable to be used.
Index.html:
<!DOCTYPE html>
<html>
<head>
</head>
<style>
</style>
<body>
<form>
<h1><center>Sheet</center></h1>
</form>
<script>
google.script.url.getLocation(function(location) {
document.getElementById("formid").value = location.parameters.formid[0];
});
</script>
<div>
<table>
<thead>
<? const headers = getHeaders();
for (let i = 0; i < headers.length; i++) { ?>
<tr>
<? for (let j = 0; j < headers[0].length; j++) { ?>
<th><?= headers[i][j] ?></th>
<? } ?>
<? } ?>
</thead>
<tbody>
<? const body = getBody();
for (let k = 0; k < body.length; k++) { ?>
<tr>
<? for (let l = 0; l < body[0].length; l++) { ?>
<td><?= body[k][l] ?></td>
<? } ?>
<? } ?>
</tbody>
</table>
</div>
</body>
</html>
Code.gs:
function doGet() {
return HtmlService
.createTemplateFromFile('Index')
.evaluate();
}
function getHeaders() {
var url = "https://docs.google.com/spreadsheets/d/1NnG5lEKowlU6i2ZzkyCD1bjFtFGcgaODKZxvG179XfM/";
const sheet = SpreadsheetApp.openByUrl(url).getSheetByName("Sheet1");
return sheet.getRange(1, 1, 1, sheet.getLastColumn()).getDisplayValues();
}
function getBody() {
var url = "https://docs.google.com/spreadsheets/d/1NnG5lEKowlU6i2ZzkyCD1bjFtFGcgaODKZxvG179XfM/";
const sheet = SpreadsheetApp.openByUrl(url).getSheetByName("Sheet1");
const firstRow = 8;
const numRows = 1;
return sheet.getRange(firstRow, 1, numRows, sheet.getLastColumn()).getDisplayValues();
}
I've reviewed many related questions but either the solutions didn't seem to work or it wasn't clear what the full solution was. Perhaps I missed something.
I've tried inserting "formid" into the "return sheet.getRange()" but I keep getting an error that formid isn't an int.
I've made several attempts at this and the code above represents the closest and simplest script that has gotten me most of the way there.
I believe your goal is as follows.
You want to show only a row from the Spreadsheet by giving the row number using the query parameter like formid=5.
Modification points:
In your script, you are retrieving the query parameter on the HTML side. When a template is used, google.script.url.getLocation runs only after the HTML is loaded, which is too late for the server-side template code. So, in this answer, I would like to propose retrieving the query parameter on the Google Apps Script side.
At present, running loops inside the HTML template is slow. Ref (Author: me)
When these points are reflected in your script, how about the following modification?
HTML side:
Your HTML has no element with the id formid, so document.getElementById("formid") finds nothing. In this modification, the HTML template is used without the script.
<!DOCTYPE html>
<html>
<head>
</head>
<style>
</style>
<body>
<form>
<h1>
<center>Sheet</center>
</h1>
</form>
<div>
<table>
<?!= table ?>
</table>
</div>
</body>
</html>
Google Apps Script side:
In this modification, only doGet function is used as follows.
function doGet(e) {
const url = "https://docs.google.com/spreadsheets/d/1NnG5lEKowlU6i2ZzkyCD1bjFtFGcgaODKZxvG179XfM/";
const { formid } = e.parameter;
const sheet = SpreadsheetApp.openByUrl(url).getSheetByName("Sheet1");
const [header, ...values] = sheet.getDataRange().getValues();
const h = "<thead><tr>" + header.map(e => `<th>${e}</th>`).join("") + "</tr></thead>";
const b = values[formid - 1] ? "<tbody><tr>" + values[formid - 1].map(e => `<td>${e}</td>`).join("") + "</tr></tbody>" : "";
const html = HtmlService.createTemplateFromFile('Index');
html.table = h + b;
return html.evaluate();
}
In this modification, for example, when a Web Apps URL like https://script.google.com/macros/s/###/dev?formid=5 including the query parameter is used, formid=5 is retrieved in the doGet function, and only the row of formid=5 is shown.
Note:
When you modify the Google Apps Script of a Web App, please redeploy it as a new version. Only then is the modified script reflected in the Web App. Please be careful about this.
You can see the detail of this in my report "Redeploying Web Apps without Changing URL of Web Apps for new IDE (Author: me)".
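Query parameters always arrive as strings, which is one likely reason for the "isn't an int" error when formid is passed straight to getRange. A sketch of the conversion in plain JavaScript, assuming values is the 2D array returned by getValues() with the header row removed; the helper name rowForFormId is hypothetical:

```javascript
// Hypothetical helper: picks the data row for a formid taken from the URL.
// Returns null when formid is not a positive integer or is out of range.
function rowForFormId(values, formid) {
  var n = Number(formid); // e.parameter values are strings
  if (!Number.isInteger(n) || n < 1 || n > values.length) {
    return null;
  }
  return values[n - 1];
}

var values = [['a1', 'b1'], ['a2', 'b2'], ['a3', 'b3']]; // stand-in for sheet data
console.log(rowForFormId(values, '2'));   // [ 'a2', 'b2' ]
console.log(rowForFormId(values, 'abc')); // null
console.log(rowForFormId(values, '9'));   // null
```

Returning null for bad input lets doGet render an empty body instead of throwing when the link is malformed.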

Grease/tamper monkey script to look for specific HTML in page and alert

I would like a Greasemonkey/Tampermonkey script that, when I visit a page, looks for the following HTML on the page and shows an alert if it is present.
<p>
<script
type='text/javascript'
src='https://site_im_visiting.com/?a(6 hex characters)=(numbers)'>
</script>
</p>
Additionally, I would like to look inside an array (of about 4k sites), to see if the site is in the array.
I figured it out on my own. Instead of looking for the HTML, I look at the src URL of each script element:
var scripts = document.getElementsByTagName("script")
var expression = /.*\/\?a[a-f0-9]{6}\=\d+?/gi;
const domains =["something.com","google.com"]
for (var i = 0; i < scripts.length; ++i) {
var regex = new RegExp(expression)
var t = scripts[i].src
if (t.match(regex)) {
const url = new URL(t)
if(domains.includes(url.hostname)){
console.log(url.hostname)
}else{
alert(url.hostname)
}
}
}
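The same check can be pulled into a small function, which makes it easier to try against sample URLs before installing the userscript. A sketch with hypothetical names; the pattern here is applied to the query string only, which is an assumption about what the injected src looks like:

```javascript
// Hypothetical helper: classifies a script src.
// Returns null when the src does not look like "?aXXXXXX=digits",
// otherwise { hostname, known } where known says whether the host is listed.
function classifyScriptSrc(src, domains) {
  var url;
  try {
    url = new URL(src);
  } catch (e) {
    return null; // empty or relative src (inline scripts have src === "")
  }
  if (!/^\?a[a-f0-9]{6}=\d+$/i.test(url.search)) {
    return null;
  }
  return { hostname: url.hostname, known: domains.includes(url.hostname) };
}

var domains = ['something.com', 'google.com'];
console.log(classifyScriptSrc('https://something.com/?a1b2c3d=42', domains));
console.log(classifyScriptSrc('https://example.org/?aabcdef=7', domains));
console.log(classifyScriptSrc('https://example.org/app.js', domains));
```

In the userscript, the loop over document.getElementsByTagName("script") would call this for each src and alert when the result is non-null and known is false. For a list of about 4k sites, a Set with has() may be a better fit than an array with includes().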

Can't append table to div

I created a table using the d3.js library, but when I try to append the table to a div, it gives an error.
Code:
<html>
<head>
<script src="../../d3.min.js"></script>
</head>
<body>
<div id="main">
Hi
</div>
<script>
const table = d3.create("table");
const tbody = table.append("tbody");
var i,j,row;
for(i=0;i<5;i++){
row =tbody.append("tr");
for(j=0;j<3;j++){
row.append("td").text(`${i},${j}`);
}
}
console.log(typeof(table));
console.log(table);
node =table.node();
console.log(typeof(node));
console.log(node);
d3.select("#main").append(node);
</script>
</body>
</html>
but I get an error, although my code is similar to what is in this tutorial:
A tutorial on d3js
Observable tutorials are meant to create Observable notebooks. There are several small differences between Observable and a regular D3 running in a browser.
That being said, the only problem in your approach is that append requires either a string with the tag name or the element. If you have a string, just use it as append("foo"). However, if you have the element to be appended (in your case, table.node()), you have to return it from a function.
So, instead of:
d3.select("#main").append(node);
It has to be:
d3.select("#main").append(() => node);
Here is your code with that change only:
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/5.7.0/d3.min.js"></script>
<div id="main">
Hi
</div>
<script>
const table = d3.create("table");
const tbody = table.append("tbody");
var i, j, row;
for (i = 0; i < 5; i++) {
row = tbody.append("tr");
for (j = 0; j < 3; j++) {
row.append("td").text(`${i},${j}`);
}
}
node = table.node();
d3.select("#main").append(() => node);
</script>
Finally, if you are writing regular scripts for a browser, just ditch this d3.create() followed by append(() => selection.node()). Use a simple tag string instead.

HTML Template displaying an extra template item when added via Javascript

I have an HTML template that I populate using Javascript when the on-load event is called and for some reason, the base template itself displays despite not being part of the array I loop through and I can't figure out why. From my understanding templates only show when appended to the document body, but the base template shows in addition to the appended templates (minus the last one since the base template hogs the first spot) and I can verify this by changing data in the base template.
I cannot use a jQuery solution as this is going to be read into an iOS web view through a function that only accepts pure Javascript.
My code is as follows, if anyone can explain why the initial template is showing itself I would greatly appreciate it. I've scoured for a solution and haven't found anything and think maybe I'm misunderstanding how the templating works.
<script>
var titles = ["Item1", "Item2", "Item3", "Item4", "Item5"];
function addGalleryItem() {
var template = document.querySelector("#galleryTemplate");
var label = template.content.querySelector(".caption");
var node;
for (var i = 0; i < titles.length; i++) {
node = document.importNode(template.content, true);
label.innerHTML = titles[i];
document.body.appendChild(node);
}
}
</script>
<html>
<body onload="addGalleryItem()">
<template id="galleryTemplate">
<div class="galleryItem">
<img class="galleryImage" src="img.png" alt="Unknown-1" width="275" height="183">
<div class="caption">
<label></label>
</div>
</div>
</template>
</body>
</html>
I think the only thing wrong with your code is that you assign innerHTML to the label variable after creating node, so the right code is:
for (var i = 0; i < titles.length; i++) {
label.innerHTML = titles[i];
node = document.importNode(template.content, true);
document.body.appendChild(node);
}
The template itself is not rendering; your loop runs 5 times because titles has 5 items.
Is it as simple as adding:
<!DOCTYPE html>
I also think you are querying your label in the wrong place:
for (var i = 0; i < titles.length; i++) {
node = document.importNode(template.content, true);
label = node.querySelector(".caption"); // should query here ?
label.innerHTML = titles[i];
document.body.appendChild(node);
}

Can I include external HTML files as headers/footers when rendering to PDF with PhantomJS?

I'm trying to set up a way to render HTML files to PDF. The HTML files are dynamically generated, and have separate HTML files for their headers and footers, which need to be attached to each page.
Am I able to get PhantomJS to pick up these separate files and display them in the header and footer sections? I can get headers and footers when I have the HTML for them in the JavaScript files I pass to PhantomJS, as in the example code I've included, but I'm not familiar enough with HTML, CSS, JavaScript (and Node?) to know if there is a clear way to do what I want.
This is what I have at the moment:
var page = require('webpage').create();
//set the page size and add headers & footers
page.paperSize = {
format: 'A4',
margin: '.5cm',
header: {
height: "1cm",
contents: phantom.callback(function(pageNum, numPages) {
return "<h1 style='font-size:12.0; font-family:Arial,Helvetica,FreeSans,sans-serif'>Header <span style='float:right'>" + pageNum + " / " + numPages + "</span></h1>";
})
},
footer: {
height: "1cm",
contents: phantom.callback(function(pageNum, numPages) {
return "<h1 style='font-size:12.0; font-family:Arial,Helvetica,FreeSans,sans-serif'>Footer <span style='float:right'>" + pageNum + " / " + numPages + "</span></h1>";
})
}
}
//open the page
page.open('tools.html', function() {
page.render('woki2.pdf');
phantom.exit();
});
It works to add the headers and footers but requires inline styling (which I believe I can get around with these methods) but I would prefer to be able to pick up external HTML files, so that a) different headers and footers can be used depending on the type of file being converted; and b) so that the HTML files can be edited easily without changing the JS files.
This is an example of the header HTML file:
<!DOCTYPE html>
<html>
<head>
<script>
function subst() {
var vars = {};
var x = window.location.search.substring(1).split('&');
for (var i in x) {
var z = x[i].split('=', 2);
vars[z[0]] = unescape(z[1]);
}
var x = ['frompage', 'topage', 'page', 'webpage', 'section', 'subsection', 'subsubsection', 'date', 'time'];
for (var i in x) {
var y = document.getElementsByClassName(x[i]);
for (var j = 0; j < y.length; ++j) y[j].textContent = vars[x[i]];
}
}
</script>
</head>
<body style="border:0; margin: 0;" onload="subst()">
<table style="width: 100%">
<tr>
<td>Generated by *person* on <span class="date"></span> <span class="time"></span></td>
<td style="text-align:right">
Page <span class="page"></span> of <span class="topage"></span>
</td>
</tr>
</table>
</body>
</html>
While I am aware I can achieve a similar thing to what is represented there with the first example, I'd prefer to have the files separated if possible.
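There is a simple way: read the header.html file into a variable and return it from the header callback.
var fs = require('fs');
var headerContent = fs.read('header.html'); // load the header markup once
page.paperSize = {
format: 'A4',
header: {height: "1cm", contents: phantom.callback(function() { return headerContent; })}
};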
