PhantomJS cutting off content in header/footer with PDFs

I'm using PhantomJS to dynamically generate a PDF. However, it always seems to cut off some content between the footer and the header.
Relevant files:
HTML content: http://bit.ly/146Ljdp
rasterize.js file: http://bit.ly/12iSC7y
Example PDF file: http://bit.ly/10Aa309
As you can see in the PDF file, the content is cut off between page 5 and 6.
I've looked into the existing Qt and PhantomJS bugs, and I'm not sure whether I'm doing something wrong or whether it's simply a bug.
AFAIK the ideal option is to use page-break-inside:avoid; however, Qt doesn't seem to support that yet.
The content inside the PDF/document uses tables, and apparently there have been issues with that in the past: 927, 989, 1038, 880. I tried removing all of the table elements (table, tr, td, etc.) and replacing them with divs. It looked exactly the same in the browser and in the PDF file, but it was still being cut off.
I tried a JavaScript hack that checks, for each element, whether its top and bottom positions fall on different pages and, if so, adds a page-break-before. However, I couldn't reliably get the individual page size, which I need to compute each element's position relative to the header on its page. $('element').offset().top returns the position from the top of the entire document, not from the top of the current page.
Any ideas on what I'm doing wrong?
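The page-straddling check described above can be sketched as a few pure functions that convert a document-relative offset into a page index and a page-relative offset. This is a minimal sketch under assumptions, not a confirmed fix: pageHeightPx (the usable height of one page in CSS pixels, excluding header and footer) is an assumed input that would have to be derived from your own paper size settings, and all function names are hypothetical.

```javascript
// Hypothetical helpers; pageHeightPx is an assumed value you must supply.

// Zero-based index of the page that a document-relative y offset falls on.
function pageIndex(offsetPx, pageHeightPx) {
  return Math.floor(offsetPx / pageHeightPx);
}

// Position of the offset relative to the top of its own page.
function offsetWithinPage(offsetPx, pageHeightPx) {
  return offsetPx - pageIndex(offsetPx, pageHeightPx) * pageHeightPx;
}

// True when an element occupying [topPx, bottomPx) spans a page boundary.
function straddlesPageBreak(topPx, bottomPx, pageHeightPx) {
  var lastPx = Math.max(topPx, bottomPx - 1); // last pixel row of the element
  return pageIndex(topPx, pageHeightPx) !== pageIndex(lastPx, pageHeightPx);
}
```

With jQuery, topPx would be $(el).offset().top and bottomPx would be topPx + $(el).outerHeight(). Inserting a page break shifts everything after it, so elements would need to be processed in document order with their offsets re-read after each change.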

This was tricky, but I solved it with simple logic.
Make the element display as a block element and it works:
display:block;
page-break-inside:avoid;
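If editing the stylesheet is inconvenient, the same two declarations can be applied from script, for example inside PhantomJS's page.evaluate before rendering. This is a minimal sketch: the function name is hypothetical, and whether the renderer honours page-break-inside is not guaranteed, as the question itself notes.

```javascript
// Apply the answer's two declarations to every element in a list.
// `elements` is any array-like of DOM elements (e.g. a NodeList).
function avoidBreakInside(elements) {
  for (var i = 0; i < elements.length; i++) {
    elements[i].style.display = 'block';
    elements[i].style.pageBreakInside = 'avoid';
  }
  return elements.length; // number of elements touched
}
```

A call might look like avoidBreakInside(document.querySelectorAll('.keep-together')), where the .keep-together class is a hypothetical marker for blocks that should not be split.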

Related

How can I use anchor links pointing within an iframe?

I have a html document loaded in an iframe on a website.
The document has a table of contents and clicking on any of the links jumps to the appropriate part of the document.
Navigation is supposed to work from a sidebar that is specified by a separate XML.
Adding a link to said XML displays the HTML in the iframe:
href="source_folder/file.html"
Issue is, when I try to add a link to a specific section, like href="source_folder/file.html#_Toc0123" it just jumps back to the top of the HTML.
In the usual use-case, the sections are all separate HTML files, and get linked in the corresponding XML. Issue being I don't want to go through the hassle of separating multiple large files into individual HTMLs.
Any idea on what I'm missing? Or is this simply not possible?
(I didn't build the original site, but if there is an attribute that governs it, feel free to let me know where to look for it)
Thanks!
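One possible workaround, offered as a sketch rather than a confirmed fix since the sidebar code is not shown: load the file without the fragment, then scroll to the anchor once the iframe has loaded. This assumes the framed document is same-origin (otherwise contentDocument is not accessible) and that the anchors exist either as element ids or as named anchors. All function names are hypothetical.

```javascript
// Split "file.html#_Toc0123" into its path and fragment parts.
function splitHref(href) {
  var i = href.indexOf('#');
  return i === -1
    ? { path: href, fragment: '' }
    : { path: href.slice(0, i), fragment: href.slice(i + 1) };
}

// Scroll a loaded, same-origin document to the anchor, if it exists.
// Looks for an element id first, then for a legacy <a name="..."> anchor.
function scrollToAnchor(doc, fragment) {
  if (!fragment) return false;
  var target = doc.getElementById(fragment) ||
               doc.getElementsByName(fragment)[0];
  if (!target) return false;
  target.scrollIntoView();
  return true;
}

// Load href into the iframe and jump to its fragment after loading.
function openInFrame(iframe, href) {
  var parts = splitHref(href);
  iframe.onload = function () {
    scrollToAnchor(iframe.contentDocument, parts.fragment);
  };
  iframe.src = parts.path;
}
```

If scrollToAnchor returns false for a link that works inside the standalone document, the anchor is probably defined in a form this sketch does not look for.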

CSS & Javascript, Need Hide Auto-Generated HTML By Default

I am writing a free online e-book which needs a few minor formatting tweaks:
http://rperl.org/learning_rperl.html
The "Full Table Of Contents" at the very top of the page starts out by being visible for a few seconds, then finally collapses itself to be hidden. What we need is for it to start as hidden, and not be visible at all for the several seconds while the page loads. You can see that I have already tried to solve this issue by setting "var index_hidden=1;" at the following link, otherwise the table of contents would never hide itself at all:
https://github.com/wbraswell/rperl/blob/gh-pages/javascripts/metacpan_rperl.js#L832-L833
It probably shouldn't matter, but I'm using some custom Perl scripts to generate this file from Perl POD source; I can give more info if needed.

The described behavior does not appear for me (OSX + Firefox), but here's what you might do:
Hide the element by default using CSS. Add this to your head element (extend it with stronger hiding CSS if needed).
<style>.wait-for-js { display: none; }</style>
And hide your element by adding the class
<div id="index-container" class="hide-index wait-for-js">
Last but not least, to make this trick work, remove the class as soon as your JS has loaded, which also means that the other logic has loaded and it is safe to show the table of contents. Be sure to load this JavaScript last.
<script>
document.getElementById('index-container').className = 'hide-index';
</script>
Or if you're using jQuery
<script>$('.wait-for-js').removeClass('wait-for-js');</script>
Welcome to SO!

Printing iframe shows only a small viewport

I'm using an iframe to prevent the CSS of user-provided content (converted .doc files) from bleeding out of an AJAX panel in my single-page application. It's the last item displayed, usually about 5 pages following 1 page.
It works visually on screen, where I have a parent-relative height and width to leverage, but I realized I couldn't actually get the iframe to print its entire contents with @media print declarations.
I've seen about 5 related questions answered 'not possible', but I'm more flexible than those askers. I'm willing to:
- use JavaScript, if it can format the print view without affecting the screen view. Can you point me to an example? I currently insert the iframe's HTML directly, so I also have a lot of control over its content.
- print the parent and iframe contents separately, if they'll queue seamlessly in major browsers' print managers, especially if there's a way to handle browser print events in this case.
- as a last resort, print ONLY the iframe content, using the following. I would still like to handle print events if that's possible cross-browser.
javascript:
window.frames["OverlayDis"].focus();
window.frames["OverlayDis"].print();
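The last-resort approach above can be wrapped so that print events are observed as well. This is a minimal sketch, assuming the frame is same-origin and that the browser fires beforeprint/afterprint on the frame's window; support for these events varies between browsers, so the callbacks should be treated as best-effort. The function name and its callback parameters are hypothetical.

```javascript
// Print only the named frame, optionally observing print events.
// onBefore/onAfter are optional callbacks; pass null to skip either.
function printFrame(win, frameName, onBefore, onAfter) {
  var frame = win.frames[frameName];
  if (!frame) return false;          // no frame with that name
  frame.onbeforeprint = onBefore || null;
  frame.onafterprint = onAfter || null;
  frame.focus();                     // some browsers print the focused window
  frame.print();
  return true;
}
```

A call might look like printFrame(window, "OverlayDis", null, function () { /* restore screen state */ }).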

Extracting the header and the footer of html page

How can I extract the header and the footer from this page to insert them in another page?
I'm a bit confused because when I copy and paste the header div, it never has the same structure and graphics in the new page. Am I missing something here?

Typically when you copy/paste the div you're getting the HTML, but not the CSS styling (unless the styling is in-line).
There is a Chrome extension called "CSS + HTML" that allows you to, in the developer console, generate a version of the div that has all CSS turned into in-line CSS, so that you can copy/paste a pretty accurate version.
(Caveats: I've had some issues with the extension, so I don't enable it except when I need it, and the HTML produced is a) awful, because it has lots of unnecessary inline CSS, and b) not always a precise match. But it's pretty good.)

Yes, you are missing something: the CSS, images, and links. They use relative links, so you would need to be sure to replace those links.
The images are linked relatively, so unless you copy them locally you will not have access to them.
You would also need the style sheets, as they are linked relatively in the head.
Note that the links in some cases are to .php files. Unless you know the PHP running in the background, you are going to lose that functionality too.

Modifying HTML while document is loading

Assume you have 75 divs that have a data-id="{some number}" attribute. The overall page size is unfortunately big, very big.
There are many repetitive HTML snippets in my HTML document, like image tags or links. The only changing portion of these images/links is the id.
The HTML document is quite long, and these snippets contribute to the overall size of the document.
I can run JavaScript when the DOM is ready, but the user experience will be:
- wait while the page loads and start seeing nodes, etc.,
- page loads,
- extra snippets show.
I can hide the top container DIV until the page loads, but
- I'm worried that the Google search bot could realize the div is hidden and skip the content (or does it?)
- the users won't be able to see the content while the page is loading.
The ideal would be to load the page as HTML without much extra markup for the Google search bot, and add the extra elements with JavaScript while it's loading.
Any tricks that come to mind to accomplish this?
Thank you.
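For reference, the client-side expansion being considered could look like the sketch below, which generates the repeated image/link markup from each div's data-id. The URL patterns and function names are hypothetical placeholders, since the real snippets are not shown.

```javascript
// Build the repeated markup for one id (URL patterns are hypothetical).
function snippetFor(id) {
  var safe = encodeURIComponent(id);
  return '<a href="/item/' + safe + '">' +
         '<img src="/thumbs/' + safe + '.png" alt=""></a>';
}

// Append the generated snippet to every element that carries data-id.
function expandSnippets(root) {
  var nodes = root.querySelectorAll('[data-id]');
  for (var i = 0; i < nodes.length; i++) {
    var id = nodes[i].getAttribute('data-id');
    nodes[i].insertAdjacentHTML('beforeend', snippetFor(id));
  }
  return nodes.length; // number of elements expanded
}
```

Calling expandSnippets(document) from an inline script placed just before the closing body tag would run it as soon as the preceding markup has been parsed, without waiting for images to finish loading.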

The best performance and user experience come from doing as much work as possible on the server, then sending efficient HTML and allowing the browser to display the page as it's received. Sending, say, a single DIV container, then using script to clone it 70 or 80 times will be slower (probably a lot slower for some users).
Hiding content completely until your script has finished is the worst solution - users are left with a blank (or minimal content) page, waiting for something to happen.
The vast bulk of most pages is script and images, so replacing HTML with scripting really is playing at the margins. For example, this page has 90KB of HTML and 264KB of script, images and CSS. Apple's home page has 12KB of HTML and around 800KB of script, CSS and images.
Browsers show content progressively as it's received because that's how they evolved over many years on the web. Users prefer to see something rather than nothing, and to start viewing content while the rest loads (it's all about the content, not about fancy layouts or effects). Try to work with browser behaviour and features rather than against them.
You can greatly help the browser by specifying sizes for images and having an efficient layout. That way the layout won't change much as new content is received.

Depending on the other page content, you could run your script on DocumentReady as opposed to onload.
DocumentReady runs after the page has been downloaded and the DOM has been built, but before images are retrieved.
I believe that there is an official DocumentReady event somewhere, but I still have to support IE6 on my pages, so I use a busy loop to watch the DOM.
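The standard event this answer alludes to is DOMContentLoaded, which fires once the HTML has been parsed, without waiting for images. A minimal sketch of a ready helper follows; the name onReady is hypothetical, and very old browsers such as IE6 do not support this event, which is why the answer falls back to polling.

```javascript
// Run callback once the document has been parsed.
// If parsing already finished, run it immediately.
function onReady(doc, callback) {
  if (doc.readyState === 'loading') {
    doc.addEventListener('DOMContentLoaded', function () {
      callback();
    });
  } else {
    callback(); // readyState is 'interactive' or 'complete'
  }
}
```

In a page this would be called as onReady(document, function () { /* modify the DOM here */ }).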
