Creating a printable/downloadable PDF of a web application

I have been searching for an answer to this problem now for several weeks. I also previously tried to research this a few years ago to no avail.
Problem Summary:
My company has developed a web-based data analytics suite for a major beverage distributor. They have recently asked for a feature that allows the user to print or download a visually pleasing version of the rendered app as a PDF. I have had no luck in finding a solid, controllable, or reliable method to do this. I was hoping the stack community might be able to point me in the right direction.
Current Tech Stack:
Plack servers
Perl, based on the Dancer framework
Standard web dev front ends: HTML5, CSS3, JavaScript, jQuery/jQuery UI
Client is using IE9/10 and Chrome.
Attempted Solutions Summary:
Obviously I started with window.print() and tried to control what printed using classes and a specialized print.css, but the output was still awful.
I looked into pdfMachine and PDFBox and even contacted Adobe's Acrobat development team directly to see if they had an out-of-the-box solution our company could purchase. I was informed that such a product would run counter to their desired business model of putting an Acrobat subscription on each client computer rather than a single server-side application.
I have extensively searched the stack articles but did not feel that the articles I found covered what I was looking for.
At present, I am all out of ideas and am hoping somebody out there has had better luck at this than I have.
tl;dr = I need a pdf version of the rendered output of a complex reporting app.
Thanks for your time stack, I appreciate it.

A solution I have used in the past is to run PhantomJS on a server to generate the PDF for download/email. Usually, if the content is sensitive, the server (which handles authentication) provides a single-use viewing token that is then passed to a PhantomJS process. PhantomJS loads the URL with the viewing token and then saves the page as a PDF.
Further info on PhantomJS's screen capture API can be found here on GitHub.
https://github.com/ariya/phantomjs/wiki/Screen-Capture
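The single-use viewing token described above could be sketched roughly as follows. This is only a minimal illustration for a Node-style server, not the code I actually ran: the names issueToken, redeemToken, viewingUrl and the view_token query parameter are all made up for the example, and the in-memory Map stands in for whatever shared store the authentication server would really use.

```javascript
// Hypothetical sketch: single-use viewing tokens for a server-side PDF renderer.
const crypto = require("crypto");

// In-memory store (assumption); a real deployment would likely use a shared store.
const tokens = new Map();

// Issue a random token bound to one report URL, valid for ttlMs milliseconds.
function issueToken(reportUrl, ttlMs = 60000, now = Date.now()) {
  const token = crypto.randomBytes(16).toString("hex");
  tokens.set(token, { reportUrl, expiresAt: now + ttlMs });
  return token;
}

// Redeem a token: returns the report URL once, then null on any later attempt.
function redeemToken(token, now = Date.now()) {
  const entry = tokens.get(token);
  if (!entry) return null;
  tokens.delete(token); // single use: remove it before returning
  return now <= entry.expiresAt ? entry.reportUrl : null;
}

// Build the URL the headless browser would be asked to load.
// "view_token" is an assumed parameter name.
function viewingUrl(reportUrl, token) {
  const sep = reportUrl.includes("?") ? "&" : "?";
  return reportUrl + sep + "view_token=" + encodeURIComponent(token);
}
```

The URL returned by viewingUrl is what the PhantomJS script would pass to page.open before calling page.render with a .pdf filename. The page handler calls redeemToken, so a second request with the same token gets nothing.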

Is it something you can create in Perl using PDF::API2 or PDF::Create? You can load and modify an existing PDF (handy if you want standard headers and footers), and then insert the relevant content. The learning curve can be a bit steep, but simple reports should be easy enough.
See PDF::TextBlock and PDF::Table too - they are great little helpers.

Consider this service: http://pdfmyurl.com/ . I tried many Perl modules, but none of them solved my problem.

Related

R & Javascript callbacks

I'm writing a UI to my R script, which asks the user some names of organisms and the location of a folder, using javascript/html that will be local (not hosted, ever).
At the moment, I have just that: a couple of text boxes that take input and pass it to an executable R script. Originally this UI was being written as a very user-friendly option, but slowly I've realized that some nifty tricks can be added, such as a textbox that completes the word for the user. If the user misspells the name of the organism, the UI will correct the input based on the files uploaded. The suggestions would come from a text file listing the organisms, which R would generate immediately once the files have been added.
Is there a way to make this more efficient? For example, retrieving plots from R (as .pngs), updating my local webpage, and sharing a log file between R and the UI. Mind you, I am aware of the potential file I/O errors, but this is for the sake of brainstorming.
I'm aware of Shiny, but what I would like is a simple local UI, as I will be dealing with big data (average ~ 1 gigabyte worth of files that my script will process).
Another way to ask my question that is more to the point:
Here's an example of integrating PHP and R: http://www.r-bloggers.com/integrating-php-and-r/
I am looking to create something similar with javascript/css/html/jquery etc.
Thanks

You could definitely use nodejs (nodejs.org) for that. Take a look at https://github.com/elijah/r-node and r-node. Confusingly enough, these are two different projects with the same name. More info on the latter here: squirelove.net/r-node/doku.php
In recent years JavaScript has become one of the fastest programming languages. In one case I know of, JavaScript is faster than C++. See: benchmarksgame.alioth.debian.org/u32/performance.php?test=regexdna
Bear in mind, though, that memory is very difficult to manage in JavaScript, so you should run some sort of memory leak detection program on your code, if you plan to create long running processes.
E.g.: memwatch (npmjs.org/package/memwatch) or nodeheap (npmjs.org/package/memwatch)
Good luck with your endeavors!
PS. sorry for the lack of real links. I'm apparently not allowed to post more than 2 links.

Why wouldn't you be able to use Shiny locally? You design your app on your computer and run it locally with runApp('myapp') from an R-prompt. Unless you are experienced with javascript I would give shiny another look: http://www.rstudio.com/shiny/
The example you linked to can be very easily implemented using Shiny. See link below for a tutorial on how to write the app:
http://rstudio.github.com/shiny/tutorial/#hello-shiny
To run that example locally:
install.packages('shiny')
shiny::runExample('01_hello')

I have a similar case, and shiny looked like a good idea to me. However, after I did a few first steps, I am no longer sure about this. Note that most of the examples use shiny to display results. When you get into editing some fields and using a database, things can become messy; the reactiveness gets in the way once fields can be changed both by the program and by the user.
As an example see https://gist.github.com/dmenne/4721235/edit. The main problem for the current state of shiny is that you must use the dynamic UI for this type of work, which kills any separation of ui and server because you have to create the ui elements in the server.
shiny is a great idea, but for anything larger with interaction it is too early now. Knowing that the amazing RStudio team is behind it, I am sure the stress should be on "now".
What else is there around to make user interfaces for R? TclTk makes me shudder. I work in C# a lot, and I had been using R(D)COM for interfacing some years ago, but gave up after installation and licensing problems. There is R.DOTNet, which works better now; it is the most hassle-free installation-wise, but it is not a very active project and tends to crash. Interfacing via RServe/RServeCLI is stable, but is too difficult to install on Windows, for example on hospital computers with their strict security policies.
And there is Qt. With the active RInside community, it would be a good choice and the interface is great. I wish, however, that my programming skills were at the level of the RStudio guys. The fact that even Dirk is on the proof-of-concept level (using rinside with qt in windows) is not encouraging.

Scraping dynamically generated html inside Android app

I am currently writing an Android app that, among other things, uses text information from websites which I do not own. In addition, some of the pages require authentication.
For some pages I have been able to log in and retrieve the html code using BasicNameValuePairs and an HTTPClient with its associated objects.
Unfortunately, these methods retrieve the webpage source without running any javascript functions that a browser (Android Webview even) would normally run. I need the text that some of these scripts are retrieving.
I've done my research, but everything I've found is guesswork and extremely confusing. I'm okay with ignoring pages that require login for now. Also, I am willing to post any code that may be useful for constructing a solution; it is an independent project.
Any concrete solutions for scraping the html result from javascript calls? An example would be absolutely top-notch.

Final Success:
Rhino. Used this jar file.
Other Things I Tried:
HttpClient provided by Android: cannot run JavaScript.
HtmlUnit: 4 hours, no success. Also huge; it added 12 MB to my APK.
SL4A: finally compiled. Used THIS guide to set up. Abandoned as overkill compared with a simple Rhino jar.
Things That Might Work:
Selenium
Further results will be posted. Others' results will be added if posted.
Note: many of the options listed above reference each other. I think rhino is included in both sl4a and htmlunit. Also, I think htmlunit contains selenium.

The aforementioned solutions are very slow and restrict you to 1 url (well, not really, but I dare you to scrape 10 urls with Rhino while your user is impatiently waiting for results).
An alternative is to use a cloud scraping solution. You get the benefit of not wasting phone bandwidth on downloading content you won't use.
Try this solution: Bobik Java SDK
It gives you the ability to scrape up to hundreds of sites in a matter of seconds.

Is it possible to build a web-office application with JavaScript and OpenOffice?

I want to write a web site that can edit OpenOffice ODF documents: users can upload an ODF file to the website, edit it, and download it as ODF again.
How can I do this? And how does docs.google.com do it?
For now I want to try OpenOffice (LibreOffice) UNO programming on the server and JavaScript on the website. Is that possible?
If it is possible, how can I do it?

In general, yes, you can.
OpenOffice is "open", so by browsing its code you can learn how ODF files are created and stored. You should also look for the specification of the OpenOffice document formats. It is publicly available, and that is one reason the format was accepted as an ISO standard a while ago.
I think there should be plenty of existing scripts that convert doc to odf, pdf to odf, and back again. Try googling for your favorite php/python/ruby/java/other script that provides such a solution. There may even be a library that turns a PHP object into doc/odf/pdf format and lets you convert one format into another.
Editing it in a browser is much harder and requires a lot more work. But Google Docs shows it is possible.
In a few words: it is a lot of work, but some of it has already been done by the community. You need to google a bit to find those pieces.

How does disqus work?

Does anyone know how disqus works?
It manages comments on a blog, but the comments are all held on a third-party site. Seems like a neat use of cross-site communication.

The general pattern used is JSONP
It's actually implemented in a fairly sophisticated way (at least on the jQuery site): they defer the loading of the disqus.js and thread.js files until the user scrolls to the comment section.
The thread.js file contains JSON content for the comments, which are rendered into the page after it's loaded.
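For anyone unfamiliar with JSONP, the general pattern can be sketched like this. It is only an illustration of the technique, not Disqus's actual code: toJsonp, prepareJsonp, the callback query parameter, and the comments.example host are all assumptions made up for the example.

```javascript
// Server side: wrap a JSON payload in a call to the callback named by the client.
function toJsonp(callbackName, payload) {
  // Only allow simple identifiers so the callback name cannot inject script.
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error("invalid callback name");
  }
  return callbackName + "(" + JSON.stringify(payload) + ");";
}

// Client side: register a one-shot global callback, then build the script URL.
// In a browser, the returned URL would become the src of an injected <script>.
let counter = 0;
function prepareJsonp(baseUrl, onData, root = globalThis) {
  const name = "jsonp_cb_" + counter++;
  root[name] = function (data) {
    delete root[name]; // one-shot: clean up once the response has arrived
    onData(data);
  };
  const sep = baseUrl.includes("?") ? "&" : "?";
  return { name, url: baseUrl + sep + "callback=" + name };
}
```

Because the response is an ordinary script that calls a function already defined on the page, the browser's same-origin restriction on XMLHttpRequest does not apply, which is what lets the comments live on a different domain.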

You have three options when adding Disqus commenting to a site:
Use one of the many integrated solutions (WordPress, Blogger, Tumblr, etc. are supported)
Use the universal JavaScript code
Write your own code to communicate with the Disqus API
The main advantage of the integrated solutions is that they're easy to set up. In the case of WordPress, for example, it's as easy as activating a plug-in.
Having the ability to communicate with the API directly is very useful, and offers two advantages over the other options. First, it gives you as the developer complete control over the markup. Secondly, you're able to process comments server-side, which may be preferable.

It looks like they are using the easyXDM library, which picks the best available method for the current browser to communicate with another site.

Quoting Anton Kovalyov's (former engineer at Disqus) answer to the same question on a different site that was really helpful to me:
Disqus is a third-party JavaScript application that runs in your browser and injects itself on publishers' websites. These publishers need to install a small snippet of JavaScript code that makes the first request to our servers and loads initial JavaScript loader. This loader then creates all necessary iframe elements, gets the data from our servers, renders templates and injects the result into some element on the page.
As you can probably guess there are quite a few different technologies supporting what seems like a simple operation. On the back-end you have to run and scale a gigantic web application that serves millions of requests (mostly read). We use Python, Django, PostgreSQL and Redis (for our realtime service).
On the front-end you have to minimize your payload, make sure your app is super fast and that it doesn't break in extremely hostile environments (you will be surprised how screwed up publisher websites can be). Cross-domain communication—ability to send messages from hosting website to your servers—can be tricky as well.
Unfortunately, it is impossible to explain how everything works in a comment on Quora, or even in an article. So if you're interested in the back-end side of Disqus just learn how to write, run and operate highly-scalable websites and you'll be golden. And if you're interested in the front-end side, Ben Vinegar and myself (both front-end engineers at Disqus) wrote a book on the topic called Third-party JavaScript (http://thirdpartyjs.com/).
I'm planning to read the book he mentioned, I guess it will be quite helpful.
Here's also a link to the official answer to this question on the Disqus site.

Short answer? AJAX. You get your own URL, e.g. "site.com/?comments=ID", included via JavaScript... but with real-time updates like that you would need a polling server.

I think they keep the content on their site, and your site only sends and receives the data to/from Disqus. Now I wonder what happens if you decide that you want to bring your commenting in house without losing all existing comments! How easily could you get to your data, I wonder? They claim that the data belongs to you, but they have control over it, and there is not much explanation on their site about this.

I'm always leaving comments on the Disqus platform. Sometimes a comment seems to be removed once you refresh the page, and sometimes it's not. I think the ones that were removed are being held for moderation without any notice.

What is an alternate way to refer to HTML/JavaScript/CSS?

Often I need to refer to code written in HTML/JavaScript/CSS, but it is a very awkward construction to constantly refer to the descriptive trio of 'HTML/JavaScript/CSS' code.
For example, Mozilla refers to its HTML/JavaScript/CSS JetPack code as 'a JetPack'.
Other than the defunct 'dHTML', what are some concise, generic and accurate terms I can use to collectively refer to applications written in HTML/JavaScript/CSS?

I'm going to have to say DHTML anyway. Why would you say it's "defunct"? It is the perfect answer to this question. See http://en.wikipedia.org/wiki/DHTML. DHTML means Dynamic HTML—which is exactly what the combination of HTML/JavaScript/CSS code is.
Unless you're dealing with someone who isn't impressed with terms that are less than a year or two old, or unless you aren't specifically talking about code, DHTML conveys exactly what you are talking about.

Web application is perhaps too loose of a term, but it's a start.
Let's break it down.
HTML is data, CSS is presentation, and JavaScript is code. These are web technologies.
These are usually brought together by a browser.
Something in a browser on the web is a website.
JavaScript suggests it is somewhat interactive, so it's not just a site, it's an application.
("Application" also suggests that it's more complex, like with a SQL backend or something, so you might sound even more talented. :)
I'm guessing that you had the term LAMP (Linux, Apache, MySQL, and PHP) in mind? To my knowledge there is no such abbreviation for HTML, CSS, and JavaScript. The easiest way to say it is to just say it.
Versus "Front end" – I think that term implies that you built something that customers used. "Web application" is nonspecific about who the users are, so it would apply to customer-facing applications as well as internal-use applications. The word "application" implies that it's not just a tool; there are users who are not the programmers. "Front end" is probably more impressive because a customer-facing application has to be nicer than an internal one.
If you are not using it in a browser, or it's not actually on the web, maybe just your intranet or an internally distributed application bundle, it's still an application developed with web technologies.

Given that the person you're trying to convey this message to knows you're talking about web-related stuff - Front-end or Front-end development has always worked for me.

"HTML5" is the answer I now believe to be correct to replace "HTML/JavaScript/CSS". Since I asked the question in January, HTML5 has gained a lot more recognition for its incredible capabilities and promise. "HTML5" also has significantly greater name recognition than 7 months ago, and clearly sets it apart from lesser HTML.

I think the reason there's no specific term is the same reason that dHTML fell into disrepute - all three scripts are so integral to frontend development that there ceases to be a need to refer to them specifically. If you code in HTML, you almost necessarily use CSS, and if you have any active content at all it will most probably be in JavaScript.
Frontend development is a bit vague, but something like HTML based frontend development should get your point across.

If you want to refer to an application - use Web Application.
And if you need to refer to some code, simply use JS (JavaScript), because most of your code (except for some IE CSS expressions, if you use them) will be in JS, won't it?

Web Suite
suite: a set of things belonging together, in particular.
thus you have:
Web Suite: the set of HTML/CSS/JavaScript, the basic technologies used to develop a web site or application.
example:
"I used the Web Suite to make a cool website to show off all my pictures of cats sitting in boxes."

"UX" (User Experience) or "Front-end Development."

Web Applications, and Web 2.0 are both big names. One name/acronym that I personally like to use is RIA, or Rich Internet Application. From the article:
Rich Internet Applications (RIAs) are web applications that have most of the characteristics of desktop applications, typically delivered either by way of a standards based web browser, via a browser plug-in, or independently via sandboxes or virtual machines. Examples of RIA frameworks include Ajax, Curl, GWT, Adobe Flash/Adobe Flex/AIR, Java/JavaFX, Mozilla's XUL, OpenLaszlo and Microsoft Silverlight.
Also, someone else mentioned "impressing the suits," which this title tends to do. After all, it's got "rich" right in the name ;)

Web code
I was just having to write "HTML/Javascript/CSS" in an email and thought, "Isn't there a better term for this?". Googling, I found this post. I'm going with "web code".

Some call it a JAM stack, which stands for JavaScript, APIs, and Markup. But I acknowledge that it's not as specific as LAMP or something like that.
https://en.wikipedia.org/wiki/Solution_stack
https://jamstack.org/

However uncool it might be, it is still DHTML to me.
They are standard web technologies for producing dynamic websites and web applications. The last thing we need is another vacuous moniker for something that is more than adequately described by DHTML.

An alternative to this: Source or Page Source.
It all depends on the context, but for me this seemed to be a good name. When I right-click and see "view page source", it seems relevant: it contains all of this HTML/CSS/JS.
I like Web Application but my use case was page-specific, not app-wide.

I've been calling this the web stack (HTML, CSS, JavaScript). It excludes frameworks and other tools; it is just the base technologies the web is made of.
