converting html to pdf

converting html to pdf - javascript

I am trying to convert HTML to PDF on Linux, and I also need to do this from a web app. Please let me know what tools are available for this.
So far I have tried:
html2ps htmlfilename > a.ps
ps2pdf a.ps > a.pdf
But the above does not convert images and ignores the CSS. My development environment is Linux (RHEL5).
I have also tried http://www.webupd8.org/2009/11/convert-html-to-pdf-linux.html, but I get this error:
[root@localhost bin]# ./wkhtmltopdf www.example.com a.pdf
./wkhtmltopdf: error while loading shared libraries: libQtWebKit.so.4: cannot open shared object file: No such file or directory

You are on the right path: wkhtmltopdf is the easiest way to do this. Note that the package in the repositories might be outdated (I am not sure how up to date it is); you may need to compile it from source, or get the statically linked version (which is huge, but has the Qt library and other dependencies already included).
Also, in your case, you may just be missing a library; installing libqt4-webkit-dev (the Debian/Ubuntu package name; the RHEL package that provides libQtWebKit.so.4 is named differently) might do the trick here.
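For the web-app part of the question, one possible way to drive wkhtmltopdf is to spawn it from the server process. The following is only a minimal sketch, assuming a Node.js backend and a wkhtmltopdf binary on the PATH. The function names buildArgs and htmlToPdf are hypothetical; --quiet and --page-size are standard wkhtmltopdf options.

```javascript
const { execFile } = require('child_process');

// Build the argument list for: wkhtmltopdf [options] <input> <output>
// (buildArgs is a hypothetical helper, not part of wkhtmltopdf.)
function buildArgs(input, output, pageSize) {
  const args = ['--quiet'];
  if (pageSize) {
    args.push('--page-size', pageSize);
  }
  args.push(input, output);
  return args;
}

// Convert a URL or local HTML file to a PDF file.
// Assumes the wkhtmltopdf binary can be found on the PATH.
function htmlToPdf(input, output, callback) {
  execFile('wkhtmltopdf', buildArgs(input, output, 'A4'), function (err) {
    // err is set when the binary is missing or exits with a non-zero status
    callback(err, output);
  });
}
```

execFile passes the arguments as an array without going through a shell, so a user-supplied URL cannot inject extra shell commands. A call would look like htmlToPdf('http://www.example.com', 'a.pdf', function (err, file) { ... }).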

Two approaches that are easy to implement and suitable for converting HTML+CSS to PDF:
1) Using the jsPDF JavaScript library together with html2canvas (web app).
Load a stable version of jsPDF:
var script = document.createElement('script');
script.type = 'text/javascript';
script.src ='https://cdnjs.cloudflare.com/ajax/libs/jspdf/1.0.272/jspdf.min.js';
document.head.appendChild(script);
Load html2canvas:
var script = document.createElement('script');
script.type = 'text/javascript';
script.src = 'https://cdnjs.cloudflare.com/ajax/libs/html2canvas/0.4.1/html2canvas.js';
document.head.appendChild(script);
Then run the following script:
var html2obj = html2canvas($('your div class here'));
var queue = html2obj.parse();
var canvas = html2obj.render(queue);
// "image/jpeg" is the valid MIME type; "image/jpg" is not recognised and falls back to PNG
var img = canvas.toDataURL("image/jpeg");
console.log(img);
var doc = new jsPDF("p", "mm", "a4");
var width = doc.internal.pageSize.width;
var height = doc.internal.pageSize.height;
// draw the canvas at x=15, y=35 with a size of 180 x 240 (units are mm, as set above)
doc.addImage(canvas, 'JPEG', 15, 35, 180, 240, 'SLOW');
doc.save("save.pdf");
Special case for IE 11:
document.getElementById("your div here").style.backgroundColor = "#FFFFFF";
2) Using wkhtmltopdf
Install wkhtmltopdf first.
wkhtmltopdf can be used directly from the terminal/command line; for Java there is also a wrapper library that can be used instead.
Code example using the wkhtmltopdf wrapper:
import com.github.jhonnymertz.wkhtmltopdf.wrapper.Pdf;
import com.github.jhonnymertz.wkhtmltopdf.wrapper.page.PageType;
import com.github.jhonnymertz.wkhtmltopdf.wrapper.params.Param;
public class PofPortlet extends MVCPortlet {
    @Override
    public void render(RenderRequest request, RenderResponse response) throws PortletException, IOException {
        super.render(request, response);
        Pdf pdf = new Pdf();
        pdf.addPage("http://www.google.com", PageType.url);
        // Add a table of contents
        pdf.addToc();
        // The "wkhtmltopdf" shell command accepts different types of options such as global, page, headers and footers, and toc. Please see "wkhtmltopdf -H" for a full explanation.
        // All options are passed as array, for example:
        // Save the PDF
        try {
            pdf.saveAs("E:\\output.pdf");
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
3) Other tools include PhantomJS, iText, and GrabzIt.

Probably the easiest way would be to launch any modern browser, go to the site, and then use the browser's "print" capability to print to a pdf (assuming your system has a pdf printer set up). I don't know if that's an option in your case, though, and this sort of thing won't work from within a web app. Still, you may want to try it.

Related

Name html blob urls for easy reference

Within our web application we load a lot of content from package files (zipped packages containing HTML, JS, CSS, images and so on). The module loader (client-side JS) processes the packages and makes the content available to the DOM using blob URLs.
While this works very nicely, it is sometimes tedious to find the right JavaScript file for debugging.
For example, in Chrome's developer console under Sources, all blob URLs are listed under (no domain) and have random names such as:
blob:https://example.com/0613efd7-6977-4872-981f-519eea0bc911
In a normal production environment there are roughly 100 lines like this, so finding the right one might take some time.
I would very much like to name the blob URLs, or do something else to make them easier to find for debugging purposes. This seems possible, since webpack does something like this, but I can't find out how. Can anybody point me in the right direction?

OK, the way I would do it is to have a global that keeps track of the URLs, using a simple reverse map.
One problem with this, of course, is that references to blobs that no longer exist will be kept in memory, but if you only enable this for debugging purposes, that might not be a problem.
var namedblobs = {};
function addNamedBlob(name, uri) {
    namedblobs[uri] = name;
}
function getNamedBlob(uri) {
    return namedblobs[uri];
}
function createSomeBlob() {
    // for testing, just a random number will do
    return Math.random().toString();
}
var blob = createSomeBlob();
addNamedBlob("test1", blob);
addNamedBlob("test2", createSomeBlob());
console.log(getNamedBlob(blob)); // should be test1
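If the stale entries mentioned above do become a problem, one option is to drop the name at the same moment the blob URL is revoked. This is only a sketch of that idea: the names registerBlobUrl, lookupBlobName and releaseBlobUrl are hypothetical, and it assumes all revocation goes through releaseBlobUrl.

```javascript
// Reverse map from blob URL to a readable name (debugging aid only).
var blobNames = new Map();

// Remember a readable name for a blob URL and hand the URL back.
function registerBlobUrl(name, uri) {
  blobNames.set(uri, name);
  return uri;
}

// Returns the registered name, or undefined if the URL is unknown.
function lookupBlobName(uri) {
  return blobNames.get(uri);
}

// Forget the name and revoke the URL together, so the map does not
// keep entries for blobs that no longer exist.
function releaseBlobUrl(uri) {
  blobNames.delete(uri);
  if (typeof URL !== 'undefined' && typeof URL.revokeObjectURL === 'function') {
    URL.revokeObjectURL(uri);
  }
}
```

In the browser this would be used as registerBlobUrl("module.js", URL.createObjectURL(blob)), with releaseBlobUrl called wherever URL.revokeObjectURL was called before.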

I finally found a solution that works to my liking. Our application already used a service worker with caching active, so I ended up writing the module files into the service worker cache whenever somebody has debug mode turned on.
Since the URL of each resource file is static this way, all the nice browser features such as breakpoints are usable again.
Below I've posted the relevant code of the service worker. The rest of the code is just plain service worker caching.
api.serveScript = function (module, script, content) {
    try {
        content = atob(content);
    } catch (err) {
        // content was not valid base64; use it as-is
    }
    var url = "/R/" + module + "/script/" + script;
    var init = {
        status: 200,
        statusText: "OK",
        headers: {'Content-Type': 'text/javascript'}
    };
    // return the promise chain directly so that a failed open/put rejects
    // instead of leaving the returned promise pending forever
    return caches.open("modulecache-1").then(function (cache) {
        console.log('[ServiceWorker] Caching ' + module + "/" + script);
        return cache.put(url, new Response(content, init));
    }).then(function () {
        return url;
    });
};
Thanks for your answers and help. I hope this solution is going to help some others too.

@Keith's option is probably the best one (create a Map from your blob URIs to easy-to-read file names).
You could also write a dynamic router that points some nice URL to the blob URIs, but if you are willing to do that, then just don't use blob URIs.
Another hackish workaround, much less clean than the Map, would be to append a fragment identifier to your blob URI: blob:https://example.com/0613efd7-6977-4872-981f-519eea0bc911#script_name.js.
Beware: this should work for application/javascript Blobs and some other resource types, but not for documents (HTML, SVG, ...), where the fragment identifier has a special meaning.
var hello = new Blob(["alert('hello')"], {type:'application/javascript'});
var script = document.createElement('script');
script.src = URL.createObjectURL(hello) + '#hello.js';
document.head.appendChild(script);
console.log(script.src);
var css = new Blob(["body{background:red}"], {type:'text/css'});
var style = document.createElement('link');
style.href = URL.createObjectURL(css) + '#style.css';
style.rel = 'stylesheet';
document.head.appendChild(style);
console.log(style.href);
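If you go the fragment route, two small helpers keep the labelling in one place. This is a sketch only; labelBlobUrl and readBlobLabel are hypothetical names, and it assumes the blob URL has no fragment of its own before labelling.

```javascript
// Append a readable label to a blob URL as a fragment identifier.
function labelBlobUrl(blobUrl, name) {
  return blobUrl + '#' + encodeURIComponent(name);
}

// Read the label back from a URL; returns null when there is no fragment.
function readBlobLabel(url) {
  var hashIndex = url.indexOf('#');
  if (hashIndex === -1) {
    return null;
  }
  return decodeURIComponent(url.slice(hashIndex + 1));
}
```

For example, script.src = labelBlobUrl(URL.createObjectURL(hello), 'hello.js') gives the same result as the manual concatenation above, and readBlobLabel(script.src) returns 'hello.js'.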

load webpage completely in C# (contains page-load scripts)

I'm trying to load a web page in the background of my application. The following code shows how I am loading a page:
request = (HttpWebRequest)WebRequest.Create("http://example.com");
request.CookieContainer = cookieContainer;
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
Stream st = response.GetResponseStream();
StreamReader sr = new StreamReader(st);
string responseString = sr.ReadToEnd();
sr.Close();
st.Close();
}
As you know, the server responds with HTML and some JavaScript, but much of the page content is added by JavaScript functions when the page loads, so I have to interpret the first HTTP response.
I tried to use the System.Windows.Forms.WebBrowser object to load the web page completely, but its engine is too weak for this.
So I tried CefSharp (an embedded Chromium browser). It is great and works fine, but I have some trouble with it. This is how I use CefSharp to load a web page:
ChromiumWebBrowser MainBrowser = new ChromiumWebBrowser("http://Example/");
MainBrowser.FrameLoadEnd += MainBrowser_FrameLoadEnd;
panel1.Controls.Add(MainBrowser);
MainBrowser.LoadHtml(responseString,"http://example.com");
It works fine when I use this code in Form1.cs and add MainBrowser to a panel. But I want to use it in another class: ChromiumWebBrowser is part of another custom object, and the custom object works in the background. Also, 10 or 20 custom objects may be working at the same time. In this situation ChromiumWebBrowser doesn't work any more!
The second problem is threading. When I call MainBrowser.LoadHtml(responseString,"http://example.com");
it doesn't return any result, so I have to pause code execution using a Semaphore and wait for the result in this event: MainBrowser.FrameLoadEnd
So I wish my code could be something like this:
request = (HttpWebRequest)WebRequest.Create("http://example.com");
request.CookieContainer = cookieContainer;
string responseString="";
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
Stream st = response.GetResponseStream();
StreamReader sr = new StreamReader(st);
responseString = sr.ReadToEnd();
sr.Close();
st.Close();
}
string FullPageContent = SomeBrowserEngine.LoadHtml(responseString);
//Do stuffs
Can you please show me how to do this? Do you know any other web browser engines that work the way I want?
Please tell me if I'm doing anything wrong with CefSharp or with other concepts.

Unity WebGL External Assets

I'm developing a WebGL project in Unity that has to load some external images from a directory. It all runs fine in the editor, but when I build it, it throws a Directory Not Found exception in the web console. I am putting the images in the Assets/StreamingAssets folder, which becomes the StreamingAssets folder in the built project (at the root, next to index.html). The images are located there, yet the browser still complains about not being able to find that directory. (I'm opening it on my own computer, with no web server running.)
I guess I'm missing something very obvious, but I could use some help. I started learning Unity a week ago, and I'm not that great with C# or JavaScript (I'm trying to get better...). Is this somehow related to JavaScript security restrictions?
Could someone please point me in the right direction: how should I be reading images (no writing needs to be done) in Unity WebGL?
string appPath = Application.dataPath;
string[] filePaths = Directory.GetFiles(appPath, "*.jpg");
According to unity3d.com, in WebGL builds everything except threading and reflection is supported, so IO should be working, or so I thought.
As a workaround, I am now trying to load a text file containing the paths of the images (separated by ';'):
TextAsset ta = Resources.Load<TextAsset>("texManifest");
string[] lines = ta.text.Split(';');
Then I convert each line to a full path and add it to a list:
string temp = Application.streamingAssetsPath + "/textures/" + s;
filePaths.Add(temp);
Debug.Log tells me it looks like this:
file://////Downloads/FurnitureDresser/build/StreamingAssets/textures/79.jpg
So that seems to be all right, except for all those slashes (that looks a bit odd to me).
And finally I create the texture:
WWW www = new WWW("file://" + filePaths[i]);
yield return www;
Texture2D new_texture = new Texture2D(120, 80);
www.LoadImageIntoTexture(new_texture);
And around this last part (I'm unsure, since WebGL projects do not seem easy to debug) it tells me: NS_ERROR_DOM_BAD_URI: Access to restricted URI denied
Can someone please enlighten me as to what is happening? And most of all, what would be a proper way to set up a directory from which I can load images at runtime?

I realise this question is now a couple of years old, but since this still appears to be a commonly asked question, here is one solution (sorry, the code is C#, but I am guessing the JavaScript implementation is similar). Basically, you need to use UnityWebRequest and coroutines to access a file in the StreamingAssets folder.
1) Create a new Loading scene (which does nothing but query the files; you could have it display some status text or a progress bar to let the user know what is happening).
2) Add a script called Loader to the Main Camera in the Loading scene.
3) In the Loader script, add a variable to indicate whether the asset has been read successfully:
private bool isAssetRead;
4) In the Start() method of the Loader script:
void Start ()
{
    // if WebGL, this will be something like "http://..."
    string assetPath = Application.streamingAssetsPath;
    // a single check is enough: any path containing ":///" also contains "://"
    bool isWebGl = assetPath.Contains("://");
    try
    {
        if (isWebGl)
        {
            StartCoroutine(
                SendRequest(
                    Path.Combine(assetPath, "myAsset")));
        }
        else // desktop app
        {
            // do whatever you need if the app is not WebGL
        }
    }
    catch
    {
        // handle failure
    }
}
5) In the Update() method of the Loader script:
void Update ()
{
    // check to see if asset has been successfully read yet
    if (isAssetRead)
    {
        // once asset is successfully read,
        // load the next screen (e.g. main menu or gameplay)
        SceneManager.LoadScene("NextScene");
    }
    // need to consider what happens if
    // asset fails to be read for some reason
}
6) In the SendRequest() method of the Loader script:
private IEnumerator SendRequest(string url)
{
    using (UnityWebRequest request = UnityWebRequest.Get(url))
    {
        yield return request.SendWebRequest();
        if (request.isNetworkError || request.isHttpError)
        {
            // handle failure
        }
        else
        {
            try
            {
                // entire file is returned via downloadHandler
                string fileContents = request.downloadHandler.text;
                // or, for binary data:
                //byte[] fileContents = request.downloadHandler.data;
                // do whatever you need to do with the file contents
                // (loadAsset is your own method; it is not part of Unity)
                if (loadAsset(fileContents))
                    isAssetRead = true;
            }
            catch (Exception)
            {
                // handle failure
            }
        }
    }
}

Put your image in the Resources folder and use Resources.Load to open the file and use it.
For example:
Texture2D texture = Resources.Load("images/Texture") as Texture2D;
if (texture != null)
{
GetComponent<Renderer>().material.mainTexture = texture;
}
Directory listing and file APIs are not available in WebGL builds.
Basically, no low-level IO operations are supported.

Download WebView content in WinRT application

I'm trying to build a universal RSS application for Windows 10 that can download the content of the full article page for offline reading.
After spending a lot of time on Stack Overflow, I found some code:
HttpClientHandler handler = new HttpClientHandler { UseDefaultCredentials = true, AllowAutoRedirect = true };
HttpClient client = new HttpClient(handler);
HttpResponseMessage response = await client.GetAsync(ni.Url);
response.EnsureSuccessStatusCode();
string html = await response.Content.ReadAsStringAsync();
However, this solution doesn't work on some web pages where the content is loaded dynamically.
So the remaining alternative seems to be this: load the web page into the WinRT WebView control and somehow copy and paste the rendered text.
But the WebView doesn't implement any copy/paste method or similar, so there is no easy way to do it.
Finally, I found this post on Stack Overflow (Copying the content from a WebView under WinRT) that seems to deal with exactly the same problem as mine, with the following solution:
Use the InvokeScript method of the WebView to copy and paste the content through a JavaScript function.
It says: "First, this javascript function must exist in the HTML loaded in the webview."
function select_body() {
var range = document.body.createTextRange();
range.select();
}
and then "use the following code:"
// call the select_body function to select the body of our document
MyWebView.InvokeScript("select_body", null);
// capture a DataPackage object
DataPackage p = await MyWebView.CaptureSelectedContentToDataPackageAsync();
// extract the RTF content from the DataPackage
string RTF = await p.GetView().GetRtfAsync();
// SetText of the RichEditBox to our RTF string
MyRichEditBox.Document.SetText(Windows.UI.Text.TextSetOptions.FormatRtf, RTF);
But what it doesn't say is how to inject the JavaScript function if it doesn't exist in the page I'm loading.

If you have a WebView like this:
<WebView Source="http://kiewic.com" LoadCompleted="WebView_LoadCompleted"></WebView>
Use InvokeScriptAsync in combination with eval() to get the document content:
private async void WebView_LoadCompleted(object sender, NavigationEventArgs e)
{
    WebView webView = sender as WebView;
    string html = await webView.InvokeScriptAsync(
        "eval",
        new string[] { "document.documentElement.outerHTML;" });
    // TODO: Do something with the html ...
    System.Diagnostics.Debug.WriteLine(html);
}

How to read and write to a file (Javascript) in ui automation?

I want to identify a few properties during my run and form a JSON object, which I would like to write to a ".json" file and save on disk.
var target = UIATarget.localTarget();
var properties = new Object();
var jsonObjectToRecord = {"properties":properties}
jsonObjectToRecord.properties.name = "My App"
UIALogger.logMessage("Pretty Print TEST Log"+jsonObjectToRecord.properties.name);
var str = JSON.stringify(jsonObjectToRecord)
UIALogger.logMessage(str);
// -- CODE TO WRITE THIS JSON TO A FILE AND SAVE ON THE DISK --
I tried :
// Sample code to see if it is possible to write data
// onto some file from my automation script
function WriteToFile()
{
set fso = CreateObject("Scripting.FileSystemObject");
set s = fso.CreateTextFile("/Volumes/DEV/test.txt", True);
s.writeline("HI");
s.writeline("Bye");
s.writeline("-----------------------------");
s.Close();
}
AND
function WriteFile()
{
// Create an instance of StreamWriter to write text to a file.
sw = new StreamWriter("TestFile.txt");
// Add some text to the file.
sw.Write("This is the ");
sw.WriteLine("header for the file.");
sw.WriteLine("-------------------");
// Arbitrary objects can also be written to the file.
sw.Write("The date is: ");
sw.WriteLine(DateTime.Now);
sw.Close();
}
But I am still unable to read and write data to a file from UI Automation in Instruments.
Possible workaround?
Redirect to stdout, if we can execute a terminal command from the UI Automation script. So, can we execute a terminal command from the script?
Haven't tried:
1. Assuming we can include a library that has those methods, give that a try.

Your assumptions are good, but the Xcode UI Automation script environment is not a full JavaScript environment.
I don't think you can simply write normal browser-based JavaScript in an Xcode UI Automation script.
set fso = CreateObject("Scripting.FileSystemObject");
is not JavaScript; it is VBScript, which only works on Microsoft platforms and in testing tools like QTP.
Scripting.FileSystemObject
is an ActiveX object which only exists on Microsoft Windows.
Only a few JavaScript features, such as basic Math, Array, etc., are provided by the Apple JavaScript library, so you are limited to the classes documented here: https://developer.apple.com/library/ios/documentation/DeveloperTools/Reference/UIAutomationRef/
If you want to do more scripting, then try the Selenium iOS driver: http://ios-driver.github.io/ios-driver/

This is something I looked into for a project but never got around to fully implementing, so this answer is more of a guide to what to do than a step-by-step copy and paste.
First, you're going to need to create a bash script that writes to a file. This can be as simple as
#!/bin/bash
# replace filename.json with the path of your output file
echo "$1" >> filename.json
Then you call this from inside your Xcode Instruments UIAutomation script with
var target = UIATarget.localTarget();
var host = target.host();
var result = host.performTaskWithPathArgumentsTimeout("your/script/path", ["Object description in JSON format"], 5);
Then, after your automation run ends, you can open the file on your computer to look at the results.
EDIT: This lets you write to a file line by line, but the actual JSON formatting is up to you. Looking at some examples, I don't think it would be difficult to implement, but you'll need to give it some thought first.
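One way to handle the formatting, given that the script appends one line per call, is to write each record as a single line of JSON (often called JSON Lines) and parse the file line by line afterwards. This is a sketch under that assumption; toJsonLine and parseJsonLines are hypothetical helper names. Only toJsonLine would run inside the automation script; parseJsonLines is meant for whatever reads the file later.

```javascript
// Serialize one record as a single line. JSON.stringify without an
// indent argument never emits a raw newline, so each record stays on
// one line when it is appended to the file.
function toJsonLine(record) {
  return JSON.stringify(record);
}

// Parse the contents of a file written that way: one JSON value per
// line, ignoring blank lines.
function parseJsonLines(text) {
  return text
    .split('\n')
    .filter(function (line) { return line.trim() !== ''; })
    .map(function (line) { return JSON.parse(line); });
}
```

In the automation script, the string passed in the arguments array of host.performTaskWithPathArgumentsTimeout would then be toJsonLine(jsonObjectToRecord).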
