I need to fetch a PDF file from s3.amazonaws.com. When I query it using Postman (or paste the URL directly into the browser), it loads fine. However, when I try to generate a file path for it (to pass to a viewer later), it doesn't work:
fetch(<S3URL>).then(res => res.blob()).then(blob => {
// THIS STEP DOES NOT WORK
let myBlob = new Blob(blob, {type: 'application/pdf'});
// expect something like 'www.mysite.com/my-file.pdf'
let PDFLink = window.URL.createObjectURL(myBlob);
return PDFLink;
});
I'm using Autodesk's Forge PDF viewer and it works perfectly fine for local PDF files:
let myPDFLink = 'public/my-file.pdf';
Autodesk.Viewing.Initializer(options, () => {
viewer = new Autodesk.Viewing.Private.GuiViewer3D(document.getElementById('forgeViewer'));
viewer.start();
viewer.loadExtension('Autodesk.PDF').then( () => {
viewer.loadModel(myPDFLink, viewer); // <-- works fine here
});
});
// from https://github.com/wallabyway/offline-pdf-markup
So, how do I go from the S3 URL (e.g. s3.amazonaws.com/com.autodesk.oss-persistent/0d/ff/c4/2dfd1860d1...) to something the PDF viewer can understand (i.e. has .pdf extension in the URL)?
I know for JSON files I need to do res.json() to extract the JSON content, but for PDFs, what should I do with the res object?
Note: I don't have control over the S3 URL. Autodesk generates a temporary S3 link whenever I want to download documents from their BIM360 portal.
I tried a lot of options and the only way I could display a PDF fetched via API calls is by using an object element:
<object data='<PDF link>' type='application/pdf'>
Converting the downloaded blob to base64 doesn't work. Putting the PDF link in an iframe doesn't work either (it still downloads instead of displaying). All the options I have read only work if the PDFs are part of the frontend application (i.e. local files, not something fetched from a remote server).
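For the fetch snippet in the question, one likely culprit is the Blob constructor call: it expects an array of parts, so the downloaded blob has to be wrapped in brackets. The sketch below shows that fix. The names toPdfBlob and fetchPdfObjectUrl are hypothetical, and s3Url stands in for the temporary S3 link. Note that the result is a blob: URL containing a UUID, not a path ending in .pdf, and this sketch does not establish whether a given viewer accepts such a URL.

```javascript
// Wrap already-downloaded PDF data in a Blob tagged as application/pdf.
// The Blob constructor takes an array of parts, so the data goes inside [ ].
function toPdfBlob(data) {
  return new Blob([data], { type: 'application/pdf' });
}

// Hypothetical helper: download a PDF and return a blob: URL for it.
// `s3Url` is an assumption standing in for the temporary S3 link.
async function fetchPdfObjectUrl(s3Url) {
  const res = await fetch(s3Url);
  if (!res.ok) {
    throw new Error('Download failed with status ' + res.status);
  }
  // res.blob() reads the body as binary; res.json() would only suit JSON.
  const pdfBlob = toPdfBlob(await res.blob());
  // The returned string has the form "blob:<origin>/<uuid>".
  return URL.createObjectURL(pdfBlob);
}
```

The blob: URL can then be tried as the data attribute of an object element, which is the approach the answer above reports as working.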
I'm trying to write a program that downloads OneNote pages to my PC, including the files in the pages. I'm stuck on downloading images from the pages. I make a GET request and get the binary data for the image just fine, but when I save it and try to open it, I get an "It looks like we don't support this file format" error.
The code I'm using is
var u16 = btoa(unescape(encodeURIComponent(resp)));
var imgAsBlob = new Blob([u16], {type: 'application/octet-stream'});
var downloadLink = document.createElement("a");
downloadLink.download = "hello.png";
downloadLink.href = window.webkitURL.createObjectURL(imgAsBlob);
downloadLink.click();
resp is the responseText from the GET request with the binary data.
I've tried not using btoa and putting resp directly in the blob. I've tried changing the blob type to image/png, and I've tried copying the data into a Uint16Array(resp.length), setting each element to a byte from resp. I'm out of ideas and don't know what I'm doing wrong.
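A likely cause here is that responseText has already decoded the image bytes as text, which is lossy for binary data, so no later conversion can restore them. A minimal sketch of the alternative is to ask the XHR for an ArrayBuffer and check the PNG signature before saving. The function names, the url parameter, and the omission of any authentication headers the OneNote API may need are all assumptions for illustration.

```javascript
// The eight bytes every PNG file starts with.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Returns true when the buffer begins with the PNG signature.
function looksLikePng(buffer) {
  const bytes = new Uint8Array(buffer);
  return PNG_SIGNATURE.every((expected, i) => bytes[i] === expected);
}

// Browser-only sketch: request raw bytes and offer them as a download.
// `url` and `filename` are hypothetical parameters.
function downloadPng(url, filename) {
  const xhr = new XMLHttpRequest();
  xhr.open('GET', url);
  xhr.responseType = 'arraybuffer'; // keep the bytes untouched, no text decoding
  xhr.onload = () => {
    if (!looksLikePng(xhr.response)) {
      console.warn('Response does not start with a PNG signature');
    }
    const blob = new Blob([xhr.response], { type: 'image/png' });
    const link = document.createElement('a');
    link.href = URL.createObjectURL(blob);
    link.download = filename;
    link.click();
  };
  xhr.send();
}
```

Running looksLikePng on the bytes recovered from responseText is a quick way to confirm whether the data was damaged before it ever reached the Blob.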
In my Vue app I receive a PDF as a blob, and want to display it using the browser's PDF viewer.
I convert it to a file, and generate an object url:
const blobFile = new File([blob], `my-file-name.pdf`, { type: 'application/pdf' })
this.invoiceUrl = window.URL.createObjectURL(blobFile)
Then I display it by setting that URL as the data attribute of an object element.
<object
:data="invoiceUrl"
type="application/pdf"
width="100%"
style="height: 100vh;">
</object>
The browser then displays the PDF using the PDF viewer. However, in Chrome, the file name that I provide (here, my-file-name.pdf) is not used: I see a hash in the title bar of the PDF viewer, and when I download the file using either 'right click -> Save as...' or the viewer's controls, it saves the file with the blob's hash (cda675a6-10af-42f3-aa68-8795aa8c377d or similar).
The viewer and file name work as I'd hoped in Firefox; it's only Chrome in which the file name is not used.
Is there any way, using native Javascript (including ES6, but no 3rd party dependencies other than Vue), to set the filename for a blob / object element in Chrome?
[edit] If it helps, the response has the following relevant headers:
Content-Type: application/pdf; charset=utf-8
Transfer-Encoding: chunked
Content-Disposition: attachment; filename*=utf-8''Invoice%2016246.pdf;
Content-Description: File Transfer
Content-Encoding: gzip
Chrome's extension seems to rely on the resource name set in the URI, i.e. the file.ext in protocol://domain/path/file.ext.
So if your original URI contains that filename, the easiest might be to simply make your <object>'s data to the URI you fetched the pdf from directly, instead of going the Blob's way.
Now, there are cases where that can't be done, and for those there is a convoluted way that requires setting up a Service Worker. It might not work in future versions of Chrome, and probably doesn't work in other browsers.
As we said at the start, Chrome parses the URI in search of a filename, so what we need is a URI that contains this filename and points to our blob:// URI.
To do so, we can use the Cache API, store our File as Request in there using our URL, and then retrieve that File from the Cache in the ServiceWorker.
Or in code,
From the main page
// register our ServiceWorker
navigator.serviceWorker.register('/sw.js')
.then(...
...
async function displayRenamedPDF(file, filename) {
// we use a hard-coded fake path
// so as not to interfere with legitimate requests
const reg_path = "/name-forcer/";
const url = reg_path + filename;
// store our File in the Cache
const store = await caches.open( "name-forcer" );
await store.put( url, new Response( file ) );
const frame = document.createElement( "iframe" );
frame.width = 400
frame.height = 500;
document.body.append( frame );
// makes the request to the File we just cached
frame.src = url;
// not needed anymore
frame.onload = (evt) => store.delete( url );
}
In the ServiceWorker sw.js
self.addEventListener('fetch', (event) => {
event.respondWith( (async () => {
const store = await caches.open("name-forcer");
const req = event.request;
const cached = await store.match( req );
return cached || fetch( req );
})() );
});
Edit: This actually doesn't work in Chrome...
While it does correctly set the filename in the dialog, Chrome seems unable to retrieve the file when saving it to disk...
It doesn't seem to perform a network request (so our SW isn't catching anything), and I don't really know where to look now.
Still, this may be a good basis for future work.
Another solution, which I haven't taken the time to check myself, would be to run your own PDF viewer.
Mozilla has made its JS-based plugin pdf.js available, so from there we should be able to set the filename (though, once again, I haven't dug into it yet).
As a final note, Firefox is able to use the name property of the File object that a blob URI points to.
So even though it's not what OP asked for, in FF all it requires is
const file = new File([blob], filename);
const url = URL.createObjectURL(file);
object.data = url;
In Chrome, the filename is derived from the URL, so as long as you are using a blob URL, the short answer is "No, you cannot set the filename of a PDF object displayed in Chrome." You have no control over the UUID assigned to the blob URL and no way to override that as the name of the page using the object element. It is possible that inside the PDF a title is specified, and that will appear in the PDF viewer as the document name, but you still get the hash name when downloading.
This appears to be a security precaution, but I cannot say for sure.
Of course, if you have control over the URL, you can easily set the PDF filename by changing the URL.
I believe Kaiido's answer expresses, briefly, the best solution here:
"if your original URI contains that filename, the easiest might be to simply make your object's data to the URI you fetched the pdf from directly"
Especially for those arriving from similar questions, it would have helped me to have a fuller description of a specific implementation (working for PDFs) that gives the best user experience, particularly when serving files that are generated on the fly.
The trick here is using a two-step process that perfectly mimics a normal link or button click. The client must (step 1) request the file be generated and stored server-side long enough for the client to (step 2) request the file itself. This requires you have some mechanism supporting unique identification of the file on disk or in a cache.
Without this process, the user will just see a blank tab while file-generation is in-progress and if it fails, then they'll just get the browser's ERR_TIMED_OUT page. Even if it succeeds, they'll have a hash in the title bar of the PDF viewer tab, and the save dialog will have the same hash as the suggested filename.
Here's the play-by-play to do better:
You can use an anchor tag or a button for the "download" or "view in browser" elements
Step 1 of 2 on the client: that element's click event can make a request for the file to be generated only (not transmitted).
Step 1 of 2 on the server: generate the file and hold on to it. Return only the filename to the client.
Step 2 of 2 on the client:
If viewing the file in the browser, use the filename returned from the generate request to then invoke window.open('view_file/<filename>?fileId=1'). That is the only way to indirectly control the name of the file as shown in the tab title and in any subsequent save dialog.
If downloading, just invoke window.open('download_file?fileId=1').
Step 2 of 2 on the server:
view_file(filename, fileId) handler just needs to serve the file using the fileId and ignore the filename parameter. In .NET, you can use a FileContentResult like File(bytes, contentType);
download_file(fileId) must set the filename via the Content-Disposition header. In .NET, that's return File(bytes, contentType, desiredFilename);
client-side download example:
download_link_clicked() {
// show spinner
ajaxGet(generate_file_url,
{},
(response) => {
// success!
// the server-side is responsible for setting the name
// of the file when it is being downloaded
window.open('download_file?fileId=1', "_blank");
// hide spinner
},
() => { // failure
// hide spinner
// problem: notify the user
},
null
);
}
client-side view example:
view_link_clicked() {
// show spinner
ajaxGet(generate_file_url,
{},
(response) => {
// success!
let filename = response.filename;
// simplest, reliable method I know of for controlling
// the filename of the PDF when viewed in the browser
window.open('view_file/'+filename+'?fileId=1');
// hide spinner
},
() => { // failure
// hide spinner
// problem: notify the user
},
null
);
}
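The view example concatenates the filename into the URL as-is, which can break when the name contains spaces, question marks, or slashes. A small sketch of building the step-2 URL with the filename escaped follows. buildViewUrl is a hypothetical helper, and view_file and fileId are taken from the example routes above rather than from any real API.

```javascript
// Hypothetical helper: build the step-2 URL with the filename as the last
// path segment, so the browser's PDF viewer can pick it up as the name.
function buildViewUrl(filename, fileId) {
  return 'view_file/' + encodeURIComponent(filename) +
    '?fileId=' + encodeURIComponent(String(fileId));
}

console.log(buildViewUrl('Invoice 16246.pdf', 1));
// view_file/Invoice%2016246.pdf?fileId=1
```

In the success callback this would be used as window.open(buildViewUrl(response.filename, 1)). The server route has to decode or ignore the filename segment accordingly.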
I'm using the pdf-lib library.
I solved part of this problem by calling Document.setTitle("Some title text you want").
The browser displayed my title correctly, but when I click the download button, the file name is still the earlier UUID. Perhaps the library has another API that lets you change the download file name.
I am trying to implement these seemingly simple requirements but can't find a way :
Single Page App using Angular JS
REST(ish) back end
Back end resource exposed via POST request
Resource parameters passed as JSON in the request body
Resource produces a CSV file
When a user clicks a button, generate a request with the right JSON parameters in the body, send it, and allow user to download the response as a file (prompts the browser's "open / save as" dialog)
The problem is mainly, how to pass the JSON as request body? The most common technique seems to be the hidden HTML form to trigger the download, but an HTML form cannot send JSON data in the body. And I can't find any way to trigger a download dialog using an XMLHttpRequest...
Any ideas?
I specified Angular but any generic JS solution is very welcome too!
I finally found a solution that satisfies all my requirements, and works in IE11, FF and Chrome (and degrades kind of OK in Safari...).
The idea is to create a Blob object containing the data from the response, then force the browser to open it as a file. It is slightly different for IE (proprietary API) and Chrome/FF (using a link element).
Here is the implementation, as a small Angular service:
myApp.factory('Download', [function() {
return {
openAsFile : function(response){
// parse content type header
var contentTypeStr = response.headers('Content-Type');
var tokens = contentTypeStr.split('/');
var subtype = tokens[1].split(';')[0];
var contentType = {
type : tokens[0],
subtype : subtype
};
// parse content disposition header, attempt to get file name
var contentDispStr = response.headers('Content-Disposition');
// fall back to a default name when the header is missing or the filename is unquoted
var proposedFileName = (contentDispStr && contentDispStr.split('"')[1]) || 'data.'+contentType.subtype;
// build blob containing response data
var blob = new Blob([response.data], {type : contentTypeStr});
if (typeof window.navigator.msSaveBlob !== 'undefined'){
// IE : use proprietary API
window.navigator.msSaveBlob(blob, proposedFileName);
}else{
var downloadUrl = URL.createObjectURL(blob);
// build and open link - use HTML5 a[download] attribute to specify filename
var link = document.createElement('a');
if (typeof link.download === 'undefined') {
// Safari doesn't support the download attribute yet: open the blob URL directly
window.open(downloadUrl);
} else {
link.href = downloadUrl;
link.download = proposedFileName;
document.body.appendChild(link);
link.click();
document.body.removeChild(link);
}
}
}
}
}]);
The response argument expects a $http response object. Here is an example of use with a POST request:
$http.post(url, {property : 'value'}, {responseType: 'blob'}).then(function(response){
Download.openAsFile(response);
});
Note the responseType parameter. Without it, my CSV data was read as text and stored in memory as UTF-8 (or UTF-16), and the file was then saved in that same encoding, so Excel did not recognize special characters such as é and è. Since my CSVs are intended to be opened in Excel, the server encodes them as Windows-1252, and I wanted to keep them that way. Setting responseType to blob achieves this.
Disclaimer: it should work with any file type, but I tested it only with CSV files. Binary files might behave differently.
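The service reads the file name by splitting Content-Disposition on double quotes, which only covers the quoted filename="..." form. A hedged sketch of a slightly more tolerant parser is shown below. It also handles the unquoted form and the extended form such as filename*=utf-8''Invoice%2016246.pdf. filenameFromContentDisposition is a hypothetical helper, not part of Angular, and it assumes the extended value is UTF-8 percent-encoded.

```javascript
// Hypothetical helper: extract a file name from a Content-Disposition value,
// returning `fallback` when none can be found.
function filenameFromContentDisposition(header, fallback) {
  if (!header) return fallback;

  // Extended form: filename*=charset'language'percent-encoded-value
  // Assumption: the charset is UTF-8, which decodeURIComponent expects.
  const extended = /filename\*\s*=\s*([^']*)'[^']*'([^;]+)/i.exec(header);
  if (extended) {
    try {
      return decodeURIComponent(extended[2].trim());
    } catch (e) {
      // Malformed escape sequence: fall through to the plain forms.
    }
  }

  // Quoted form: filename="name.ext"
  const quoted = /filename\s*=\s*"([^"]*)"/i.exec(header);
  if (quoted) return quoted[1];

  // Unquoted form: filename=name.ext
  const bare = /filename\s*=\s*([^;]+)/i.exec(header);
  if (bare) return bare[1].trim();

  return fallback;
}
```

Inside openAsFile, this could stand in for the split on double quotes, for example filenameFromContentDisposition(contentDispStr, 'data.' + contentType.subtype).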
Using Windows Azure storage services, I have created a container and subsequently created a BlockBlob (a JPEG image) using the PUT Rest API. I can log into my Azure portal and download the image.
When I call the GET API Azure successfully returns me the blob in the response body -- and I think it's the same raw binary I uploaded.
When I call the GET API, I'm doing so via an XHR request in my JavaScript (Sencha Touch) application. I can see the response (the raw binary), but I cannot figure out how to read the binary into an image that I can display.
I've tried the following:
rawBinary = response.responseText;
encodedBinary = btoa(unescape(encodeURIComponent(rawBinary)));
img.setSrc('data:' + file.type + ';base64,' + encodedBinary);
...which gives me something like this:
data:image/jpeg;base64,77+977+977+977+9ABBKRklGAAEBAAABAAEAAO+/ve+/vQBYRXhpZgAATU0AKgAAAAgAAgESAAMAAAABAAEAAO+/vWkABAAAAAEAAAAmAAAAAAAD77+9AQADAAAAAQABAADvv70CAAQAAAABAAAKIO+/vQMABAAAAAEAAAfvv70AAAAA77.......
This correctly sets a background URL on a DIV as a base64 encoded image... but nothing displays. It looks like a valid base64 string, and there are no errors in my console or network tabs. But nothing shows.
Can anyone help?
EDIT: Below is what the "binary" response looks like in the XHR response body:
����JFIF��XExifMM*�i&��
����C ��C��� "��
���}!1AQa"q2���#B��R��$3br�
...etc... VERY long response of unreadable characters
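The repeated 77+9 at the start of the base64 string is a useful clue: it is the base64 form of the bytes EF BF BD, the UTF-8 encoding of the replacement character U+FFFD. That suggests the JPEG bytes were decoded as text before being encoded, and the original values were lost. The sketch below reproduces the effect on the first four bytes of a JPEG and shows an encoding path that works on raw bytes instead. bytesToDataUrl is a hypothetical helper, and the UTF-8 decode is an assumption about what responseText did.

```javascript
// Build a data: URL from raw bytes (a Uint8Array) without going through text.
function bytesToDataUrl(bytes, mimeType) {
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]); // one character per byte, 0..255
  }
  return 'data:' + mimeType + ';base64,' + btoa(binary);
}

// The first four bytes of a JPEG file.
const jpegStart = new Uint8Array([0xff, 0xd8, 0xff, 0xe0]);

// Simulate a UTF-8 text decode of those bytes: none is valid UTF-8 here,
// so each one is replaced by U+FFFD, which re-encodes as EF BF BD.
const asText = new TextDecoder('utf-8').decode(jpegStart);
const corrupted = new TextEncoder().encode(asText);

console.log(bytesToDataUrl(jpegStart, 'image/jpeg'));
// data:image/jpeg;base64,/9j/4A==
console.log(bytesToDataUrl(corrupted, 'image/jpeg'));
// data:image/jpeg;base64,77+977+977+977+9
```

With a plain XMLHttpRequest, setting responseType to 'arraybuffer' before send() and passing new Uint8Array(xhr.response) to the helper avoids the text decode. Whether the Sencha Touch request wrapper exposes that option is not something this sketch confirms.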
Can you try this jsfiddle by Vlad - http://jsfiddle.net/79NnG/
function hexToBase64(str) {
return btoa(String.fromCharCode.apply(null, str.replace(/\r|\n/g, "").replace(/([\da-fA-F]{2}) ?/g, "0x$1 ").replace(/ +$/, "").split(" ")));
}
var img = new Image();
img.src = "data:image/jpeg;base64,"+hexToBase64(getBinary());
alert(hexToBase64(getBinary()));
document.body.appendChild(img);
Also this post should help you - How to display binary data as image - extjs 4