Angular Download Large blobs

I have an issue similar to this one: I am successfully downloading a blob generated by a backend via HTTP GET, but the whole file is buffered in browser memory before the download begins.
Small files are no problem, but files of 100 MB or more do not start downloading immediately.
Subscribing to the GET itself is causing the delay when saving large files.
I'm using Angular 6 with an object store backend. Here's the download function:
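finalDownload(url: string) {
  // HttpHeaders is immutable: append() returns a new instance.
  // token is assumed to be defined elsewhere in the component or service.
  const headers = new HttpHeaders().append('X-Auth-Token', token);
  return this.http.get(url, { headers, responseType: 'blob' })
    .subscribe(response => {
      // saveAs comes from the file-saving library in use; it receives the complete blob.
      saveAs(response);
    });
}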
Here's the process:
1. User hits the download button.
2. GET request with headers is fired to the back end.
3. As soon as I subscribe for the response, the blob is stored in browser memory.
4. When the blob is completely stored in the browser, the saveAs/download begins.
Step 3 is where the issue is.
In devtools, the transferred size climbs to the full file size (108 MB transferred for a 100 MB file) before the download to the filesystem begins.

You can try to use URL.createObjectURL:
The URL interface can be used to construct and parse URLs. URL.createObjectURL() specifically can be used to create a reference to a File or a Blob. As opposed to a base64-encoded data URL, it doesn’t contain the actual data of the object; instead it holds a reference.
The nice thing about this is that it’s really fast. Previously, we’ve had to instantiate a FileReader instance and read the whole file as a base64 data URL, which takes time and a lot of memory. With createObjectURL(), the result is available straight away, allowing us to do things like reading image data to a canvas.
Use the following code as reference
const blob = new Blob([data], { type: 'application/octet-stream' });
this.fileUrl = this.sanitizer.bypassSecurityTrustResourceUrl(window.URL.createObjectURL(blob));
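
To illustrate the difference the quoted passage describes, here is a minimal sketch comparing a base64 data URL with an object URL for the same bytes. The 1 KiB payload and the variable names are assumptions made up for the illustration; they are not part of the original answer.

```javascript
// Hypothetical payload: 1 KiB of arbitrary bytes standing in for a file.
const bytes = new Uint8Array(1024).fill(0x41);

// A base64 data URL embeds the payload itself, so it grows with the data.
const binaryString = String.fromCharCode(...bytes);
const dataUrl = `data:application/octet-stream;base64,${btoa(binaryString)}`;

// An object URL is only a short handle to the Blob held by the runtime.
const blob = new Blob([bytes], { type: 'application/octet-stream' });
const objectUrl = URL.createObjectURL(blob);

console.log(dataUrl.length);   // grows with the payload size
console.log(objectUrl.length); // stays short regardless of payload size

// Release the handle once it is no longer needed so the Blob can be freed.
URL.revokeObjectURL(objectUrl);
```

Note that the Blob itself still lives in memory; only the URL is small.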

Related

NestJS StreamableFile is changing bytes when streaming a pdf

I'm trying to download a PDF file from an S3 bucket, but for security reasons the file must be streamed back to the client (an Angular web app). My plan was to request a download link from S3 and stream the document from that link back to the client using NestJS StreamableFile.
My Controller:
@Post('download')
@ApiOkResponse({
  schema: {
    type: 'string',
    format: 'binary'
  }
})
@ApiProduces('application/pdf')
async documentDownload(@Body() body: DownloadURLRequest) {
  const result = await this.service.getS3DownloadURL(body);
  const got = require('got');
  return new StreamableFile(got.stream(result.downloadUrl));
}
My client receives a Blob that is exactly the length of the original file, but the file fails to load. On closer inspection, roughly 2/3 of the bytes are different.
After a bit of debugging I know for sure the got.stream is generating the correct bytes, so it would appear that StreamableFile is changing or misinterpreting some of the bytes for some reason. Anyone with experience using StreamableFile know why this might be? Or if there is a better way to handle this with NestJS or otherwise I'm open to suggestions.
Edit: Further testing -- printing the stream from the StreamableFile object on the API side shows correct bytes, however the file returned to swagger has incorrect bytes.

Generate HTTP response from raw file data in JavaScript

I need to be able to take the raw data from any file and generate a response object within a service worker.
Short explanation:
I have a website that takes file names, paths, mime types and raw text and stores it in cache. If I make a request to a file with that path and name, a service worker responds with that raw data.
Here is the very basic service worker response code:
self.addEventListener("fetch", event =>
  event.respondWith(caches.match(event.request))
);
This system works fine for HTML, CSS, JS and probably other files, but not for PNGs. I keep getting the browser's "image not found" placeholder.
I have checked that the correct mime type is being sent and the correct data is stored in cache. I have tried putting the data in cache with the text I find in notepad after opening the PNG, and the text result of a fetch request to an actual PNG file, using the .text() method.
[Screenshots: partial view of the PNG data as text in Notepad++ and in the fetch .text() result]
This data in the images is put into cache with this code:
cache.put(
  // cache.put(request, response): the path is the key, the file data is the body
  "filePath/example.html",
  new Response(
    rawFileText,
    {
      status: 200,
      headers: new Headers({
        "content-type": "image/png" + "; charset=utf-8",
      }),
    },
  ),
);
More background info:
This is for a web code editor I am working on. When I want to run code in the editor:
website1 will create an iframe with website2 as source
website1 post messages all the file names, paths, mime types and raw text containing code, images, etc. (by raw data I mean whatever you would see if you opened the file with notepad)
then the iframe (website2) stores the file names and data as cache
when a request is made to any file stored in website2's cache, the service worker responds with whatever data is under the file name
The reason I use 2 different websites is to avoid conflicts with the editor website's files, local storage, and everything else. There could still be conflict with the second website's cache and service worker but that isn't a problem in my case.
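
One possible factor, offered as an assumption rather than a confirmed diagnosis: reading a PNG with .text() decodes it as text, which is lossy for arbitrary binary data. The sketch below builds the cached response from raw bytes instead. It assumes the bytes can be passed around as a Uint8Array or ArrayBuffer (for example from response.arrayBuffer(), sent through postMessage); makeBinaryResponse and the example path are hypothetical names.

```javascript
// Hypothetical helper: wrap raw bytes in a Response suitable for cache.put().
function makeBinaryResponse(bytes, mimeType) {
  return new Response(bytes, {
    status: 200,
    // No charset parameter: the body is bytes, not text.
    headers: { 'content-type': mimeType },
  });
}

// The first eight bytes of every PNG file (the PNG signature).
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const response = makeBinaryResponse(pngSignature, 'image/png');

// In the page that owns the cache this could be stored with something like:
//   cache.put('filePath/example.png', response);
```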

Get Blob object back from blob URL created with URL.createObjectURL

Every question I found told me that the only way to get the object back is to fetch it with an ajax request using the blob:https://www.example.com/0ea6c8a8-732f-42c7-9530-4805c4e785f5 as the destination URL. Aren't blobs saved in my browser's memory, and therefore immediately accessible? The way I understand it, this has nothing to do with the remote server/website: some JS file created the Blob object, generated the blob URL and saved it in memory.
I tried using let blob = await fetch(url).then(r => r.blob()); on several websites, always running into CORS limitations. Perhaps only the script that created it (different domain) is allowed to access it within its context, which is very unfortunate considering the blob content is literally saved in my browser's memory.
I know other methods of accessing the resource the blob points to or contains by observing network requests. That is not what I am asking here. I wish to understand how to unpack the blob URL to access the Blob object inside the browser console, to see how the information was saved in the first place. When it comes to videos, a Blob simply can't contain the actual video because of size and bandwidth constraints, so what does it contain then? The manifest file itself?

See this answer of mine for a way to retrieve the Blob from a blob: URI (you need to run the script there before the Blob is created by the page). Fetching only creates a copy.
The blob: URL is linked to the remote server in that it shares the same origin. So yes, the Blob (binary data) is on your computer, but the URL is only accessible to scripts running from the same origin as the one it was generated from.
And yes, the Blob does contain all the video data, but the string you have is a blob: URL, which is only a pointer to that Blob, itself stored in the browser's memory.
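
The linked answer is not reproduced here. As a rough sketch of the general idea (a script that runs before the page creates its Blob), one option is to wrap URL.createObjectURL and remember which object each URL was created from. blobsByUrl and getBlobFromUrl are hypothetical names, and this is an assumption about how such a script could look, not necessarily the linked answer's code.

```javascript
// Map from blob: URL string to the Blob (or File) it was created from.
// Entries are never removed here, so the recorded Blobs stay in memory.
const blobsByUrl = new Map();

// Keep a reference to the native implementation before replacing it.
const nativeCreateObjectURL = URL.createObjectURL.bind(URL);

URL.createObjectURL = function (object) {
  const url = nativeCreateObjectURL(object);
  // Other object types (e.g. MediaSource in browsers) are passed through untouched.
  if (object instanceof Blob) {
    blobsByUrl.set(url, object);
  }
  return url;
};

// Look up the original Blob for a blob: URL created after the wrapper was installed.
function getBlobFromUrl(url) {
  return blobsByUrl.get(url);
}
```

This only sees URLs created after the wrapper is installed, which is why it has to run before the page's own code.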

Wrong encoding on JavaScript Blob when fetching file from server

Using a FileStreamResult from C# in a SPA website (.NET Core 2, SPA React template), I request a file from my endpoint, which triggers this response in C#:
var file = await _docService.GetFileAsync(token.UserName, instCode.Trim().ToUpper(), fileSeqNo);
string contentType = MimeUtility.GetMimeMapping(file.FileName);
var result = new FileStreamResult(file.File, contentType);
var contentDisposition = new ContentDispositionHeaderValue("attachment");
Response.Headers[HeaderNames.ContentDisposition] = contentDisposition.ToString();
return result;
The returned response is handled using msSaveBlob (specifically for MS browsers, but the problem also occurs when I use createObjectURL in a different browser; I have tried multiple solutions, but none of them seem to work). This is the code I use to send the request and receive the PDF FileStreamResult from the server:
if (window.navigator.msSaveBlob) {
  axios.get(url).then(response => {
    window.navigator.msSaveOrOpenBlob(
      new Blob([response.data], {type: "application/pdf"}),
      filename);
  });
}
The problem is that the returned PDF file that I get has a wrong encoding on it somehow. So the PDF will not open.
I have tried adding encoding to the end of type: {type: "application/pdf; encoding=UTF-8"} which was suggested in different posts, however, it makes no difference.
Comparing a PDF file that I have fetched in a different way, I can clearly see that the encoding is wrong. Most of the special characters are not correct. Indicated by the response header, the PDF file should be in UTF-8, but I have no idea how to actually find out and check.

I don't know axios, but its README suggests that it uses JSON as the default responseType. This may alter the content, since it is then treated as text (axios will probably bail out when it cannot convert the body to an actual JSON object and keep the string/text source as the response data).
A PDF should be loaded as binary data. It can be either 8-bit binary content or 7-bit ASCII, but in both cases it should be treated as a byte stream. From the Adobe PDF reference, sec. 2.2.1:
PDF files are represented as sequences of 8-bit binary bytes.
A PDF file is designed to be portable across all platforms and
operating systems. The binary representation is intended to be
generated, transported, and consumed directly, without translation
between native character sets, end-of-line representations, or other
conventions used on various platforms. [...].
Any PDF file can also be represented in a form that uses only 7-bit
ASCII [...] character codes. This is useful for the purpose of
exposition, as in this book. However, this representation is not
recommended for actual use, since it is less efficient than the normal
binary representation. Regardless of which representation is
used, PDF files must be transported and stored as binary files,
not as text files. [...]
So to avoid the conversion that happens, I would suggest specifying the configuration entry responseType when doing the request:
axios.get(url, {responseType: "arraybuffer"}) ...
or in this form:
axios({
  method: 'get',
  url: url,
  responseType: 'arraybuffer'
})
  .then( ... )
You can also go directly to response-type blob if you are sure the mime-type is preserved in the process.
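
To see why handling the body as text matters, here is a small sketch that round-trips a few bytes through a UTF-8 decode and encode. The sample bytes (a PDF-style header followed by four bytes above 0x7F) are an illustrative assumption, and the TextDecoder step only stands in for whatever text handling the HTTP client performs.

```javascript
// Sample bytes: "%PDF-1.4\n%" followed by four bytes above 0x7F,
// similar to the binary marker line many PDF files start with.
const original = new Uint8Array([
  0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x34, 0x0a, 0x25,
  0xe2, 0xe3, 0xcf, 0xd3,
]);

// Handling the response as text means decoding the bytes as a string.
// Byte sequences that are not valid UTF-8 become replacement characters.
const asText = new TextDecoder('utf-8').decode(original);

// Putting that string into a Blob later re-encodes it as UTF-8.
const roundTripped = new TextEncoder().encode(asText);

// Keeping the data binary end to end preserves every byte.
const blobFromBytes = new Blob([original], { type: 'application/pdf' });

console.log(original.length, roundTripped.length, blobFromBytes.size);
```

The ASCII part of the header survives the round trip, while the high bytes do not, which matches the symptom of special characters being wrong.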

FileWriter API: use blob to write data

I have a blob URL like blob:blahblah that points to a file. I want to write the file behind this blob to the local filesystem. The writer.write() documentation says it accepts a file object (from input-type-file) and a blob, but it throws a type mismatch error when I try this:
fileEntry.createWriter(function(writer) {
  writer.write(blob); // blob is a variable holding the blob URL
});
I know the problem is that the blob is not accepted, but I would like to know how I can store a blob to the filesystem. I created the blob earlier in the script from an input-type-file and stored its value in a variable.
EDIT
OK, so I think I should have given more code in the first place.
First I create a blob URL and store it in a variable like this:
files[i]['blob'] = window.webkitURL.createObjectURL(files[i]);
files comes from an input-type-file HTML tag and i loops over the number of files. You know the drill.
Then the variable goes through a number of hops: first through Chrome's message passing API to another page, then from that page to a worker via postMessage, and finally back to the parent page via postMessage again.
On the final page I intend to use it to store the blob's file to the local filesystem via the File System API, like this:
//loop code
fileSystem.root.getFile(files[i]['name'], {create: true}, function(fileEntry) {
  fileEntry.createWriter(function(writer) {
    writer.write(files[i]['blob']);
  });
});
//loop code
But writer.write throws Uncaught Error: TYPE_MISMATCH_ERR: DOM File Exception 11.
I believe this error occurs because the variable supplied to writer.write is text (the URL returned by createObjectURL, after passing through multiple pages/scopes) and not a Blob object or something built with window.WebKitBlobBuilder. So how can a blob's URL be used to store a file?

From your edited code snippet and description, it sounds like you're writing the blob URL to the filesystem rather than the File itself (i.e. files[i]['blob'] is a URL string). Instead, pass the File object itself along the chain main page -> other page -> worker -> main page. In recent versions (of Chrome at least), your round trip is now possible: File objects can be passed to window.postMessage(), whereas before, the browser serialized the argument into a string.
With createObjectURL() you only 'fashion' a handle/reference to a Blob. There's not really a way to go from a blob URL back to a Blob. So in short, there is no need to call createObjectURL(). Just pass around files[i] directly.
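
The distinction the answer draws can be made explicit with a small guard, sketched here under the assumption that it is called right before writer.write(). requireBlob is a hypothetical helper name, not part of any File System API.

```javascript
// Hypothetical guard: accept only real Blob/File objects, never blob: URL strings.
function requireBlob(value) {
  if (typeof value === 'string') {
    throw new TypeError('Expected a Blob or File, got a URL string: ' + value);
  }
  if (!(value instanceof Blob)) {
    throw new TypeError('Expected a Blob or File');
  }
  return value;
}

// A Blob standing in for one entry of the file input's files list.
const file = new Blob(['example contents'], { type: 'text/plain' });
const blobUrl = URL.createObjectURL(file);

console.log(typeof blobUrl);             // "string"
console.log(requireBlob(file) === file); // true
```

In the loop from the question this would amount to something like writer.write(requireBlob(files[i])), since a File is also a Blob.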
