I am aware that a DOCX file is essentially a zip full of XML files. Is there any simple way of using the chrome.fileSystem storage API to save a DOCX file, or would I have to create and package the XML files manually?
chrome.fileSystem.chooseEntry({
  type: 'saveFile',
  accepts: [
    { extensions: ['docx'] },
    { extensions: ['txt'] },
  ]
}, function(writableFileEntry) {
  var ext = writableFileEntry.name.substr(writableFileEntry.name.lastIndexOf('.') + 1);
  var text = document.getElementById("textarea").innerText;
  if (ext == 'docx') {
    ... ?
  } else if (ext == 'txt') {
    writableFileEntry.createWriter(function(writer) {
      writer.write(new Blob([text], {type: 'text/plain'}));
    }, function() {
      window.alert('Saving file failed');
    });
  }
});
The fileSystem API has nothing to do with specific file formats or compression; that is outside its scope. It provides raw file access.
To do any kind of content formatting, you need to find an extra library or do it yourself, and then feed the resulting binary to fileSystem.
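If you go the do-it-yourself route, the core of a DOCX is a `word/document.xml` file in WordprocessingML. Below is a minimal sketch of building that XML payload; `buildDocumentXml` is a hypothetical helper, and you would still need a zip library such as JSZip (plus the `[Content_Types].xml` and `_rels/.rels` parts) to package a valid .docx before handing the resulting blob to fileSystem.

```javascript
// Minimal WordprocessingML body for a DOCX's word/document.xml.
// This is only the XML payload; zipping it into a real .docx (together
// with [Content_Types].xml and _rels/.rels) still needs a zip library.
function buildDocumentXml(text) {
  // Escape the XML special characters so arbitrary text is safe.
  const escaped = String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
  return '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>' +
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">' +
    '<w:body><w:p><w:r><w:t>' + escaped + '</w:t></w:r></w:p></w:body>' +
    '</w:document>';
}
```

A ready-made library (e.g. the `docx` npm package) hides all of this, which is usually the easier option.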
I am writing a site using the Remix JavaScript framework.
It is a blog that uses MDX for the blog content, and I want to include images within this content.
Background
I am using a combination of mdx-bundler to convert the MDX files into bundles, and the remark-mdx-images plugin to convert the markdown-style image references into image tags.
The Problem:
mdx-bundler is correctly building my Markdown and is outputting the image assets into the expected directory. However, the output images appear to be corrupt: I cannot open them in the macOS Finder, and in-browser, requests for these images return a 200, but the image is not visible.
Here is an example MDX file:
---
slug: hello-world
title: Hello, world
---
testing images
![](./hello-world.png 'Hello World')
Below is the relevant portion of remix config where these two things are used:
const { code, frontmatter } = await bundleMDX({
  source: indexFile.content,
  files: filesObject,
  mdxOptions: options => ({
    remarkPlugins: [
      ...(options.remarkPlugins ?? []),
      remarkMdxImages,
      remarkSlug,
      remarkGfm,
    ],
    rehypePlugins: [
      ...(options.rehypePlugins ?? []),
      [rehypeAutolinkHeadings, { behavior: 'wrap' }],
      [rehypePrettyCode, rehypePrettyCodeOptions],
    ],
  }),
  esbuildOptions: options => {
    options.loader = {
      ...options.loader,
      '.gif': 'file',
      '.jpeg': 'file',
      '.jpg': 'file',
      '.png': 'file',
      '.svg': 'file',
      '.webp': 'file',
    }
    // Set write to true so that esbuild will output the files.
    options.write = true
    options.outdir = path.resolve(`public/build/_assets/${rootDir}/img`)
    options.publicPath = path.resolve(`/build/_assets/${rootDir}/img`)
    return options
  },
})
I'm not entirely sure how to go about debugging this, as I am new to MDX, mdx-bundler, and remark-mdx-images. All of the output files and locations appear to be correct; it's just the image assets themselves that are corrupted.
I have a short script to OCR JPG files by converting them to Google Docs. It works fine for JPGs around 5 MB, but for a 600 dpi scan, where the file size is more like 15 MB, I get the following error for a single image:
5:40:58 PM Notice Execution started
5:41:03 PM Error
GoogleJsonResponseException: API call to drive.files.insert failed with error: Request Too Large
convertJpegToGdoc # convertJpegToGdoc.gs:27
The relevant line of code is:
Drive.Files.insert({title: file.getName(), parents: [{id: dstFolderId}]
}
I am aware of the quotas (Quotas for Google Services); the error I am getting is not one of these, and the timestamps show that the script is not exceeding the 6 minutes listed in the docs. Incidentally, I can convert multiple images, each approx. 1.5 MB, with 24-character JPG file basenames, into Google Docs without problems using this script.
The Google Drive API JavaScript example for inserting docs suggests, perhaps, that I may need to upgrade my code to handle larger files, but I am not sure where to start.
Any suggestion appreciated.
Full code:
// this function does OCR while copying from ocrSource to ocrTarget
function convertJpegToGdoc() {
  var files = DriveApp.getFolderById(srcFolderId).getFilesByType(MimeType.JPEG);
  while (files.hasNext()) {
    var file = files.next();
    Drive.Files.insert({title: file.getName(), parents: [{id: dstFolderId}]},
      file.getBlob(), {ocr: true});
  }
  // this moves files from ocrSource to ocrScriptTrash
  // handy for file counting & keeping ocrSource free for next batch of files
  var inputFolder = DriveApp.getFolderById(srcFolderId);
  var processedFolder = DriveApp.getFolderById(trshFolderId);
  var files = inputFolder.getFiles();
  while (files.hasNext()) {
    var file = files.next();
    file.moveTo(processedFolder);
  }
}
I believe your goal is as follows.
You want to convert a JPEG to a Google Document with OCR using the Drive API.
When I saw your question, I remembered experiencing the same situation. At that time, even when a resumable upload was used, the same Request Too Large error couldn't be avoided. I considered the file size, the image dimensions, the resolution, and so on as possible causes, but I couldn't find a clear reason. So, in that case, I used the following workaround.
My workaround is to reduce the image size, which reduces the file size as well; this removed the issue for me. In this answer, I would like to propose this workaround.
When your script is modified, it becomes as follows.
From:
Drive.Files.insert({title: file.getName(), parents: [{id: dstFolderId}]},
  file.getBlob(), {ocr: true});
To:
try {
  Drive.Files.insert({ title: file.getName(), parents: [{ id: dstFolderId }] }, file.getBlob(), { ocr: true });
} catch ({ message }) {
  if (message.includes("Request Too Large")) {
    const link = Drive.Files.get(file.getId()).thumbnailLink.replace(/=s.+/, "=s2000");
    Drive.Files.insert({ title: file.getName(), parents: [{ id: dstFolderId }] }, UrlFetchApp.fetch(link).getBlob(), { ocr: true });
  }
}
In this modification, when the Request Too Large error occurs, the image size is reduced by modifying the thumbnail link. In this sample, the horizontal size is 2000 pixels, keeping the aspect ratio.
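The trick rests on the `=sNNN` suffix that Drive appends to a thumbnailLink, which sets the maximum dimension in pixels. A tiny sketch of that rewrite (the helper name and the example URL are made up; the regex is the same one used above):

```javascript
// Rewrites the size suffix ("=sNNN") of a Drive thumbnailLink so the
// fetched image is capped at maxPixels on its longest side.
function resizeThumbnailLink(link, maxPixels) {
  return link.replace(/=s.+/, '=s' + maxPixels);
}
```

So a link ending in `=s220` becomes one ending in `=s2000`, and fetching it returns a proportionally scaled-down image.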
Note:
This modified script supposes that the Drive API has already been enabled in Advanced Google services. Please be careful about this.
Added:
Your script in your question is as follows.
// this function does OCR while copying from ocrSource to ocrTarget
function convertJpegToGdoc() {
  var files = DriveApp.getFolderById(srcFolderId).getFilesByType(MimeType.JPEG);
  while (files.hasNext()) {
    var file = files.next();
    Drive.Files.insert({title: file.getName(), parents: [{id: dstFolderId}]},
      file.getBlob(), {ocr: true});
  }
  // this moves files from ocrSource to ocrScriptTrash
  // handy for file counting & keeping ocrSource free for next batch of files
  var inputFolder = DriveApp.getFolderById(srcFolderId);
  var processedFolder = DriveApp.getFolderById(trshFolderId);
  var files = inputFolder.getFiles();
  while (files.hasNext()) {
    var file = files.next();
    file.moveTo(processedFolder);
  }
}
In my answer, I proposed the following modification.
From:
Drive.Files.insert({title: file.getName(), parents: [{id: dstFolderId}]},
  file.getBlob(), {ocr: true});
To:
try {
  Drive.Files.insert({ title: file.getName(), parents: [{ id: dstFolderId }] }, file.getBlob(), { ocr: true });
} catch ({ message }) {
  if (message.includes("Request Too Large")) {
    const link = Drive.Files.get(file.getId()).thumbnailLink.replace(/=s.+/, "=s2000");
    Drive.Files.insert({ title: file.getName(), parents: [{ id: dstFolderId }] }, UrlFetchApp.fetch(link).getBlob(), { ocr: true });
  }
}
But when I look at your current script, it is as follows.
// convertJpegToGdoc.js - script converts .jpg to .gdoc files
// Google Script Project - ocrConvert https://script.google.com/home/projects/1sDHfmK4H19gaLxxtXeYv8q7dql5LzoIUHto-OlDBofdsU2RyAn_1zbcr/edit
// clasp location C:\Users\david\Google Drive\ocrRollConversion
// Begin with empty folders (see below)
// Transfer a set of Electoral Roll .JPG from storage into ocrSource folder
// Running this script performs OCR conversions on the .JPG files
// .JPG files are converted to .GDOC & stored in ocrTarget
// The .JPG ocrSource files are transferred to ocrScriptTrash leaving the ocrSource folder empty if all goes well
// Uses Google Drive root folders (~\Google Drive\)
// 1. ocrSource
// 2. ocrTarget
// 3. ocrScriptTrash
// to check Id value open folder in Google Drive then examine URL
let srcFolderId = "###"; //ocrSource
let dstFolderId = "###"; //ocrTarget
let trshFolderId = "###"; //ocrScriptTrash
// this function does OCR while copying from ocrSource to ocrTarget (adjusted try/catch for larger jpgs)
function convertJpegToGdocRev1() {
  var files = DriveApp.getFolderById(srcFolderId).getFilesByType(MimeType.JPEG);
  try {
    Drive.Files.insert({ title: file.getName(), parents: [{ id: dstFolderId }] }, file.getBlob(), { ocr: true });
  } catch ({ message }) {
    if (message.includes("Request Too Large")) {
      const link = Drive.Files.get(file.getId()).thumbnailLink.replace(/=s.+/, "=s2000");
      Drive.Files.insert({ title: file.getName(), parents: [{ id: dstFolderId }] }, UrlFetchApp.fetch(link).getBlob(), { ocr: true });
    }
  }
  // this moves files from ocrSource to ocrScriptTrash
  // handy for file counting & keeping ocrSource free for next batch of files
  var inputFolder = DriveApp.getFolderById(srcFolderId);
  var processedFolder = DriveApp.getFolderById(trshFolderId);
  var files = inputFolder.getFiles();
  while (files.hasNext()) {
    var file = files.next();
    file.moveTo(processedFolder);
  }
}
Unfortunately, it seems that you miscopied my proposed answer. In your current script, the loop over var files = DriveApp.getFolderById(srcFolderId).getFilesByType(MimeType.JPEG); is not used, so file is never defined, and that raises an error. Without the try-catch you would see that error; because of the try-catch, the error is swallowed, and running the script gives "No result. No errors." I think that is the reason for your current issue.
In order to use my proposed modification, please copy and paste the script correctly. With that, the modified script is as follows.
Modified script:
Please enable Drive API at Advanced Google services.
let srcFolderId = "###"; //ocrSource
let dstFolderId = "###"; //ocrTarget
let trshFolderId = "###"; //ocrScriptTrash

function convertJpegToGdocRev1() {
  var files = DriveApp.getFolderById(srcFolderId).getFilesByType(MimeType.JPEG);
  while (files.hasNext()) {
    var file = files.next();
    var name = file.getName();
    console.log(name); // You can see the file name in the log.
    try {
      Drive.Files.insert({ title: name, parents: [{ id: dstFolderId }] }, file.getBlob(), { ocr: true });
    } catch ({ message }) {
      if (message.includes("Request Too Large")) {
        const link = Drive.Files.get(file.getId()).thumbnailLink.replace(/=s.+/, "=s2000");
        Drive.Files.insert({ title: name, parents: [{ id: dstFolderId }] }, UrlFetchApp.fetch(link).getBlob(), { ocr: true });
      }
    }
  }
  var inputFolder = DriveApp.getFolderById(srcFolderId);
  var processedFolder = DriveApp.getFolderById(trshFolderId);
  var files = inputFolder.getFiles();
  while (files.hasNext()) {
    var file = files.next();
    file.moveTo(processedFolder);
  }
}
I have a file that I upload using antd's Upload.
The HTML renderer:
<Upload
  beforeUpload={(file: RcFile, fileList: RcFile[]): boolean => {
    this.requestUpload(file, fileList.length || 0);
    return false;
  }}
></Upload>
The code part:
requestUpload(file: RcFile, nbFile: number): void {
  const r = new FileReader();
  r.onload = (): void => {
    FileHelper.uploadFile({
      filename: file.name,
      filepath: `${this.props.datastoreId}/${this.props.itemId}/${this.props.fieldId}/${file.name}`,
      file: r.result,
      field_id: this.props.fieldId,
      item_id: this.props.itemId || '',
      d_id: this.props.datastoreId || '',
      p_id: this.props.projectId || '',
      display_order: nbFile
    }).subscribe();
  };
  r.readAsArrayBuffer(file);
}
So I get an RcFile (which just extends the File type). From that point, I don't know what to do to get the raw binary of the file. My API only works with raw binary and nothing else, so I need that file: r.result to be pure raw binary data.
I found other Stack Overflow questions, but they all say how it should be done (using base64 or something else), not what to do if you have no option to change the API.
How can I achieve this?
According to the file-upload tool you linked (ng-file-upload), you should first "Ask questions on StackOverflow under the 'ng-file-upload' tag." So, add that tag to this post.
Then if I Ctrl+F for "binary" in the docs, I see this:
Upload.http({
  url: '/server/upload/url',
  headers: {
    'Content-Type': file.type
  },
  data: file
})
It looks like they're passing a file object as the data, with whatever the file type is in the header. I haven't tried this, though...
I want to know if it is possible to create a file object (name, size, data, ...) in NodeJS from the path of an existing file. I know that it is possible on the client side, but I see nothing for NodeJS.
In other words, I want the same function to work in NodeJS:
function srcToFile(src, fileName, mimeType) {
  return fetch(src)
    .then(function(res) { return res.arrayBuffer(); })
    .then(function(buf) { return new File([buf], fileName, {type: mimeType}); });
}

srcToFile('/images/logo.png', 'logo.png', 'image/png')
  .then(function(file) {
    console.log(file);
  });
And the output will be like:
File {name: "logo.png", lastModified: 1491465000541, lastModifiedDate: Thu Apr 06 2017 09:50:00 GMT+0200 (Paris, Madrid (heure d’été)), webkitRelativePath: "", size: 49029, type:"image/png"…}
For those that are looking for a solution to this problem, I created an npm package to make it easier to retrieve files using Node's file system and convert them to JS File objects:
https://www.npmjs.com/package/get-file-object-from-local-path
This solves the lack of interoperability between Node's fs file system (which the browser doesn't have access to), and the browser's File object type, which Node cannot create.
3 steps are required:
1. Get the file data in the Node instance and construct a LocalFileData object from it
2. Send the created LocalFileData object to the client
3. Convert the LocalFileData object to a File object in the browser.
// Within node.js
const fileData = new LocalFileData('path/to/file.txt')
// Within browser code
const file = constructFileFromLocalFileData(fileData)
So, I searched through file system APIs and other possibilities, and found nothing.
I decided to create my own file object with JSON.
// Assumed requires for this snippet (mmm is the mmmagic package):
var fs = require('fs');
var path = require('path');
var mmm = require('mmmagic');

var imagePath = path.join('/images/logo.png', 'logo.png');
if (fs.statSync(imagePath)) {
  var bitmap = fs.readFileSync(imagePath);
  var bufferImage = Buffer.from(bitmap); // new Buffer() is deprecated
  var Magic = mmm.Magic;
  var magic = new Magic(mmm.MAGIC_MIME_TYPE);
  magic.detectFile(imagePath, function(err, result) {
    if (err) throw err;
    var datas = [{"buffer": bufferImage, "mimetype": result, "originalname": path.basename(imagePath)}];
    var JsonDatas = JSON.parse(JSON.stringify(datas));
    log.notice(JsonDatas);
  });
}
The output :
{
buffer:
{
type: 'Buffer',
data:
[
255,
216,
255
... 24908 more items,
[length]: 25008
]
},
mimetype: 'image/png',
originalname: 'logo.png'
}
I think it is probably not the best solution, but it gives me what I want. If you have a better solution, you are welcome to share it.
You can use arrayBuffer (that's what I did to make a downloadable PDF) or createReadStream / createWriteStream under fs (FileSystem objects).
I'm using ExtJs 4.2.3.
In my web application I need to download files.
In order to do that I'm using a 'FileDownloader' component defined as in this post:
fileDownloader
The code is:
Ext.define('myApp.FileDownload', {
  extend: 'Ext.Component',
  alias: 'widget.FileDownloader',
  autoEl: {
    tag: 'iframe',
    cls: 'x-hidden',
    src: Ext.SSL_SECURE_URL
  },
  load: function(config) {
    var e = this.getEl();
    e.dom.src = config.url +
      (config.params ? '?' + Ext.Object.toQueryString(config.params) : '');
    console.log('in FileDownloader - src: ' + e.dom.src);
    e.dom.onLoad = function() {
      if (e.dom.contentDocument.body.childNodes[0].wholeText == '404') {
        Ext.Msg.show({
          title: 'Attachment missing',
          msg: 'File cannot be found on the server !',
          buttons: Ext.Msg.OK,
          icon: Ext.MessageBox.ERROR
        });
      }
    };
  }
});
I call it by code:
downloader.load({
  url: src
});
where src is complete path to file.
If I download Word, Excel, or PDF files it works well and the file reference is shown in the browser's download bar, but with other data types (e.g. .txt, .jpg) it does nothing.
I think it's not a question about ExtJS or JS; it's about the server's response. See this: download the text file instead of opening in the browser
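To expand on that: whether a browser shows a file inline or puts it in the download bar is decided by the server's Content-Disposition header, not by the client code. A hedged sketch of the headers a download endpoint would typically send (the helper name is made up for illustration):

```javascript
// Response headers that force a download instead of inline display.
// "Content-Disposition: attachment" is what makes .txt and .jpg land in
// the download bar, just like the Word/Excel/PDF responses already do.
function buildDownloadHeaders(filename, mimeType) {
  return {
    'Content-Type': mimeType || 'application/octet-stream',
    'Content-Disposition': 'attachment; filename="' + filename + '"'
  };
}
```

Server-side, you would set these headers on the response before streaming the file body; the ExtJS iframe then receives a download for every type.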