extract text from pdf file using only javascript [closed] - javascript

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
How can I extract data from a PDF file using only JavaScript, on the client side, in any browser?

pdf.js is a JavaScript pdf reader:
http://mozilla.github.com/pdf.js/
Some similar projects:
for docx and xlsx: http://blog.innovatejs.com/?p=184
jsPDF is a pdf generator: https://github.com/MrRio/jsPDF
If you are asking how to load the file, this can be done via an ajax request, but you won't be able to directly read the file content.
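For what it's worth, pdf.js can also hand back the text fragments of each page through `getTextContent()`. Below is a minimal sketch, assuming a pdf.js build is already loaded and passed in as `pdfjsLib`; the function name `extractPdfText` and the injected parameter are illustrative choices, not part of the library. The fragments come back in drawing order, so do not expect clean paragraphs.

```javascript
// Sketch: collect the text of every page of a PDF using pdf.js.
// `pdfjsLib` is injected so the function makes no assumption about how the
// library was loaded (script tag, bundler, etc.).
async function extractPdfText(pdfjsLib, source) {
  const pdf = await pdfjsLib.getDocument(source).promise;
  const pages = [];
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();
    // Each item is one positioned text fragment; joining with spaces
    // discards the layout information.
    pages.push(content.items.map((item) => item.str ?? "").join(" "));
  }
  return pages;
}
```

Usage would look something like `extractPdfText(pdfjsLib, "file.pdf").then((pages) => console.log(pages[0]))`, where the file name is a placeholder.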

What you're asking is practically impossible.
PDF is a heavyweight format optimised towards efficient display of large complex documents, not towards further processing. (In fact, PDF documents primarily consist of letter shapes and other graphics absolutely positioned on pages. Any data representing "paragraphs of text" is an optional feature of tagged PDFs.)
Text extraction tends to be a feature of (usually expensive) PDF libraries, and to the best of my knowledge no such library exists for Javascript. Scribd and Google Docs do this, but they probably don't share how, and my guess is they do this on the server side.
tl;dr: PDF, as a format, is terrible for this. Unless basically the entire point of your application is extracting text from PDFs, your time would be better spent on figuring out how to not have to do it.

Related

How to edit underlying xml files of a .docx archive? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 days ago.
Basically I need to generate dynamic .docx files at runtime. I initially wanted to generate them on the client side to lighten the server workload, since all the necessary data is on the client side. But to simplify the problem, I decided at first to do everything on the server side with Node.JS.
My first attempt was to use docx.js to create the document from scratch before sending it to the client to download. This part is simple and works fine, but the problem is that I have to create more than 10 templates that are quite complicated in design. I was thinking of loading an existing document with docx.js and modifying it afterwards, but this library does not allow that.
Then I opened a .docx file and I realized that its underlying structure was an archive containing several XML files, including a document.xml which seems at first sight to contain the text of my document.
So I had the idea to create my templates in OpenOffice or a similar tool, and to write functions on my server that open these archives, modify the values in the document.xml file, repack the .docx archive and send it using response.setHeader("Content-type", "application/vnd.openxmlformats-officedocument.wordprocessingml.document") and response.end(blob, "binary").
Any ideas on how I could edit the underlying .xml files in the .docx archive?
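One possible shape for the "modify the values" step, as a rough sketch: treat document.xml (stored as word/document.xml inside the archive) as a string and substitute placeholders in it. The `{{name}}` placeholder syntax and both function names are assumptions made up for this example. Unzipping and re-zipping the archive is not shown and would need a zip library. Also note that Word-like editors may split a single placeholder across several text runs, in which case a plain string replace will not find it.

```javascript
// Escape the characters that are special inside XML text nodes.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Replace {{name}} tokens (a hypothetical placeholder syntax) with escaped
// values. Unknown placeholders are left untouched so they are easy to spot.
function fillDocumentXml(xml, values) {
  return xml.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? escapeXml(values[key])
      : match
  );
}
```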

create PDF file in JS [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am developing a web application using client side MVC. The technology stack is Backbone JS, HTML5 and Spring at the Server side.
I have a requirement where I do some data manipulation on the client side in JavaScript, using JSON files kept directly on the web server as the data payload. This is a high-traffic part of the app and I don't want to hit my app server for such simple data manipulations.
Now on the same module - I have a requirement where I need to generate a PDF file which effectively contains a static template and then I need to fill the template using effectively the same data that I already have at client side. I need to generate the PDF and let the user download it.
Any idea how I can achieve this completely on the client side in a clean and robust way?
From what I understand, you have an HTML template that you want to fill with certain data and then render as a PDF?
For the client side, have you tried looking at something like jsPDF (http://parall.ax/products/jspdf)? The HTML renderer is still in its early stages but it seems to work decently.
As Bogdan pointed out, a backend solution is also possible. You could look at pd4ml (http://pd4ml.com/) or even call into something like pandoc (http://johnmacfarlane.net/pandoc/) or even phantomjs (http://phantomjs.org/) to perform the conversion and then pass the generated pdf back to the client.
If it is not an HTML template, I am sure a number of the above solutions should work regardless.
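If the template is just a fixed set of labelled fields, the client-side route could look roughly like the sketch below. It assumes a jsPDF document object is passed in as `doc` and only relies on its `text(string, x, y)` method; the `fields` layout format and the function name `fillPdfTemplate` are invented for this example.

```javascript
// Sketch: write "Label: value" lines at fixed positions on a PDF document.
// `fields` is a hypothetical template format: [{ label, key, x, y }, ...].
// `doc` is expected to behave like a jsPDF instance (only doc.text is used).
function fillPdfTemplate(doc, fields, data) {
  for (const field of fields) {
    // Missing data is rendered as an empty value rather than "undefined".
    const value = data[field.key] ?? "";
    doc.text(`${field.label}: ${value}`, field.x, field.y);
  }
  return doc;
}
```

In the browser this would be used along the lines of `fillPdfTemplate(new jsPDF(), fields, data).save("report.pdf")`, with the file name being a placeholder.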

Is creating HTML content using JavaScript good or bad? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
In one web application I have seen, the entire dashboard HTML content is created using JavaScript methods. Why would they do this? Generating HTML tags with JavaScript takes more time, and the result is hard to debug and edit.
Can someone explain the advantages and disadvantages of this approach?
eg.
var ntfytab = new com.xyz.HTM('table',ntfy).init({width:'100%',id:"myTable02"});
var ntfyhad = new com.xyz.HTM('thead',ntfytab).init({});
new com.xyz.HTM('tr',ntfyhad).init({inner : new com.xyz.HTM('th',null)
.init({inner : 'Message'})});
If it is needed then it is ok, because almost all of the javascript libraries (jquery, prototype etc) plugins generate html in their code to be inserted in the page.
It is not a problem, it is just difficult and complex to understand :P
The disadvantage is that search engines don't read JavaScript.
There are lots of advantages. For example if you need to generate and display some html based on some conditions, some events etc then you can do it in javascript.
Think about image sliders, javascript based text editors and other animations effects based on javascript.
Tags created with JavaScript are not rendered by search engines. Or the authors may want some sort of privacy control, so that a layperson cannot simply copy their code.
It can be a suitable solution to output dynamic interfaces, eg tables where the number of rows etc may vary considerably (to output data received via ajax for example).
I would recommend creating HTML with JS only for dynamic content.
For static content, always use your HTML files, which are rendered before the JS code is executed!
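To illustrate the dynamic-table case mentioned above, here is a small sketch that builds table markup from data whose row count is not known in advance (for example, rows received via ajax). The function names and the row format (plain objects keyed by column name) are assumptions for this example.

```javascript
// Escape values so that data cannot inject markup into the page.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a complete <table> string for the given column names and row objects.
function buildTableHtml(columns, rows) {
  const head = columns.map((c) => `<th>${escapeHtml(c)}</th>`).join("");
  const body = rows
    .map((row) => {
      const cells = columns
        .map((c) => `<td>${escapeHtml(row[c] ?? "")}</td>`)
        .join("");
      return `<tr>${cells}</tr>`;
    })
    .join("");
  return `<table><thead><tr>${head}</tr></thead><tbody>${body}</tbody></table>`;
}
```

The resulting string could then be assigned to the `innerHTML` of a container element once the data has arrived.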

Building a JavaScript grid from scratch [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I am curious to know what it takes to build a JavaScript grid from scratch. The grid should have features like jqGrid http://www.trirand.com/blog/jqgrid/jqgrid.html.
Can anyone please give me inputs?
Thanks
What it takes to build something similar to jqGrid:
A huge, HUGE amount of time.
If something similar to what you want exists already, why would you want to spend lots and lots of time re-inventing the wheel? Anyhow, if you have nothing better to do, want to learn from it or if you are just curious, here is a list of skills that are needed to create a similar system:
1. HTML object manipulation.
2. Style manipulation.
3. Tons of different event handlers.
4. AJAX to grab (pages of) documents to display. Probably some server-side stuff too...
5. Creating a nice layout system which works in every browser.
6. Creating handlers to read and manage the different file types to support (XML, JSON, etc.).
7. Creating HTML forms, reading them out with JS and then using AJAX to save an XML, JSON, etc. document back to the server.
8. An algorithm to allow searching in the data you display.
9. Keyboard manipulation and the toggling off of standard key events.
10. Tons and TONS of debugging to make sure it looks nice in all browsers.
Of course, this is only the tip of the iceberg, since I don't really know the jqGrid program myself. I created this list by looking at some of the examples and reading the Features page.
Again, I would not recommend rebuilding such a big system from scratch, but the choice is of course yours ;).
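To give a feel for just one item on that list (searching the displayed data, plus the sorting and paging a grid usually pairs with it), here is a minimal sketch of the data side only. The function name `queryRows` and its option names are made up for this example and have nothing to do with how jqGrid does it internally.

```javascript
// Sketch: filter, sort and page an array of row objects.
// Returns the total number of matches and the rows for the requested page.
function queryRows(rows, options = {}) {
  const {
    search = "",
    sortKey = null,
    descending = false,
    page = 1,
    pageSize = 10,
  } = options;

  // Case-insensitive substring match against every value in the row.
  const needle = search.toLowerCase();
  const matches = rows.filter(
    (row) =>
      needle === "" ||
      Object.values(row).some((v) => String(v).toLowerCase().includes(needle))
  );

  // filter() returned a new array, so sorting here leaves `rows` untouched.
  if (sortKey !== null) {
    matches.sort((a, b) => {
      const order = a[sortKey] < b[sortKey] ? -1 : a[sortKey] > b[sortKey] ? 1 : 0;
      return descending ? -order : order;
    });
  }

  const start = (page - 1) * pageSize;
  return { total: matches.length, rows: matches.slice(start, start + pageSize) };
}
```

Everything else on the list (rendering, events, editing, cross-browser layout) still sits on top of this, which is where most of the time goes.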

looking for a wysiwyg editor with some particular features [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
Not entirely sure if something like this exists or not. If it does, I'm at a loss as to what to search for to find it.
I need an in-page/inline (jQuery?) WYSIWYG editor with templating, so I can load in a basic (preferably uneditable) layout that the user then fills in.
Image uploading and a gallery, so a user can either upload a picture or insert one from a selection within a gallery.
Ability to convert the contents for storage and retrieval in a DB (if it's all text/HTML), or even better a way to save the entirety as a flattened image.
I know this is a tall order. I've seen the individual pieces in different editors, but not all in one. Perhaps I'm searching for the wrong thing, but any help here would be great.
I have recently used CKEditor and as far as I can see it is capable of doing everything you need:
Templating: No problem, see Templating
Image Uploading: Possible. It's described here. Though they use their own (not free) file manager, AFAIR. But writing a simple file browser with jQuery isn't really a hard task (see jQuery file tree as a simple example). You just need to integrate it with the CKEditor file upload dialog (all of this is described in the docs quite well). Of course, you can also spend some money to use their file manager and not worry about integrating your own.
Storage: CKEditor creates pure HTML markup. I don't know why there should be a problem storing this markup in your database of choice.
