Reading pdf from url with node.js using PDF.js - javascript

I'm trying to extract the text of a PDF from the PDF's URL. Following the example on the pdf.js website, I understand how to render a PDF on the client side, but I'm running into issues when I do this server-side.
I downloaded the package using npm i pdfjs-dist
I tried the code below as a simple example to load the pdf:
var url = 'https://raw.githubusercontent.com/mozilla/pdf.js/ba2edeae/examples/learning/helloworld.pdf';
var pdfjsLib = require("pdfjs-dist")
var loadingTask = pdfjsLib.getDocument(url);
loadingTask.promise.then(function (pdf) {
  console.log(pdf);
}).catch(function (error) {
  console.log(error);
});
But when I run this, I get the following error:
message: 'The browser/environment lacks native support for critical functionality used by the PDF.js library (e.g. `ReadableStream` and/or `Promise.allSettled`); please use an ES5-compatible build instead.',
name: 'UnknownErrorException',
details: 'Error: The browser/environment lacks native support for critical functionality used by the PDF.js library (e.g. `ReadableStream` and/or `Promise.allSettled`); please use an ES5-compatible build instead.'
Any ideas on how to go about doing this? All I'm trying to do is extract the text of a PDF from its URL, and I'm trying to do this server-side using Node.js. Appreciate any input!

You need to import the es5 build of pdf.js. The code below should work:
var pdfjsLib = require("pdfjs-dist/es5/build/pdf.js");
var url = 'https://raw.githubusercontent.com/mozilla/pdf.js/ba2edeae/examples/learning/helloworld.pdf';
var loadingTask = pdfjsLib.getDocument(url);
loadingTask.promise.then(function (pdf) {
  console.log(pdf);
}).catch(function (error) {
  console.log(error);
});
Also check out https://github.com/mozilla/pdf.js/blob/master/examples/node/getinfo.js for a working example with node.js
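Since the goal is the text rather than the document object, here is a minimal sketch of the next step. It assumes the resolved document exposes numPages and getPage(), and that each page's getTextContent() resolves to an object whose items carry a str field; the function name extractText is my own, not part of the library.

```javascript
// Sketch: collect the text of every page of an already-loaded PDF document.
// `pdf` is the object that loadingTask.promise resolves to.
async function extractText(pdf) {
  const pages = [];
  // PDF.js page numbers start at 1, not 0.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();
    // Each item is one run of text; `str` holds its characters.
    pages.push(content.items.map((item) => item.str).join(" "));
  }
  return pages.join("\n");
}

// Usage with the loading code above (not executed here):
// pdfjsLib.getDocument(url).promise
//   .then(extractText)
//   .then(console.log)
//   .catch(console.error);
```

Joining runs with a space is a simplification; it ignores the position information that comes with each item, so column layouts may come out interleaved.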

I had the same problem (The browser/environment lacks native support for critical functionality used by the PDF.js library (e.g. ReadableStream and/or Promise.allSettled); please use an ES5-compatible build instead.) but with Angular 8, so I'm leaving the solution here in case someone needs it:
package.json configuration:
Angular version: 8.2.14
pdfjs-dist: 2.4.456
component:
import * as pdfjs from 'pdfjs-dist/es5/build/pdf';
import { pdfjsworker } from 'pdfjs-dist/es5/build/pdf.worker.entry';
pdfjs.GlobalWorkerOptions.workerSrc = pdfjsworker;

I also faced this issue with the latest version of pdfjs-dist (2.8.335) in a Node.js project. As the other answers mention, the fix is to change the import path.
But in my case the path pdfjs-dist/es5/build/pdf didn't work.
In the latest version it changed to pdfjs-dist/legacy/build/pdf.js
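If the same code has to run against more than one pdfjs-dist version, one option is to try the entry points in order. This is only a sketch for plain Node.js (a bundler may not resolve dynamic requires like this); the two paths are the ones reported in these answers, and loadPdfjs is a hypothetical helper name.

```javascript
// Entry points reported in this thread, newest first.
const CANDIDATE_PATHS = [
  "pdfjs-dist/legacy/build/pdf.js", // reported for 2.8.335
  "pdfjs-dist/es5/build/pdf.js",    // reported for earlier 2.x releases
];

// Sketch: return the first build that loads.
// `load` defaults to Node's require; it is a parameter only so it can be stubbed.
function loadPdfjs(load = require) {
  const failures = [];
  for (const path of CANDIDATE_PATHS) {
    try {
      return load(path);
    } catch (error) {
      failures.push(path + ": " + error.message);
    }
  }
  throw new Error("No pdfjs-dist build could be loaded:\n" + failures.join("\n"));
}

// Usage (not executed here):
// const pdfjsLib = loadPdfjs();
```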

Related

os.platform returns browser instead of actual OS - React & Electron

I am trying to retrieve the appdata folder location for the application. Since each OS has a separate path for the appdata or application support folder, I tried retrieving the OS type to decide which path to use. The issue is that os.platform() returns 'browser'. I have tried running it on Windows and Mac, but both return browser. If I run process.platform in electron.js it gives me the correct OS, but in React I get browser. How can I get the proper OS?
In a browser you can use a combination of navigator.platform, navigator.userAgent, and navigator.userAgentData.platform to get the information you want, but it might take some testing and parsing.
AFAIK, navigator.userAgentData.platform is available only on Chrome/Chromium-based browsers, but gives the most straight-forward result when available.
Checking which platform you're using, rather than checking for specific features, is generally considered not to be a good idea -- but I've found it hard to avoid sometimes myself, especially when working around platform-specific quirks and bugs.
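A rough sketch of that combination follows. The function name and the returned labels are my own choices, and the string matching is a heuristic that will need testing on the browsers you care about (Android user agents, for example, also contain "Linux").

```javascript
// Sketch: best-effort OS detection from a navigator-like object.
// Pass `navigator` in a browser; a plain object works for testing.
function detectPlatform(nav) {
  // Chromium-only, but the cleanest value when present ("Windows", "macOS", "Linux", ...).
  if (nav.userAgentData && nav.userAgentData.platform) {
    return nav.userAgentData.platform.toLowerCase();
  }
  // Fall back to substring checks on the older properties.
  const hint = ((nav.platform || "") + " " + (nav.userAgent || "")).toLowerCase();
  if (hint.includes("mac")) return "macos";
  if (hint.includes("win")) return "windows";
  if (hint.includes("linux")) return "linux";
  return "unknown";
}

// Usage in a browser (not executed here):
// console.log(detectPlatform(navigator));
```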
This is because you are running process.platform in the renderer process.
In order to get the correct value you need to read process.platform either in the main process (usually the background.js file) or via @electron/remote, like this:
window.require('@electron/remote').process.platform
How you use @electron/remote depends on your Electron version; I recommend checking the @electron/remote readme.
Have you tried using Electron's app.getPath(name) method?
This method will return the user's application data directory.
Only once the app is "ready" can you retrieve this path.
// Electron module(s).
const electronApp = require('electron').app;

// Application is now ready to start.
electronApp.on('ready', () => {
  // Get the user's app data directory.
  let appData = electronApp.getPath('appData');
  // Get the user's home directory.
  let home = electronApp.getPath('home');
});
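For reference, this is roughly the per-OS mapping the question was trying to build by hand. It is only a sketch of the locations Electron documents for 'appData'; the function name is hypothetical and the Windows fallback path is my assumption for when %APPDATA% is unset. In a real app, prefer app.getPath.

```javascript
// Sketch: where 'appData' lives on each OS.
// platform: a process.platform value; env: process.env; home: the user's home directory.
function appDataDir(platform, env, home) {
  switch (platform) {
    case "win32":
      // %APPDATA%; the fallback is an assumption about the usual default.
      return env.APPDATA || home + "\\AppData\\Roaming";
    case "darwin":
      return home + "/Library/Application Support";
    default:
      // Linux: $XDG_CONFIG_HOME, or ~/.config when it is unset.
      return env.XDG_CONFIG_HOME || home + "/.config";
  }
}

// Usage in the main process (not executed here):
// appDataDir(process.platform, process.env, require("os").homedir());
```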

How to utilize Axios/Fetch in Google Apps Script with Clasp?

The Setup
I am utilizing https://github.com/labnol/apps-script-starter (contains clasp, babel and webpack) and set it up correctly in order to work on a Google Sheets Addon.
I am using Quokka.js + Node within VS Code for prototyping
The Goal
I started developing sidebars and initial functions and everything works great. Now I want to work with Rest APIs and be able to work with the output both in Node.js as well as in the Google Sheets Addon I am building.
I understand the following premises:
Node doesn't support UrlFetchApp.fetch
GAS doesn't support native fetch (due to lack of the window and dom objects)
So I decided to test out
babel-polyfill (this seems to work fine on its own)
axios (causing issues)
to handle promises, so that I can prototype in Node yet get the same outcome in the browser.
The Challenge
Within node, everything works as expected, however once I add
const axios = require('axios');
to any .js file in my project I receive the following error when trying to push the code via Clasp.
{ code: 400, errors: [ { message: 'Syntax error: Missing name after . operator.', line: ..., domain: 'global', reason: 'badRequest' } ] }
The given line throwing the error is module.exports.default = axios;
Once I comment out this line, the Clasp push works, but axios isn't working in the GAS environment. (I also tried Fetch Polyfills (like cross-fetch) but run into the same issues)
Any ideas on how to accomplish my goal would be greatly appreciated!
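One direction that avoids bundling axios at all is a thin wrapper that picks whichever HTTP primitive the runtime offers. This is a sketch under assumptions, not a tested add-on setup: it assumes the Apps Script V8 runtime (so async functions and globalThis are accepted) and, on the Node side, a global fetch (Node 18+ or a polyfill). fetchJson and the scope parameter are names I made up.

```javascript
// Sketch: one JSON-fetching function for both runtimes.
// `scope` defaults to the global object; pass a stub to test without a network.
async function fetchJson(url, scope = globalThis) {
  if (typeof scope.UrlFetchApp !== "undefined") {
    // Google Apps Script: UrlFetchApp.fetch is synchronous.
    const response = scope.UrlFetchApp.fetch(url, { muteHttpExceptions: true });
    const status = response.getResponseCode();
    if (status < 200 || status >= 300) throw new Error("HTTP " + status);
    return JSON.parse(response.getContentText());
  }
  // Node or a browser: fetch returns a Promise.
  const response = await scope.fetch(url);
  if (!response.ok) throw new Error("HTTP " + response.status);
  return response.json();
}

// Usage (not executed here):
// fetchJson("https://example.com/api").then(console.log);
```

The caller always gets a Promise, so the same calling code works in Quokka/Node and inside the add-on.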

Nativescript Vue.js location GPS

I recently developed an Android app with NativeScript-Vue.
I used the geolocation plugin and it works perfectly on devices with Google Play services, but not on those where Google services are deactivated.
I am searching for a module that makes location possible on devices without Google services.
Your help is needed.
Try this,
import * as application from "application";
const PLAY_SERVICES_RESOLUTION_REQUEST = 9999;
...
isGooglePlayServicesAvailable() {
  const googleApiAvailability = com.google.android.gms.common.GoogleApiAvailability.getInstance();
  const resultCode = googleApiAvailability.isGooglePlayServicesAvailable(application.android.context);
  if (resultCode != com.google.android.gms.common.ConnectionResult.SUCCESS) {
    if (googleApiAvailability.isUserResolvableError(resultCode)) {
      // Show the system dialog that lets the user fix the problem.
      googleApiAvailability.getErrorDialog(application.android.context, resultCode, PLAY_SERVICES_RESOLUTION_REQUEST)
        .show();
    }
    return false;
  }
  return true;
}
...
This notifies the user and offers to install Play services if they are not available.
As it is an android app, you can use LocationManager and set the appropriate provider, GPS or NETWORK.
I am not an expert in vue, so can not give you code but as you have access to all native APIs in nativescript, you can convert that.
Sample
LocationManager locationManager = (LocationManager) getApplicationContext().getSystemService(LOCATION_SERVICE);
Location location = locationManager.getLastKnownLocation(LocationManager.NETWORK_PROVIDER);
I have used it before in nativescript angular app.
For location strategies, you can refer here.
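A rough JavaScript translation of the Java sample, for use from NativeScript. This is a sketch: the string constants are the values of Context.LOCATION_SERVICE, LocationManager.GPS_PROVIDER and LocationManager.NETWORK_PROVIDER, the function name is mine, and it assumes the location permission has already been granted (otherwise Android throws a SecurityException).

```javascript
// String values of Context.LOCATION_SERVICE and the LocationManager provider constants.
const LOCATION_SERVICE = "location";
const PROVIDERS = ["gps", "network"];

// Sketch: read the last cached fix without Google Play services.
// `context` is an android.content.Context, e.g. application.android.context.
function lastKnownLocation(context, providers = PROVIDERS) {
  const manager = context.getSystemService(LOCATION_SERVICE);
  for (const provider of providers) {
    if (!manager.isProviderEnabled(provider)) continue;
    const location = manager.getLastKnownLocation(provider);
    if (location) {
      return {
        latitude: location.getLatitude(),
        longitude: location.getLongitude(),
        provider,
      };
    }
  }
  return null; // no enabled provider has a cached fix
}

// Usage inside a NativeScript app (not executed here):
// const fix = lastKnownLocation(application.android.context);
```

getLastKnownLocation only returns a cached position; for live updates you would still need requestLocationUpdates with a listener.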
I met the same problem and finally fixed it. nativescript-geolocation v3.0.1 does not depend on Google Play services, but it is outdated and does not work with tns v6. This is my solution:
You can get the code of nativescript-geolocation v3.0.1, modify it a little, and then add the source as a dependency of your tns v6 project. The details follow.
git clone git@github.com:NativeScript/nativescript-geolocation.git
cd nativescript-geolocation
git checkout -b your-branch v3.0.1
Next, modify src/geolocation.android.ts. Only one line changes:
- let currentContext = <android.app.Activity>androidAppInstance.currentContext;
+ let currentContext = <android.app.Activity>androidAppInstance.context;
Next, add the source as a plugin dependency in your project:
tns plugin add /path/to/nativescript-geolocation/src
See demo/app/main-page.ts in the git repo for how to use this version of the plugin.

Reference Error: self is not defined

We are facing the issue
"Reference Error: self is not defined"
while trying to use react-data-grid. The issue is on the server side, while building the Node.js application with webpack. The error comes from the following line of the generated bundle file:
isOldIE = memoize(function() { return /msie [6-9]\b/.test(self.navigator.userAgent.toLowerCase()); }),
Could you let us know how we can fix this? It looks like the react-data-grid package has issues with server-side rendering.
self is likely referring to window, which is not available on the server side; it's only available in a browser context. The navigator reference makes this especially apparent. This code is trying to test the user agent for the Internet Explorer version.
self.navigator.userAgent.toLowerCase()
As Jordan pointed out, there is an open issue #361 regarding isomorphic rendering.
If possible, try to avoid executing that code when on the server-side. Otherwise, you'll have to wait for a patch in react-data-grid.
I fixed it with the exenv package, which lets the require run only when rendering on the client:
var ExecutionEnvironment = require('exenv');
// Declared outside the if block so the names stay visible to the rest of the module.
let ReactDataGrid, Toolbar, NumericFilter, AutoCompleteFilter, MultiSelectFilter, SingleSelectFilter, Selectors;
if (ExecutionEnvironment.canUseDOM) {
  ReactDataGrid = require('react-data-grid');
  ({Toolbar, Filters: {NumericFilter, AutoCompleteFilter, MultiSelectFilter, SingleSelectFilter}, Data: {Selectors}} = require('react-data-grid-addons'));
}
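If you'd rather not add a dependency, the same guard can be written by hand. A minimal sketch, similar in spirit to the canUseDOM flag; both function names are hypothetical.

```javascript
// Sketch: true only when a browser-like DOM is present.
// `scope` defaults to the global object; pass a stub to test.
function canUseDOM(scope = globalThis) {
  return !!(scope.window && scope.window.document && scope.window.document.createElement);
}

// Run `load` only on the client; return null during server-side rendering.
function loadClientOnly(load, scope = globalThis) {
  return canUseDOM(scope) ? load() : null;
}

// Usage (not executed here):
// const ReactDataGrid = loadClientOnly(() => require("react-data-grid"));
```

Components that use the result then need to handle the null case when rendering on the server.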

Is there a handy html-parser can be used in Nativescript

I have tried jQuery, Parse5 and JsDom, and found that they don't work in NativeScript. jQuery is DOM-dependent, and Parse5 and JsDom depend on Node.js, which NativeScript does not support at the moment. All I want is an HTML parser. Is it possible to import jQuery into NativeScript and use it as an HTML parser? If so, how? If not, is there a handy HTML parser that can be used in NativeScript (with Angular 2 + TypeScript)?
Details about my application:
I am developing a mobile app for Moodle using NativeScript. My app communicates with Moodle through Moodle's REST API, and some of the content it returns is an HTML string. So I need an HTML parser to get at the things inside that HTML string.
For example, I send a "mod_quiz_get_attempt_data" request to Moodle, and I get a JSON response like the one below:
{
  "questions": [
    {
      "slot": 1,
      "type": "multichoice",
      "page": 0,
      "html": "Html string here.Can be very complex.I can't post html-string, stackoverflow ignore them"
    }
  ]
}
Some of the data I need is in the "html" part, which is an HTML string. Because Moodle is a third party, I prefer to handle this in my app.
@Marek Maszay, @Emil Oberg
@Emil Oberg, I have given cheerio a try. It doesn't work, because cheerio depends on htmlparser2, which also depends on Node.js.
As you've been looking at jQuery (among others) I'd say that Cheerio is what you want. It is an implementation of core jQuery designed for a non-DOM environment (such as in a NativeScript app, on the server, etc).
However, parsing HTML is commonly not something you need to do in a NativeScript app. Out of curiosity: What are you trying to do?
I ran into a similar problem and used nativescript-xml2js to solve it.
It converts the html structure (tags, attrs) into JSON and works in Nativescript with Typescript or plain js.
I successfully used Cheerio in my NativeScript app the following way:
- npm i cheerio-without-node-native@0.20.2 // By Ouyang Yadong(https://github.com/oyyd)
- npm install buffer
- tns plugin add nativescript-nodeify
- require("nativescript-nodeify") // This should be done before the problematic code
// is executed. If working in an Angular project,
// you can simply call it in your main.ts file
// before bootstrapping the main application's
// module.
- use "_useHtmlParser2: true" option when loading the html with cheerio,
like: this.$ = this.cheerio.load(siteRequested, { _useHtmlParser2: true }); or
const $ = cheerio.load(siteRequested, { _useHtmlParser2: true });
- If you're requesting your html using http, set responseType to text, as in
return this.http.get(url, {responseType: 'text'});
After this you can normally use the library. Hope it helps.
