ReferenceError: document is not defined, Node.js does not understand the DOM - javascript

I'm trying to call the Datamuse API with a fetch() GET request and display the result in the DOM.
But when I run node index.js I get this error: ReferenceError: document is not defined.
const submitButton = document.querySelector('#submit');
I Googled it and learned that Node.js does not understand the DOM the way the browser does.
I tried fixing it by:
using ESLint, setting env: {browser: true}
installing the jsdom package, which then gave the error jsdom not defined
I could not figure it out, please help.

Node.js is a completely different JavaScript environment from the browser. It has a different library of functions available to it than the browser does. For example, the browser can parse HTML and expose the DOM API, and it has the window object. Node.js isn't a browser at all, so it doesn't have those features. Instead, it has TCP and HTTP networking, file system access and so on: the kinds of things you would typically use in a server implementation.
If, from Node.js, you are trying to fetch a web page from some other server, parse that HTML and then inspect or manipulate the DOM elements in that page, you need a library for doing that. Libraries such as cheerio and puppeteer are popular tools for this. Cheerio parses the HTML, but does not run the JavaScript in the page, and offers a jQuery-like API for accessing the DOM. Puppeteer actually runs the Chromium browser engine to parse the page and run its JavaScript, which gives you a fully representative DOM, and it can even do things like take screenshots of the rendered page.
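For the script in the question, one possible approach is to keep the data handling separate from the DOM code and only touch document when it exists. This is a minimal sketch, not the asker's actual code: formatWords, render and the #results element id are hypothetical names, and the sample data assumes a Datamuse response is an array of objects with word and score fields.

```javascript
// Hypothetical helper: turn a Datamuse-style result array into display lines.
// Assumes each entry looks like { word: 'example', score: 1234 }.
function formatWords(results) {
  return results.map((entry) => `${entry.word} (score: ${entry.score})`);
}

// Show the lines in the page when a DOM exists, otherwise print them.
function render(lines) {
  if (typeof document !== 'undefined') {
    // Browser: '#results' is an assumed element id, not taken from the question.
    document.querySelector('#results').textContent = lines.join('\n');
  } else {
    // Node.js: there is no document object, so write to the terminal instead.
    console.log(lines.join('\n'));
  }
}

// Placeholder data standing in for a real API response.
const sample = [
  { word: 'ocean', score: 300 },
  { word: 'sea', score: 250 },
];
render(formatWords(sample));
```

Run with node, this prints the two formatted lines; loaded from a script tag in a page that has an element with id results, the same file would fill that element instead.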

Related

Web Scraping with JavaScript? JavaScript file I/O? JavaScript iterate through URLs? Automatically load external scripts?

I am looking to do some web-scraping, without going through help desk and IT to install and configure Python (I don't have admin rights because I'm an intern).
I have already written the logging functions I need in JavaScript, but I need to extract the data out of the program into a CSV so I can convert to .XLS afterward.
I'm wondering if it's possible for JavaScript to do these things:
Can JavaScript write to a file?
Can I run external scripts with a click of a button somehow? i.e. without pasting the code into the console every single page. Or even, perhaps, run external scripts automatically upon page-load?
Can I automatically iterate through and load URLs? The URL details all remain the same, with only an integer value that changes from page to page.
Thanks in advance for any input!!
1) Yes, you can write to a file from JavaScript using Node.js. Use the fs module like so:
const fs = require('fs');
// options is optional; the callback receives an error if the write fails
fs.writeFile('file.txt', data_to_write, (err) => { if (err) throw err; });
Refer: https://nodejs.org/api/fs.html#fs_fs_writefile_file_data_options_callback
2) Yes, you can use Puppeteer to run scripts in headless Chrome.
3) Go through the Puppeteer documentation to find out how to load URLs in the browser. Iterate over the links, store each one in a string and open the page. Then use page.evaluate() to run your code and scrape the contents.
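Since the question asks for CSV output, point 1 could be sketched as follows. This is only an illustration under stated assumptions: csvField and toCsv are hypothetical helpers, the rows are placeholder data, and the file goes to the system temp directory so the script runs without special permissions.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Quote one CSV field: wrap it in quotes when it contains a comma, quote or
// line break, and double any embedded quotes.
function csvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Convert an array of rows (each an array of values) into CSV text.
function toCsv(rows) {
  return rows.map((row) => row.map(csvField).join(',')).join('\r\n') + '\r\n';
}

// Hypothetical scraped data; the columns are placeholders.
const rows = [
  ['page', 'title'],
  [1, 'First page'],
  [2, 'Title with, a comma'],
];

// Write to the temp directory so the sketch works on a locked-down machine.
const outFile = path.join(os.tmpdir(), 'scrape-results.csv');
fs.writeFileSync(outFile, toCsv(rows), 'utf8');
console.log(`Wrote ${rows.length} rows to ${outFile}`);
```

The synchronous writeFileSync is used here only to keep the sketch short; the callback form shown in the answer works the same way for larger jobs.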
Yes, you can do all those things with JavaScript. No, you can't do all those things with JavaScript purely in the browser because of the Same Origin Policy.
Two things that let you do this (offhand):
Node.js, which you can download and expand from a zip (doesn't require an installation step). Whether you can do that on your workstation depends on how locked down it is. There are lots of modules available for Node.js for handling the heavy-lifting of web scraping.
A Java JVM (via its scripting support, though JavaScript scripting hosts for the JVM trail behind in terms of up-to-date JavaScript features). If one isn't already installed, again, you may be able to install one without admin rights, or not, depending.
You can definitely do it with Node.js on the server side.
But you'll run into the cross-origin problem if you do it from an HTML page in the browser.
So for the browser you'll have to make a browser plugin (aka add-on aka extension).
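Because Node.js is not bound by the Same Origin Policy, iterating over URLs that differ only by an integer is straightforward there. The sketch below is an assumption-laden example rather than a finished scraper: pageUrls and fetchAll are hypothetical helpers, https://example.com/report and the page query parameter are placeholders, and fetchAll relies on the global fetch available in Node.js 18 and later.

```javascript
// Build the list of page URLs for an inclusive range of integers.
// The base URL and the "page" query parameter are placeholders.
function pageUrls(baseUrl, first, last) {
  const urls = [];
  for (let n = first; n <= last; n += 1) {
    const url = new URL(baseUrl);
    url.searchParams.set('page', String(n));
    urls.push(url.toString());
  }
  return urls;
}

// Fetch each URL in turn and collect the response text.
// Assumes the global fetch of Node.js 18+.
async function fetchAll(urls) {
  const pages = [];
  for (const url of urls) {
    const response = await fetch(url);
    pages.push({ url, status: response.status, body: await response.text() });
  }
  return pages;
}

const urls = pageUrls('https://example.com/report', 1, 3);
console.log(urls.join('\n'));
// To actually download the pages:
// fetchAll(urls).then((pages) => console.log(`Fetched ${pages.length} pages`));
```

The pages are fetched one at a time on purpose, which keeps the load on the target server low; the download call is left commented out so the sketch makes no network requests by default.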

Run HTML file via Bash? Possible?

I have an HTML file containing JavaScript that, when opened in a browser such as Chrome, sends the "emit" message to my websockets server and displays the value on that page.
Is there a way to call this same HTML file from a bash script? I want to insert data into a MySQL database, which will ultimately call that HTML file to send an update to the websocket.
Hopefully that makes sense but hopefully there is a way to do it too :)
If you are only focused on rendering the HTML page (and not interacting with it via buttons or something) you may find this link helpful: Running HTML from Command Line
If you are trying to execute JavaScript on your server without a browser, I recommend using Node.js with a real JS script and no HTML.
Otherwise you can try to run an HTML file containing the JavaScript with something like PhantomJS, but I think that performs worse than using Node properly.
EDIT
It is JavaScript, yes, but I need to "import" the socket.io.js file into the script I have created, and the browser I have to use doesn't support the new JavaScript import syntax. So I'm writing it as HTML that calls the JavaScript; otherwise I would use Node.js.
I think you can import the socket.io lib in node application using npm.
https://www.npmjs.com/package/socket.io
The documentation says that you can use a client inside a node script too :
Socket.IO enables real-time bidirectional event-based communication. It consists in:
a Node.js server (this repository)
a Javascript client library for the browser (or a Node.js client)

Request module wait for document ready

I'm doing web scraping. Currently I use the request module in Node, but modern sites use newer frameworks like Angular and EmberJS that generate the HTML in the browser. When I load the page with request the document is not ready, so I get just the JavaScript code and not the rendered HTML.
Is it possible to set a timeout and then load the page?
The request module is just an HTTP client; it will only get you the text returned from a particular URL. A straightforward way to achieve what you are trying to do would be to open the URL with a headless browser like PhantomJS (https://github.com/sgentle/phantomjs-node) and actually execute the page before evaluating its content.

Create ActiveXObject on *server*

I have created a little html stub that allows a user to compare various spellers. I would like to include the Word speller and I have written code in a .js file to create an ActiveXObject thus:
var wordApp = new ActiveXObject("Word.Application");
This works fine on my local machine, but I get the dreaded 'Automation server can't create object' error when I try it on other machines. I have searched and read the various articles on the topic, and I understand that what I am trying to do is very, very bad, not safe, doesn't work in any browser other than IE, and so on. This is for an internal test app in a trusted environment, and all I want is for others to be able to access the page and see the result without forcing them to make extreme changes to their security settings.
So, here is my question. Is there a way that I can get this to run server side on my machine running IIS and hosting the website? Ideally I would like to be able to insert my HTML into an aspx file and, when the submit button is pushed, have it either run all the javascript on server side or at least run the portion that calls the activeX code. If this isn't feasible, can I migrate the specific functions that call the activeX and get the data to C# or VB and still run the safer functions in JS?
Thanks for your advice!

How to execute shell scripts from a web page via javascript/jquery and get its results in a string?

In a simple HTML file opened locally in Firefox, I need some JavaScript code to execute a command (maybe "ls") and get its result in a string I can use in JS/jQuery to alter the page contents.
I already know this is generally a bad idea, but I have to make this little local HTML file capable of running several scripts without a server and without CGI.
In the past I installed a plugin in TiddlyWiki (www.tiddlywiki.com) to execute external commands (Firefox requested authorization for every operation), so JavaScript can do it, but how do I get the command's result in JS after execution?
I don't believe there's any way to do this without a cooperating browser plug-in. The plug-in would be told via JavaScript what command to execute; it would execute that command and then call you back with a callback when the results were available. This could be very dangerous, as giving the browser access to your local system in almost any way opens you up to many types of attacks (which is why browsers don't offer this capability).
