Phantomjs change port page.open() - javascript

I'm trying to do web scraping with PhantomJS. To run several instances in parallel, it occurred to me to start multiple servers and scrape web pages from each of them.
With this idea in mind, I'm putting together a pool of instances (without using external libraries).
My question is:
Will page.open() open the URL through port 8888, which I specified, or will it use the default port for web connections (80, 8080, 443, etc.)?
If it uses the port I specified (8888 in this case), that would be very good, because scraping could be faster and more efficient than with a single port.
Thanks for your attention!
PS: Here is a simple example of the code I'm trying to use. It works, but I don't know how page.open() behaves with respect to ports.
var webPage = require('webpage');
var server = require('webserver').create();
function onRequest (request, response) {
var page = webPage.create();
page.open('http://www.google.com/', function(status) {
console.log('Status: ' + status);
console.log(page.content);
});
}
var service = server.listen(8888, onRequest);
if (service) {
console.log('Server OK');
} else {
console.log('Server close');
phantom.exit();
}

Related

Socket.io not working with external access without express

My question is: is it possible to run the socket.io library without using express? I want Node to act as an external WebSocket server that just accepts socket connections and callbacks and simply replies to them, without defining its own routes or serving page views (I'm using CodeIgniter for that work).
My current test app looks like this on the server:
var server = require('http').createServer(app);
var io = require('socket.io')(server);
var port = process.env.PORT || 4000;
server.listen(port, function() {
console.log('Server listening at port %d', port);
});
io.on("connection", function (socket) {
console.log('A new socket has joined: ' + socket.id);
var tweet = {user: "nodesource", text: "Hello, world!"};
// to make things interesting, have it send every second
var interval = setInterval(function () {
socket.emit("tweet", tweet);
}, 1000);
socket.on("disconnect", function () {
clearInterval(interval);
});
});
On Client:
<script>
const socket = io('http://localhost:4000/node_server');
socket.on('disconnect',function(){
alert('Im not connected, server is down');
});
socket.on("tweet", function(tweet) {
// todo: add the tweet as a DOM node
console.log("tweet from", tweet.username);
console.log("contents:", tweet.text);
});
My problem is that I have tested the socket.io chat example with express and it works, of course, but that example uses a route and sends the page. In my case I just want my external page to communicate with Node, without Node sending me the page. Basically, when I trigger an emit or a function on the server or the client, it does not fire; only the connection fires on the server, and nothing else (P.S.: I also tried io.sockets.on and that doesn't work either).
If anyone has been through this and knows what my problem might be, I'd be glad to hear it.
Okay, let's start off with something really basic. Here is our express server, which only hosts our socket application:
var app = require("express")();
var server = require("http").createServer(app);
var io = require("socket.io")(server);
var port = process.env.PORT || 4000;
server.listen(port, function() {
console.log("Server listening at port %d", port);
});
io.on("connection", function(socket) {
console.log("A new socket has joined: " + socket.id);
socket.on("hello", function(data) {
console.log(data);
});
});
You already understand that much, but it's important to note that this server will listen for socket connections from any address. Keep that in mind.
Now let's look at the client HTML file:
<html>
<body>
<button id="hiBtn">Say Hi to your server</button>
<!-- You only need to include the client file here -->
<script src="https://rawgit.com/socketio/socket.io-client/master/dist/socket.io.js"></script>
<script>
const serverLocation = "localhost:4000" // or whatever your server location is
const socket = io(serverLocation);
window.onload = function () {
document.getElementById("hiBtn").addEventListener("click", function () {
socket.emit("hello", "Hi there, this is the client speaking");
})
}
</script>
</body>
</html>
Notice that I do not have <script src="/socket.io/socket.io.js">. This is because this HTML file is hosted on a completely separate client. You simply need to include the client socket.io file here, which is usually located at node_modules\socket.io-client\dist\socket.io.js if you installed it via npm. Or you can use the URL I provided in my example. Just make sure that serverLocation points to your express server and you're all set.
P.S. In case you're curious, I tested this example by hosting the HTML file on port 5000 and the express server on port 4000.

WebSocket suddenly refuses connection

For my project I needed a VPS, so I bought one on DigitalOcean. I installed MongoDB, Laravel and the whole thing runs on Nginx.
Earlier today I asked a question about a timer and the advice was to use WebSockets. According to the comment, the best approach was to use NodeJS with Socket.io. And so I did.
I followed this tutorial here (locally) and had absolutely no problems at all on localhost.
So my next step was to upload the code to my webserver and combine it with Laravel. I had some problems making connections, but after I found this Stackoverflow post, it finally worked. The server was sending a Date-object and the timer on the screen was updating realtime.
But it only worked when I manually started the script through the node terminal-command. So I followed this tutorial on DigitalOcean where you use PM2 to keep running the script, even when I close the terminal/log out of my VPS. Everything was working fine and the timer was still updating and I was actually very surprised that I didn't run into that many problems..
..until 5 minutes later. All of a sudden, the WebSocket stopped working. Maybe I had made a typo without realising, maybe the server noticed some change in the code that I didn't. I have no clue, but when I look in the developers console, it says:
GET http://<my-domain-ip>:8080/socket.io/?EIO=3&transport=polling&t=LF8I2CO net::ERR_CONNECTION_REFUSED
Of course, I googled a lot and applied all sorts of changes to my code according to (mostly) Stackoverflow answers, but now I'm really running short on ideas and have absolutely no idea why my code is not working.
The server.js file:
var http = require('http');
var url = require('url');
var fs = require('fs');
var io = require('socket.io');
var server = http.createServer(function (request, response) {
var path = url.parse(request.url).pathname;
switch(path) {
case '/' :
response.writeHead(200, { 'Content-Type' : 'text/html'});
response.write('hello world');
response.end();
break;
case '/socket.html' :
fs.readFile(__dirname + path, function (error, data) {
if(error) {
response.writeHead(404);
response.write("Oops, this doesn't exist - 404");
response.end();
} else {
response.writeHead(200, {'Content-Type' : 'text/html'});
response.write(data, 'utf8');
response.end();
}
});
break;
default :
response.writeHead(404);
response.write("Oops, this doesn't exist - 404");
response.end();
break;
}
});
server.listen(8080, '<private ip>');
console.log('Server running at http://<private ip>:8080/');
var listener = io.listen(server);
listener.sockets.on('connection', function (socket) {
setInterval(function () {
socket.emit('date', {'date' : new Date()});
});
});
(I had to set a private IP according to the DigitalOcean tutorial, so Nginx could make it work).
The Javascript code on the client's side:
<script src="https://cdn.socket.io/socket.io-1.4.5.js"></script>
<script>
var baseURL = getBaseURL(); // Call function to determine it
var socketIOPort = 8080;
var socketIOLocation = baseURL + socketIOPort; // Build Socket.IO location
var socket = io(socketIOLocation);
//Build the user-specific path to the socket.io server, so it works both on 'localhost' and a 'real domain'
function getBaseURL()
{
baseURL = location.protocol + "//" + location.hostname + ":" + location.port;
return baseURL;
}
socket.on('date', function (data) {
$('#date').text(data.date.getHours() + ':' + data.date.getMinutes() + ':' + data.date.getSeconds());
})
</script>
If I remember correctly, I also set some options like proxy_pass and Upgrade $upgrade, but I can't remember where I read that / which file I applied that to, but as far as I know, those options are set correctly.
Does someone know where the problem lies? Because I'm really running out of ideas.
Thanks in advance!
Hosting companies often have ways of "managing" long running
connections to preserve their resources. One possibility is that your
hosting infrastructure needs to be specifically configured for a long
running webSocket connection so nginx (or some other piece of
networking equipment) doesn't automatically kill it after some period
of time. This is not something that is generic to all hosting - it is
specific to the configuration of your particular VPS at your
particular hosting company so you will have to get any guidance on
this topic from them.
Of possible help:
http://nolanlawson.com/2013/05/31/web-sockets-with-socket-io-node-js-and-nginx-port-80-considered-harmful/
I changed:
server.listen(8080, '<private ip>');
to:
server.listen(443);
And:
var socketIOPort = 8080;
to:
var socketIOPort = 443;
and the WebSockets were working instantly, thanks to jfriend00!

How do I write the results of a Node function to html/console log?

I am new with node and I am trying to print the results to the console and eventually display them in HTML. I have tried invoking the function as a var that I would later use in HTML but this didn't work. Some similar example code:
var app = require('express')();
var x = require('x-ray')();
app.get('/', function(req, res) {
res.send(x('http://google.com', 'title').write());
})
Thanks!
I don't know much about the "x-ray" library, but I presume the problem is with that since it has to asynchronously make a request before it can return the response data. The documentation says that if you don't set a path as an argument to the write function it returns a readable stream, so try this:
app.get('/', function(req, res) {
var stream = x('http://google.com', 'title').write(),
responseString = '';
stream.on('data', function(chunk) {
responseString += chunk;
});
stream.on('end', function() {
res.send(responseString);
});
});
You also need to start the server listening on a particular port (3000 in the example below):
const PORT = 3000;
app.listen(PORT, function() {
console.log("Server is listening on port " + PORT + ".");
}); // the callback function simply runs once the server starts
Now open your browser and navigate to 127.0.0.1:3000 or localhost:3000, and you'll see "Google" appear!
ALSO: If you want to use the response data in a full HTML page (rather than just sending the string on its own), you may want to explore further how to do this in Express with Jade (or similar) templates. And the code at the moment scrapes Google every time someone makes a request to the appropriate route of your server; if you only want to scrape Google once, and then use the same string again and again in your server's responses, you may want to think about how to implement this (it's easy!).
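The "scrape once, reuse the string" idea mentioned above could be sketched roughly as follows. This is only an illustration: `once`, `fakeScrapeTitle`, `scrapeCount` and `getTitle` are hypothetical names, and the stand-in function replaces the real x-ray call, which is not shown here.

```javascript
// Hypothetical helper: run an async lookup once and reuse its promise afterwards.
function once(fetchFn) {
  var pending = null;
  return function () {
    if (pending === null) {
      pending = fetchFn(); // the first call starts the work
    }
    return pending; // later calls share the same promise
  };
}

// Stand-in for the real scrape (assumption); it only counts how often it runs.
var scrapeCount = 0;
function fakeScrapeTitle() {
  scrapeCount += 1;
  return Promise.resolve('Google');
}

var getTitle = once(fakeScrapeTitle);

// Inside a route this could look like:
// app.get('/', function (req, res) {
//   getTitle().then(function (title) { res.send(title); });
// });
```

Note that this sketch also caches a rejected promise, so a failed first scrape would be remembered; resetting the cached value on failure is one way to handle that.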

How can i play the video on web browser when i connected web server?

I made a web server using Node.js on Ubuntu.
I want to show a video when a player connects to the web server.
index.html
<html>
<body>
<video width='400' controls>
<source src='b.mp4' type='video.mp4'>
</video>
</body>
</html>
webserver.js
var app = require('http').createServer(handler)
, fs=require('fs');
app.listen(1233);
function handler(req, res){
fs.readFile(__dirname + '/index.html',
function(err,data){
if(err){
res.writeHead(500);
return res.end('Error loading index.html');
}
res.writeHead(200);
res.end(data);
}); }
When I run the web server and connect to it, the video doesn't play in the web browser. I can only see a black box and the video control bar.
But when I open the HTML file directly on Ubuntu (without running the server), the video plays fine.
How can I play the video in the web browser when I connect to the web server?
Thank you :)
When the browser requests /b.mp4 your JavaScript server fetches index.html and sends it to the browser.
You need to pay attention to the URL being requested (which is available through the req object) and serve the appropriate content for it (with the appropriate content-type response header).
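A minimal sketch of that idea, using only Node's built-in modules, might look like this. The names `CONTENT_TYPES`, `resolveRequest` and `createVideoServer` are hypothetical, and the sketch assumes that index.html and b.mp4 sit in the directory passed as `rootDir`.

```javascript
var http = require('http');
var fs = require('fs');
var path = require('path');

// Hypothetical lookup table: file extension -> Content-Type header value.
var CONTENT_TYPES = {
  '.html': 'text/html',
  '.mp4': 'video/mp4'
};

// Map a request URL to a file name and content type, or null if unsupported.
function resolveRequest(requestUrl) {
  var pathname = requestUrl.split('?')[0]; // ignore any query string
  if (pathname === '/') {
    pathname = '/index.html';
  }
  var fileName = path.basename(pathname); // drops any directory parts
  var type = CONTENT_TYPES[path.extname(fileName)];
  return type ? { fileName: fileName, contentType: type } : null;
}

// Build (but do not start) a server that answers each URL with its own file.
function createVideoServer(rootDir) {
  return http.createServer(function (req, res) {
    var target = resolveRequest(req.url);
    if (!target) {
      res.writeHead(404);
      return res.end('Not found');
    }
    fs.readFile(path.join(rootDir, target.fileName), function (err, data) {
      if (err) {
        res.writeHead(404);
        return res.end('Not found');
      }
      res.writeHead(200, { 'Content-Type': target.contentType });
      res.end(data);
    });
  });
}

// Usage (not started here): createVideoServer(__dirname).listen(1233);
```

This reads each file fully into memory and does not handle Range requests, so it is only meant to show the routing by req.url; for large videos a streaming approach is probably a better fit.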
I started playing with Node.js very recently. I would like to address your second-to-last line:
"How can i play the video on web browser when i connected web server?"
So far I have found two ways to render video/audio in the client's browser using a Node.js server. I am going to share code for both.
The first way is to load an HTML page in the client's browser (index.html, where the video is already embedded using the <video> tag) with the video player ready to play. The second way is to send the video directly as the response from the server to the web browser. The latter method may or may not need HTML, depending on what you really want to achieve.
Instead of a low port number like 1233, let's say we want to use port 8383.
Method 1: Render an HTML page with the video player already embedded in it. I am assuming that your webserver.js and index.html files reside in the same directory. Something like this should satisfy your requirement:
var express = require('express');
var app = express();
app.use(express.static(__dirname + '/'));
var ipAddress = process.env.OPENSHIFT_NODEJS_IP;
var port = process.env.OPENSHIFT_NODEJS_PORT || 8383;
app.listen(port, ipAddress);
Run your webserver.js and open http://localhost:8383/index.html in your browser.
Method 2: If you want to achieve it using require('http'), use the following code:
var http = require('http'),
fileSystem = require('fs'),
path = require('path');
http.createServer(function (req, response) {
var filePath = path.join('./', 'b.mp4');
var stat = fileSystem.statSync(filePath);
// Tell the browser it is receiving an MP4 video and how many bytes to expect
response.writeHead(200, {
"Content-Type": "video/mp4",
"Content-Length": stat.size
});
var readStream = fileSystem.createReadStream(filePath);
readStream.on('data', function (data) {
var flushed = response.write(data);
// Pause the read stream when the write stream gets saturated
if (!flushed)
readStream.pause();
});
response.on('drain', function () {
// Resume the read stream when the write stream gets hungry
readStream.resume();
});
readStream.on('end', function () {
response.end();
});
}).listen(8383);
After starting webserver.js, open http://localhost:8383/ in your browser.

Open port 23 to read data using Node.js

On my Windows laptop there is a program with a TCP/IP server on port 23. I can connect to it with a telnet terminal and see the data streaming. I need to get that data into a Node.js program I'm working on. It should be easy, but I haven't found any code examples. My searches only turn up examples of how to make a server on port 23 with Node.js.
Thanks
This is a high level TCP/IP socket implementation in node. See: Node net API
var net = require('net'),
port = 23,
host = 'localhost',
socket = net.createConnection(port, host);
socket
.on('data', function(data) {
console.log('received: ' + data);
})
.on('connect', function() {
console.log('connected');
})
.on('end', function() {
console.log('closed');
});
