I have a logo at public/images/logo.gif. Here is my Node.js code.
http.createServer(function(req, res){
res.writeHead(200, {'Content-Type': 'text/plain' });
res.end('Hello World \n');
}).listen(8080, '127.0.0.1');
It works, but when I request localhost:8080/logo.gif I obviously don't get the logo.
What changes do I need to make to serve an image?
2016 Update
Examples with Express and without Express that actually work
This question is over 5 years old but every answer has some problems.
TL;DR
Scroll down for examples to serve an image with:
express.static
express
connect
http
net
All of the examples are also on GitHub: https://github.com/rsp/node-static-http-servers
Test results are available on Travis: https://travis-ci.org/rsp/node-static-http-servers
Introduction
More than 5 years after this question was asked, there is only one correct answer, by generalhenry. Even though that answer has no problems with the code, it seems to have some problems with reception. It was commented that it "doesn't explain much other than how to rely on someone else to get the job done", and the number of people who voted this comment up clearly shows that a lot of things need clarification.
First of all, a good answer to "How to serve images using Node.js" is not implementing a static file server from scratch and doing it badly. A good answer is using a module like Express that does the job correctly.
In response to comments saying that using Express "doesn't explain much other than how to rely on someone else to get the job done", it should be noted that using the http module already relies on someone else to get the job done. If someone doesn't want to rely on anyone to get the job done, then at least raw TCP sockets should be used instead, which I do in one of my examples below.
A more serious problem is that all of the answers here that use the http module are broken. They introduce race conditions, insecure path resolution that leads to path traversal vulnerabilities, blocking I/O that cannot serve concurrent requests, and other subtle problems. They are completely broken as examples of what the question asks about, and yet they already use the abstraction provided by the http module instead of TCP sockets, so they don't even do everything from scratch as they claim.
If the question was "How to implement static file server from scratch, as a learning exercise" then by all means answers how to do that should be posted - but even then we should expect them to at least be correct. Also, it is not unreasonable to assume that someone who wants to serve an image might want to serve more images in the future so one could argue that writing a specific custom static file server that can serve only one single file with hard-coded path is somewhat shortsighted. It seems hard to imagine that anyone who searches for an answer on how to serve an image would be content with a solution that serves just a single image instead of a general solution to serve any image.
In short, the question is how to serve an image and an answer to that is to use an appropriate module to do that in a secure, performant and reliable way that is readable, maintainable and future-proof while using the best practice of professional Node development. But I agree that a great addition to such an answer would be showing a way to implement the same functionality manually but sadly every attempt to do that has failed so far. And that is why I wrote some new examples.
After this short introduction, here are my five examples doing the job on 5 different levels of abstraction.
Minimum functionality
Every example serves files from the public directory and supports the minimum functionality of:
MIME types for most common files
serves HTML, JS, CSS, plain text and images
serves index.html as a default directory index
responds with error codes for missing files
no path traversal vulnerabilities
no race conditions while reading files
I tested every version on Node versions 4, 5, 6 and 7.
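The path traversal countermeasure shared by the hand-written examples below can be isolated as a small function. This is a minimal sketch, assuming a hypothetical helper name resolveInside that does not appear in the examples themselves. It joins the request path onto the root directory and rejects any result that does not lie inside that directory:

```javascript
var path = require('path');

// Hypothetical helper (not part of the original examples): maps a request
// path to a file under `root`, or returns null when the joined path would
// end up outside `root`.
function resolveInside(root, reqpath) {
  // A trailing slash is treated as a request for the directory index.
  var file = path.join(root, reqpath.replace(/\/$/, '/index.html'));
  // path.join normalizes ".." segments, so a simple prefix check is enough.
  if (file.indexOf(root + path.sep) !== 0) {
    return null;
  }
  return file;
}

var root = path.resolve('public');
console.log(resolveInside(root, '/images/logo.gif')); // a path inside public
console.log(resolveInside(root, '/../secret.txt'));   // null
```

Appending path.sep to the root before comparing matters: without it, a sibling directory such as public2 would pass the prefix check.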
express.static
This version uses the express.static built-in middleware of the express module.
This example has the most functionality and the least amount of code.
var path = require('path');
var express = require('express');
var app = express();
var dir = path.join(__dirname, 'public');
app.use(express.static(dir));
app.listen(3000, function () {
console.log('Listening on http://localhost:3000/');
});
express
This version uses the express module but without the express.static middleware. Serving static files is implemented as a single route handler using streams.
This example has simple path traversal countermeasures and supports a limited set of most common MIME types.
var path = require('path');
var express = require('express');
var app = express();
var fs = require('fs');
var dir = path.join(__dirname, 'public');
var mime = {
html: 'text/html',
txt: 'text/plain',
css: 'text/css',
gif: 'image/gif',
jpg: 'image/jpeg',
png: 'image/png',
svg: 'image/svg+xml',
js: 'application/javascript'
};
app.get('*', function (req, res) {
var file = path.join(dir, req.path.replace(/\/$/, '/index.html'));
if (file.indexOf(dir + path.sep) !== 0) {
return res.status(403).end('Forbidden');
}
var type = mime[path.extname(file).slice(1)] || 'text/plain';
var s = fs.createReadStream(file);
s.on('open', function () {
res.set('Content-Type', type);
s.pipe(res);
});
s.on('error', function () {
res.set('Content-Type', 'text/plain');
res.status(404).end('Not found');
});
});
app.listen(3000, function () {
console.log('Listening on http://localhost:3000/');
});
connect
This version uses the connect module, which is one level of abstraction lower than express.
This example has similar functionality to the express version but uses slightly lower-level APIs.
var path = require('path');
var connect = require('connect');
var app = connect();
var fs = require('fs');
var dir = path.join(__dirname, 'public');
var mime = {
html: 'text/html',
txt: 'text/plain',
css: 'text/css',
gif: 'image/gif',
jpg: 'image/jpeg',
png: 'image/png',
svg: 'image/svg+xml',
js: 'application/javascript'
};
app.use(function (req, res) {
var reqpath = req.url.toString().split('?')[0];
if (req.method !== 'GET') {
res.statusCode = 501;
res.setHeader('Content-Type', 'text/plain');
return res.end('Method not implemented');
}
var file = path.join(dir, reqpath.replace(/\/$/, '/index.html'));
if (file.indexOf(dir + path.sep) !== 0) {
res.statusCode = 403;
res.setHeader('Content-Type', 'text/plain');
return res.end('Forbidden');
}
var type = mime[path.extname(file).slice(1)] || 'text/plain';
var s = fs.createReadStream(file);
s.on('open', function () {
res.setHeader('Content-Type', type);
s.pipe(res);
});
s.on('error', function () {
res.setHeader('Content-Type', 'text/plain');
res.statusCode = 404;
res.end('Not found');
});
});
app.listen(3000, function () {
console.log('Listening on http://localhost:3000/');
});
http
This version uses the http module which is the lowest-level API for HTTP in Node.
This example has similar functionality to the connect version but uses even lower-level APIs.
var path = require('path');
var http = require('http');
var fs = require('fs');
var dir = path.join(__dirname, 'public');
var mime = {
html: 'text/html',
txt: 'text/plain',
css: 'text/css',
gif: 'image/gif',
jpg: 'image/jpeg',
png: 'image/png',
svg: 'image/svg+xml',
js: 'application/javascript'
};
var server = http.createServer(function (req, res) {
var reqpath = req.url.toString().split('?')[0];
if (req.method !== 'GET') {
res.statusCode = 501;
res.setHeader('Content-Type', 'text/plain');
return res.end('Method not implemented');
}
var file = path.join(dir, reqpath.replace(/\/$/, '/index.html'));
if (file.indexOf(dir + path.sep) !== 0) {
res.statusCode = 403;
res.setHeader('Content-Type', 'text/plain');
return res.end('Forbidden');
}
var type = mime[path.extname(file).slice(1)] || 'text/plain';
var s = fs.createReadStream(file);
s.on('open', function () {
res.setHeader('Content-Type', type);
s.pipe(res);
});
s.on('error', function () {
res.setHeader('Content-Type', 'text/plain');
res.statusCode = 404;
res.end('Not found');
});
});
server.listen(3000, function () {
console.log('Listening on http://localhost:3000/');
});
net
This version uses the net module which is the lowest-level API for TCP sockets in Node.
This example has some of the functionality of the http version, but a minimal and incomplete version of the HTTP protocol has been implemented from scratch. Since it doesn't support chunked encoding, it loads the files into memory before serving them so that the size is known before the response is sent; statting the files and then loading them would introduce a race condition.
var path = require('path');
var net = require('net');
var fs = require('fs');
var dir = path.join(__dirname, 'public');
var mime = {
html: 'text/html',
txt: 'text/plain',
css: 'text/css',
gif: 'image/gif',
jpg: 'image/jpeg',
png: 'image/png',
svg: 'image/svg+xml',
js: 'application/javascript'
};
var server = net.createServer(function (con) {
var input = '';
con.on('data', function (data) {
input += data;
if (input.match(/\n\r?\n\r?/)) {
var line = input.split(/\n/)[0].split(' ');
var method = line[0], url = line[1], pro = line[2];
var reqpath = url.toString().split('?')[0];
if (method !== 'GET') {
var body = 'Method not implemented';
con.write('HTTP/1.1 501 Not Implemented\n');
con.write('Content-Type: text/plain\n');
con.write('Content-Length: '+body.length+'\n\n');
con.write(body);
con.destroy();
return;
}
var file = path.join(dir, reqpath.replace(/\/$/, '/index.html'));
if (file.indexOf(dir + path.sep) !== 0) {
var body = 'Forbidden';
con.write('HTTP/1.1 403 Forbidden\n');
con.write('Content-Type: text/plain\n');
con.write('Content-Length: '+body.length+'\n\n');
con.write(body);
con.destroy();
return;
}
var type = mime[path.extname(file).slice(1)] || 'text/plain';
fs.readFile(file, function (err, data) {
if (err) {
var body = 'Not Found';
con.write('HTTP/1.1 404 Not Found\n');
con.write('Content-Type: text/plain\n');
con.write('Content-Length: '+body.length+'\n\n');
con.write(body);
con.destroy();
} else {
con.write('HTTP/1.1 200 OK\n');
con.write('Content-Type: '+type+'\n');
con.write('Content-Length: '+data.byteLength+'\n\n');
con.write(data);
con.destroy();
}
});
}
});
});
server.listen(3000, function () {
console.log('Listening on http://localhost:3000/');
});
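The response framing that the net example writes by hand can be gathered into one function. The following is a sketch under stated assumptions, not part of the original examples: the helper name buildResponse is hypothetical, it terminates header lines with CRLF as HTTP/1.1 specifies (the example above writes a bare \n, which lenient clients accept), and it computes Content-Length from the byte length of the body rather than its character count:

```javascript
// Hypothetical helper: builds a complete HTTP/1.1 response as one Buffer.
// `body` may be a string or a Buffer.
function buildResponse(status, reason, type, body) {
  var payload = Buffer.isBuffer(body) ? body : Buffer.from(String(body), 'utf8');
  var head = 'HTTP/1.1 ' + status + ' ' + reason + '\r\n' +
    'Content-Type: ' + type + '\r\n' +
    // Content-Length counts bytes, which differs from the string length
    // as soon as the body contains non-ASCII characters.
    'Content-Length: ' + payload.length + '\r\n' +
    'Connection: close\r\n' +
    '\r\n';
  return Buffer.concat([Buffer.from(head, 'ascii'), payload]);
}

console.log(buildResponse(404, 'Not Found', 'text/plain', 'Not Found').toString());
```

With such a helper, each branch of the connection handler could be reduced to a single con.end(buildResponse(...)) call.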
Download examples
I posted all of the examples on GitHub with more explanation.
Examples with express.static, express, connect, http and net:
https://github.com/rsp/node-static-http-servers
Other project using only express.static:
https://github.com/rsp/node-express-static-example
Tests
Test results are available on Travis:
https://travis-ci.org/rsp/node-static-http-servers
Everything is tested on Node versions 4, 5, 6, and 7.
See also
Other related answers:
Failed to load resource from same directory when redirecting Javascript
onload js call not working with node
Sending whole folder content to client with express
Loading partials fails on the server JS
Node JS not serving the static image
I agree with the other posters that eventually you should use a framework such as Express, but first you should understand how to do something fundamental like this without a library, to really understand what the library abstracts away for you. The steps are:
Parse the incoming HTTP request, to see which path the user is asking for
Add a branch in a conditional statement for each path the server should respond to
If the image is requested, read the image file from the disk.
Serve the image content-type in a header
Serve the image contents in the body
The code would look something like this (not tested)
var fs = require('fs');
var http = require('http');
var url = require('url');
http.createServer(function(req, res){
var request = url.parse(req.url, true);
var action = request.pathname;
if (action == '/logo.gif') {
var img = fs.readFileSync('./logo.gif');
res.writeHead(200, {'Content-Type': 'image/gif' });
res.end(img, 'binary');
} else {
res.writeHead(200, {'Content-Type': 'text/plain' });
res.end('Hello World \n');
}
}).listen(8080, '127.0.0.1');
You should use the express framework.
npm install express
and then
var express = require('express');
var app = express();
app.use(express.static(__dirname + '/public'));
app.listen(8080);
and then the URL localhost:8080/images/logo.gif should work.
var http = require('http');
var fs = require('fs');
http.createServer(function(req, res) {
res.writeHead(200,{'content-type':'image/jpeg'});
fs.createReadStream('./image/demo.jpg').pipe(res);
}).listen(3000);
console.log('server running at 3000');
This is late, but it may help someone. I'm using Node v7.9.0 and Express 4.15.0.
If your directory structure looks something like this:
your-project
uploads
package.json
server.js
server.js code:
var express = require('express');
var app = express();
// Serve the files in `uploads` from the site root.
// You can access an image using this URL: http://localhost:7000/abc.jpg
// Make sure `abc.jpg` is present in the `uploads` dir.
app.use(express.static(__dirname + '/uploads'));
// Or mount the directory under another path to hide the real directory name.
// You can then access the image using this URL: http://localhost:7000/images/abc.jpg
app.use('/images', express.static(__dirname + '/uploads/'));
app.listen(7000);
Vanilla node version as requested:
var http = require('http');
var url = require('url');
var path = require('path');
var fs = require('fs');
http.createServer(function(req, res) {
// parse url
var request = url.parse(req.url, true);
var action = request.pathname;
// disallow non get requests
if (req.method !== 'GET') {
res.writeHead(405, {'Content-Type': 'text/plain' });
res.end('405 Method Not Allowed');
return;
}
// routes
if (action === '/') {
res.writeHead(200, {'Content-Type': 'text/plain' });
res.end('Hello World \n');
return;
}
// static (note not safe, use a module for anything serious)
var filePath = path.join(__dirname, action).split('%20').join(' ');
fs.exists(filePath, function (exists) {
if (!exists) {
// 404 missing files
res.writeHead(404, {'Content-Type': 'text/plain' });
res.end('404 Not Found');
return;
}
// set the content type
var ext = path.extname(action);
var contentType = 'text/plain';
if (ext === '.gif') {
contentType = 'image/gif'
}
res.writeHead(200, {'Content-Type': contentType });
// stream the file as raw bytes (a text encoding such as 'utf-8' would corrupt binary images)
fs.createReadStream(filePath).pipe(res);
});
}).listen(8080, '127.0.0.1');
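The split('%20').join(' ') step in the code above only handles encoded spaces. A more general approach is to decode every percent escape with the built-in decodeURIComponent, which throws a URIError on malformed input. The sketch below is an illustration only; the helper name decodePathname is hypothetical:

```javascript
// Hypothetical helper: decodes all percent escapes in a URL pathname.
// Returns null for malformed input so the caller can answer with 400.
function decodePathname(pathname) {
  try {
    return decodeURIComponent(pathname);
  } catch (e) {
    return null;
  }
}

console.log(decodePathname('/my%20logo.gif')); // "/my logo.gif"
console.log(decodePathname('/%E0%A4%A'));      // null (malformed escape)
```

Decoding can reveal ".." segments that were hidden as %2e%2e, so any check that the file stays inside the served directory has to run on the decoded path.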
I like using Restify for REST services. In my case, I had created a REST service to serve up images and then if an image source returned 404/403, I wanted to return an alternative image. Here's what I came up with combining some of the stuff here:
function processRequest(req, res, next, url) {
var httpOptions = {
hostname: host,
path: url,
port: port,
method: 'GET'
};
var reqGet = http.request(httpOptions, function (response) {
var statusCode = response.statusCode;
// Many images come back as 404/403 so check explicitly
if (statusCode === 404 || statusCode === 403) {
// Send default image if error
var file = 'img/user.png';
fs.stat(file, function (err, stat) {
var img = fs.readFileSync(file);
res.contentType = 'image/png';
res.contentLength = stat.size;
res.end(img, 'binary');
});
} else {
// Collect the raw chunks; no encoding is set, so each chunk is a Buffer.
// This also works when the upstream response has no Content-Length header.
var chunks = [];
response.on('data', function (chunk) {
chunks.push(chunk);
});
response.on('end', function () {
res.contentType = 'image/jpeg';
res.send(Buffer.concat(chunks));
});
}
});
reqGet.on('error', function (e) {
// Send default image if error
var file = 'img/user.png';
fs.stat(file, function (err, stat) {
var img = fs.readFileSync(file);
res.contentType = 'image/png';
res.contentLength = stat.size;
res.end(img, 'binary');
});
});
reqGet.end();
return next();
}
This may be a bit off-topic, since you are asking specifically about static file serving via Node.js (where fs.createReadStream('./image/demo.jpg').pipe(res) is actually a good idea), but in production you may want your Node app to handle only the tasks that cannot be tackled otherwise, and off-load static serving to, e.g., Nginx.
This means less code inside your app and better efficiency, since reverse proxies are designed for this.
This method works for me; it's not dynamic, but it's straight to the point:
const fs = require('fs');
const express = require('express');
const app = express();
app.get( '/logo.gif', function( req, res ) {
fs.readFile( 'logo.gif', function( err, data ) {
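if ( err ) {
console.log( err );
// end the response so the client is not left waiting
res.statusCode = 500;
return res.end();
}
res.setHeader( 'Content-Type', 'image/gif' );
res.write( data );
return res.end();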
});
});
app.listen( 80 );
Let me just add to the answers above that optimizing images and serving responsive images helps page loading times dramatically, since >90% of web traffic is images. You might want to pre-process images using JS / Node modules such as imagemin and related plug-ins, ideally during the build process with Grunt or Gulp.
Optimizing images means finding an ideal image type and selecting optimal compression to achieve a balance between image quality and file size.
Serving responsive images means automatically creating several sizes and formats of each image; using srcset in your HTML then lets every browser pick the optimal image (that is, the ideal format and dimensions, and thus the optimal file size).
Automating image processing during the build process means setting it up once and serving optimized images from then on, with minimal extra time required.
There are some great reads on responsive images, minification in general, the imagemin node module and using srcset.
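To illustrate the srcset idea, here is a small sketch that builds the attribute value from a list of pre-generated widths. The file naming scheme (name-width.ext) and the helper name buildSrcset are assumptions made for this example; a real build step may name its output differently:

```javascript
// Hypothetical helper: builds a srcset value for images that a build step
// has already produced as <name>-<width>.<ext> (an assumed naming scheme).
function buildSrcset(name, ext, widths) {
  return widths.map(function (w) {
    // Each candidate is "<url> <width>w"; the browser picks the best fit.
    return name + '-' + w + '.' + ext + ' ' + w + 'w';
  }).join(', ');
}

console.log(buildSrcset('logo', 'png', [320, 640, 1280]));
// logo-320.png 320w, logo-640.png 640w, logo-1280.png 1280w
```

The resulting string goes into the srcset attribute of an img element, alongside a plain src as the fallback.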
// This method involves directly integrating HTML code in res.write
// first time posting to Stack ... pls be kind
const express = require('express');
const app = express();
// placeholder: replace with the URL or path of your image
const imageUrl = 'image url / src';
app.get("/", function (req, res) {
  res.write('<img src="' + imageUrl + '">');
  // res.end() finishes the response; res.send() cannot be used after res.write()
  res.end();
});
app.listen(3000, function () {
  console.log("the server is onnnn");
});
import http from "node:http";
import fs from "node:fs";
const app = http.createServer((req, res)=>{
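if(req.url === "/" && req.method === "GET"){
fs.readFile("sending.jpg", (err, data)=>{
if(err){
// report the failure instead of throwing, which would crash the server
res.writeHead(500, {
"Content-Type" : "text/plain"
})
res.end("Internal Server Error");
}else{
res.writeHead(200, {
"Content-Type" : "image/jpeg"
})
res.end(data);
}
})
}else{
// answer every other request so the client is not left waiting
res.writeHead(404);
res.end();
}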
}).listen(8080, ()=>{
console.log(8080)
})
I am trying to build a Node.js app to monitor some Raspberry Pis.
Since those Raspberry Pis don't have a static IP, they send a UDP broadcast every 5 seconds.
I'm able to catch that broadcast with Node.js, but I'm failing to trigger a new function to notify the Node.js clients.
I tried WebSockets, Server-Sent Events and Socket.io.
I'm able to use example code, and it works just fine.
But I'm not experienced enough to build a function which will send data to the clients.
Node.js App:
// ==============================================================================================================
// ===== Dependencies ===========================================================================================
// ==============================================================================================================
var dgram = require('dgram');
var http = require('http');
var url = require("url");
var path = require("path");
var fs = require("fs");
// ==============================================================================================================
// ===== HTTP Serv ==============================================================================================
// ==============================================================================================================
var server = http.createServer(function(request, response) {
var uri = url.parse(request.url).pathname, filename = path.join(process.cwd(), uri);
var contentTypesByExtension = {
'.html': "text/html",
'.css': "text/css",
'.js': "text/javascript",
'.svg': "image/svg+xml"
};
fs.exists(filename, function(exists) {
if(!exists) {
response.writeHead(404, {"Content-Type": "text/plain"});
response.write("404 Not Found\n");
response.end();
return;
}
if (fs.statSync(filename).isDirectory()) filename += '/index.html';
fs.readFile(filename, "binary", function(err, file) {
if(err) {
response.writeHead(500, {"Content-Type": "text/plain"});
response.write(err + "\n");
response.end();
return;
}
var headers = {};
var contentType = contentTypesByExtension[path.extname(filename)];
if (contentType) headers["Content-Type"] = contentType;
response.writeHead(200, headers);
response.write(file, "binary");
response.end();
});
});
});
// ==============================================================================================================
// ===== HeartBeat Broadcast ====================================================================================
// ==============================================================================================================
var bcast = dgram.createSocket('udp4');
bcast.on('message', function (message) {
console.log("Triggered: UDP Broadcast");
// If UDP Broadcast is received, send message/data to client.
});
bcast.bind(5452, "0.0.0.0");
// ==============================================================================================================
// ===== Start Server ===========================================================================================
// ==============================================================================================================
server.listen(80);
console.log("Static file server running/\nCTRL + C to shutdown");
EDIT:
I think I did not explain myself accurately enough.
I do not want to send a UDP message back.
This UDP broadcast should fire a (Node.js) event, which should update the HTML and display the Raspberry Pi (which sent the UDP packet) as online.
EDIT:
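From the documentation on the official Node.js page (DOCUMENTATION):
// attach socket.io to the http.Server instance (`server` in the question), not to the http module
var socket = require('socket.io')(server);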
var bcast = dgram.createSocket('udp4');
bcast.bind(5452, "0.0.0.0");
bcast.on('message', function (message, remote) {
////if message is an Object pushed into Buffer////
message = message.toString('utf8');
socket.emit("HTML_Update", message);
//////////////////////////////////Solution for unedited question//////////////////////////
// var msgBuffer = Buffer.from(message.toString(); //creating a buffer //
// bcast.send(msgBuffer, 0, msgBuffer.length, remote.port, remote.address, (err) => { //
// bcast.close(); //
// }); //sending message to remote.address:remote.port (like localhost:23456) //
// //
// **build a function which will send data to the clients** //
//////////////////////////////////Solution for unedited question//////////////////////////
});
"If message is an Object pushed into Buffer": let's say that one of the RPis turned on and started sending UDP messages. What should the message carry so that the server can pass it on for display? The MAC address only: if an RPi sends something, you can be sure it is on; if it does not send, it is off, simple as that. Also, to show that change on the client, you should initialize TCP sockets on the server to pass the info to the server's web page, and update the HTML content with jQuery.
Now here is the HTML JavaScript part (I personally make a main.js file, write all the JavaScript in it, and include it in the HTML via src). Using jQuery in main.js:
$(document).ready(function() {
  var now = Date.now();
  var rpi = {
    "list" : ["mac1", "mac2", "mac3"],
    "time" : [now, now, now],
    // element IDs of the labels, used below as "#id" selectors
    "label" : ["label_ID1", "label_ID2", "label_ID3"]};
  var socket = io.connect('http://your_server_address:80');
  setInterval( function(){
    for (var i = 0; i < rpi.list.length; i++){
      // read the current time on every check; a Date captured once never advances
      if((rpi.time[i] + 10000) < Date.now()){
        $("#" + rpi.label[i]).text("RPI " + rpi.list[i] + " is DOWN");
      }
    }
  }, 5000);
  socket.on("HTML_Update", function(data){
    for (var i = 0; i < rpi.list.length; i++) {
      // JavaScript strings are compared with ===; there is no .equals()
      if (data.toString() === rpi.list[i]) {
        $("#" + rpi.label[i]).text("RPI: "+ rpi.list[i] + " is UP");
        rpi.time[i] = Date.now();
      }
    }
  });
});
If you put a text label in the HTML to show whether a specific RPi is up or down, this code works in the following scheme:
Multiple RPis + server: each RPi sends UDP data with its MAC address to the server. The server receives the data, shows it as a web page on any device, and updates it when an RPi goes UP/DOWN.
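The up/down rule described above (a device counts as up while its last heartbeat is recent enough) can be separated from the socket and DOM code. The following is a minimal sketch under stated assumptions: the name createTracker and the explicit `now` parameter are hypothetical and chosen to make the rule easy to test; the 10-second timeout matches the value used in the client code above:

```javascript
// Hypothetical helper: remembers when each device was last heard from and
// reports it as up while that heartbeat is at most `timeoutMs` old.
function createTracker(timeoutMs) {
  var lastSeen = Object.create(null); // device id (e.g. MAC address) -> timestamp in ms
  return {
    // Call this whenever a UDP heartbeat arrives from `id`.
    seen: function (id, now) {
      lastSeen[id] = now;
    },
    // A device that never sent anything is reported as down.
    isUp: function (id, now) {
      return id in lastSeen && now - lastSeen[id] <= timeoutMs;
    }
  };
}

var tracker = createTracker(10000);
tracker.seen('mac1', Date.now());
console.log(tracker.isUp('mac1', Date.now())); // true
console.log(tracker.isUp('mac2', Date.now())); // false
```

On the server, seen() would be called from the UDP 'message' handler, and isUp() from whatever code pushes status updates to the browsers.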
I have the following Node application
var express = require("express"),
app = express();
app.get("/api/time", function(req, res) {
sendSSE(req, res);
});
function sendSSE(req, res) {
res.set({
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"Access-Control-Allow-Origin": "*"
});
var id = (new Date()).toLocaleTimeString();
setInterval(function() {
constructSSE(res, id, (new Date()).toLocaleTimeString());
}, 5000);
constructSSE(res, id, (new Date()).toLocaleTimeString());
};
function constructSSE(res, id, data) {
res.write("id: " + id + "\n");
res.write("data: " + data + "\n\n");
}
var server = app.listen(8081, function() {
});
I am using it to provide Server-Sent Events to my client app. When I browse to http://localhost:8081/api/time it starts returning straight away. If I open the URI in another browser window then it will take several seconds before it responds, however then it works fine.
So my question is: is setInterval blocking, or is there some other explanation for the poor performance? Based on this answer it is not supposed to be, and I would not expect constructSSE to take 5 seconds. However, I am seeing an issue.
Thanks.
Update
As suggested that it might be something to do with express, I removed it and just used the http module.
var http = require("http");
http.createServer(function(req, res){
if (req.url == "/api/time") {
sendSSE(req, res);
}
}).listen(8081);
function sendSSE(req, res) {
res.writeHead(200, {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"Access-Control-Allow-Origin": "*"
});
var id = (new Date()).toLocaleTimeString();
setInterval(function() {
constructSSE(res, id, (new Date()).toLocaleTimeString());
}, 5000);
constructSSE(res, id, (new Date()).toLocaleTimeString());
};
function constructSSE(res, id, data) {
res.write("id: " + id + "\n");
res.write("data: " + data + "\n\n");
};
It has the exact same behaviour. So it looks like some limitation of Node, or a mistake by me.
I think the reason for the poor performance is not related to Node or setInterval itself but to the way you read your event data. I did a little searching and implemented 3 or 4 different examples of Server-Sent Events on Node found on the web, and they all suffer from the same problem.
The idea that Node.js was no good for the task didn't sit right with me.
I tried
Using process.nextTick
Increasing the interval
Using setTimeout instead of setInterval
With express.js or just node.js
With some node modules like connect-sse and sse-stream
I was about to start testing with webworker-threads, the cluster module or another server platform to pin down the problem, and then I had the idea to write a little page to grab the events using the EventSource API. When I did this, everything worked just fine. Moreover, in the previous tests using Chrome I saw that in the Network tab the SSE request contains an additional tab called EventStream, as expected, but its content was empty even when data was arriving regularly.
This led me to believe that maybe the browser was interpreting the request in the wrong way because it was made from the address bar and not through the EventSource API. I don't know the reasons for this, but I made a little example and there is absolutely no poor performance.
I changed your code like this.
Added another route to the root of website for testing
var express = require("express"),
fs = require("fs"),
app = express();
// Send an html file for testing
app.get("/", function (req, res, next) {
res.setHeader('content-type', 'text/html')
fs.readFile('./index.html', function(err, data) {
res.send(data);
});
});
app.get("/api/time", function(req, res) {
sendSSE(req, res);
});
Created an index.html file
<head>
<title>Server-Sent Events Demo</title>
<meta charset="UTF-8" />
<script>
document.addEventListener("DOMContentLoaded", function () {
var es = new EventSource('/api/time'),
closed = false,
ul = document.querySelector('ul');
es.onmessage = function (ev) {
var li;
if (closed) return;
li = document.createElement("li");
li.innerHTML = "message: " + ev.data;
ul.appendChild(li);
};
es.addEventListener('end', function () {
es.close()
closed = true
}, true);
es.onerror = function (e) {
closed = true
};
}, false);
</script>
</head>
<body>
<ul>
</ul>
</body>
{Edit}
I also want to point out, thanks to @Rodrigo Medeiros, that making a request to /api/time with curl shows no poor performance, which reinforces the idea that it is a browser-related issue.
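Independently of the browser behaviour discussed above, two details of the server code are worth sketching: the event framing, and stopping the timer once the client goes away (the question's sendSSE never clears its setInterval). This is a hedged sketch, not the original author's code; formatSSE and streamTime are hypothetical names, and unlike the question's code it uses a counter as the event id:

```javascript
// Formats one event for a text/event-stream response: an "id:" line, one
// "data:" line per line of payload, and a blank line that ends the event.
function formatSSE(id, data) {
  var lines = String(data).split(/\r?\n/).map(function (line) {
    return 'data: ' + line + '\n';
  });
  return 'id: ' + id + '\n' + lines.join('') + '\n';
}

// Hypothetical variant of sendSSE: writes the time every `intervalMs`
// and clears the timer when the response is closed.
function streamTime(res, intervalMs) {
  var id = 0;
  function tick() {
    res.write(formatSSE(++id, new Date().toLocaleTimeString()));
  }
  var timer = setInterval(tick, intervalMs);
  // 'close' fires on the response when the connection is terminated.
  res.on('close', function () {
    clearInterval(timer);
  });
  tick(); // send the first event immediately
}
```

In the handler from the question, streamTime(res, 5000) would be called after the event-stream headers have been written.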
I'm using PhantomJS and CasperJS to automate some of my tasks. In one of the task, I need to manually provide captcha strings before I can actually work on the task. For this problem, what I can think of is to capture a screenshot of the web page, then manually check the captured image and save the captcha string into a text file. After that I can use the file system module in CasperJS to read that value and continue to do the process. I want to know what's the best way to do this kind of tasks.
Because of the structured control flow of CasperJS compared to PhantomJS, such a task is not easy.
1. Pull approach (file polling)
Let's say there is a secondary program (type 1) which handles showing the CAPTCHA, receiving the input and writing a text file with the CAPTCHA input. All that CasperJS can handle is to write the CAPTCHA screenshot to disk and wait for the file with the "parsed" text.
var fs = require("fs"),
captchaFile = "cfile.png",
parsedFile = "pfile.txt";
casper.waitForCaptcha = function(captchaFile, parsedFile){
casper.then(function(){
this.captureSelector(captchaFile, "someSelectorOfTheCaptcha");
});
casper.waitFor(function check(){
return fs.exists(parsedFile);
}, function then(){
// do something on time
// check if correct...
if (!correct) {
fs.remove(captchaFile);
fs.remove(parsedFile);
this.waitForCaptcha(captchaFile, parsedFile);
// Problem: the secondary process needs to sense that a new CAPTCHA is presented
}
}, function onTimeout(){
// do something when failed
}, 60000); // 1min should suffice as a timeout
return this;
};
casper.start(url).waitForCaptcha(captchaFile, parsedFile).run();
This code assumes that you want to retry when the CAPTCHA was wrong, but not when the minute has passed without the decoded file appearing. This is a pull process that polls for the existence of the file.
2. Push approach
A push process is also possible where the secondary program (type 2) sends requests to the CasperJS process by utilizing the PhantomJS webserver module. Because there will be two concurrent control flows, the CasperJS part needs to wait a long time, but as soon as a request is received with the decoded words the waiting can be broken with unwait.
var server = require('webserver').create(),
fs = require("fs"),
captchaFile = "cfile.png";
function neverendingWait(){
this.wait(5000, neverendingWait);
}
casper.checkCaptcha = function(captchaFile, phantomPort, secondaryPort){
// here the CAPTCHA is saved to disk but it can also be set directly if captured through casper.captureBase64
this.captureSelector(captchaFile, "someSelectorOfTheCaptcha");
// send request to the secondary program from the page context
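// the page context cannot see outer variables, so the port is passed in as an argument
this.evaluate(function(file, port){
__utils__.sendAJAX("http://localhost:"+port+"/", "POST", {file: file}, true);
}, captchaFile, secondaryPort);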
// start the server to receive solved CAPTCHAs
server.listen(phantomPort, {
'keepAlive': true
}, function (request, response) {
console.log('Request received at ' + new Date());
if (request.post) { // is there a response?
this.then(function(){
// check if it is correct by reading request.post ...
if (!correct){
response.statusCode = 404;
response.headers = {
'Cache': 'no-cache',
'Content-Type': 'text/plain;charset=utf-8'
};
response.close();
server.close();
this.checkCaptcha(captchaFile, phantomPort, secondaryPort);
} else {
response.statusCode = 200;
response.headers = {
'Cache': 'no-cache',
'Content-Type': 'text/plain;charset=utf-8'
};
response.close();
server.close();
this.unwait(); // abort the neverendingWait
}
});
} else {
response.statusCode = 404;
response.headers = {
'Cache': 'no-cache',
'Content-Type': 'text/plain;charset=utf-8'
};
response.close();
server.close();
this.checkCaptcha(captchaFile, phantomPort, secondaryPort);
}
});
return this;
};
casper.start(url).then(function(){
this.checkCaptcha(captchaFile, 8080, 8081);
}).then(neverendingWait).then(function(){
// Do something here when the captcha is successful
}).run();
Blocking or not blocking, the question is now:
Here is a simple route exposing a folder where the server stores temp images. This method just returns the image, and that's it.
app.get('/uploads/fullsize/:file',function (req, res){
var file = req.params.file;
console.log("Crap comign from passport file: " + file)
var img = fs.readFileSync(myPath + "/uploads/fullsize/" + file);
res.writeHead(200, {'Content-Type': 'image/jpg' });
res.end(img, 'binary');
} );
I am concerned with the following line:
var img = fs.readFileSync(myPath + "/uploads/fullsize/" + file);
That appears to be a sync call. Should I change it to async?
fs.readFile(req.files.file.path, function (err, imageBinaryData) {
//read code here
});
Is this a valid concern, or am I overreacting? Am I going to block if, say, 1000 concurrent users are doing the same thing?
Yes, we should make async what we can.
"readFile" is fine! But this might not be the most important part:
Additionally, the path says "fullsize", so you should think about streaming the files.
You spoke about 1000 concurrent users, and it depends on how big the images are:
An async readFile will load the whole file into memory. What if you have 1000 users, each loading >8MB at the same time? Your server's memory might be "full".
For "streaming" I can recommend this video:
Node.js - streaming 25GB text file
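As a rough illustration of the streaming approach, the sketch below pipes a file to the response in chunks instead of loading it into memory first. The helper name streamFile is hypothetical, and `res` is assumed to be an http.ServerResponse (or an Express response, which extends it):

```javascript
var fs = require('fs');

// Hypothetical helper: streams `file` to `res` with the given content type.
// Only a small buffer per request is held in memory at any time.
function streamFile(file, type, res) {
  var s = fs.createReadStream(file);
  s.on('open', function () {
    // Headers are set only once the file is known to be readable.
    res.setHeader('Content-Type', type);
    s.pipe(res);
  });
  s.on('error', function () {
    if (res.headersSent) {
      // The failure happened mid-stream; all that is left is to stop.
      return res.end();
    }
    res.statusCode = 404;
    res.setHeader('Content-Type', 'text/plain');
    res.end('Not found');
  });
}
```

In the route from the question, this would replace the readFileSync call, for example streamFile(myPath + "/uploads/fullsize/" + file, 'image/jpeg', res), where myPath and file are the variables from the question.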
Yes, you should change that to an async call. I recommend using the Q library to make this call, as well as other async calls.
Example (straight from Q docs):
var readFile = Q.denodeify(FS.readFile);
Then use it as such:
readFile("foo.txt", "utf-8")
.then(function(data) {
//other processing
});
Or adapted to your example:
app.get('/uploads/fullsize/:file',function (req, res){
// `fs` is the module variable used in the question
var readFile = Q.denodeify(fs.readFile);
var file = req.params.file;
console.log("Crap comign from passport file: " + file)
readFile(myPath + "/uploads/fullsize/" + file)
.then(function(img) {
res.writeHead(200, {'Content-Type': 'image/jpg' });
res.end(img, 'binary');
})
.fail(function(err) {
res.status(500).send({message: err.message});
});
} );