Reading a file in real-time using Node.js

I need to work out the best way to read data that is being written to a file, using node.js, in real time. Trouble is, Node is a fast-moving ship, which makes finding the best method for addressing a problem difficult.
What I Want To Do
I have a Java process that is doing something and then writing the results of this thing it does to a text file. It typically takes anything from 5 minutes to 5 hours to run, with data being written the whole time, and it can reach some fairly hefty throughput rates (circa 1,000 lines/sec).
I would like to read this file in real time, and then, using Node, aggregate the data and write it to a socket where it can be graphed on the client.
The client, graphs, sockets and aggregation logic are all done but I am confused about the best approach for reading the file.
What I Have Tried (or at least played with)
FIFO - I can tell my Java process to write to a FIFO and read it using Node; this is in fact how we currently have this implemented using Perl, but because everything else is running in Node it makes sense to port the code over.
Unix Sockets - As above.
fs.watchFile - will this work for what we need?
fs.createReadStream - is this better than watchFile?
fs & tail -f - seems like a hack.
What, actually, is my Question
I am tending towards using Unix sockets, as this seems the fastest option. But does Node have better built-in features for reading files from the filesystem in real time?

If you want to keep the file as a persistent store of your data, to prevent losing the stream if the system crashes or one of the processes in your chain dies, you can still keep writing to the file and reading from it.
If you do not need this file as persistent storage of the results produced by your Java process, then going with a Unix socket is much better, both for ease of use and for performance.
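For illustration, here is a minimal sketch of the Unix-socket approach (the socket path and the newline-delimited format are assumptions, not something from the question; the Java process would connect to this socket and write its lines there):
var net = require('net');

// Sketch only: aggregate newline-delimited records arriving on a Unix socket.
var server = net.createServer(function (conn) {
    var leftover = '';
    conn.setEncoding('utf8');
    conn.on('data', function (chunk) {
        var lines = (leftover + chunk).split('\n');
        leftover = lines.pop(); // keep any trailing partial line for the next chunk
        lines.forEach(function (line) {
            // aggregate the line here and push it to the websocket layer
            console.log('got line:', line);
        });
    });
});

server.listen('/tmp/results.sock', function () {
    console.log('listening on /tmp/results.sock');
});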
fs.watchFile() is not what you need, because it works on file stats as the filesystem reports them, and since you want to read the file as it is being written, this is not what you want.
SHORT UPDATE: I am very sorry to realize that although I accused fs.watchFile() of using file stats in the previous paragraph, I had done the very same thing myself in my example code below! Although I had already warned readers to "take care!" because I wrote it in just a few minutes without even testing it well, it can still be done better by using fs.watch() instead of watchFile or fstatSync, if the underlying system supports it.
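To sketch what that fs.watch() variant might look like (this is not part of the original answer; it reuses the test-read-write.txt file from the demo below and skips proper error handling):
var fs = require('fs');

var readbytes = 0;
fs.open('test-read-write.txt', 'r', function (err, fd) {
    if (err) throw err;
    // Re-check the file whenever the OS reports a change, instead of polling on a timer.
    fs.watch('test-read-write.txt', function () {
        var size = fs.fstatSync(fd).size;
        if (size <= readbytes) return; // nothing new yet
        var buf = new Buffer(size - readbytes);
        fs.read(fd, buf, 0, buf.length, readbytes, function (err, n, b) {
            if (err) throw err;
            readbytes += n;
            console.log(b.toString('utf-8', 0, n));
        });
    });
});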
For reading from and writing to a file, here is what I just wrote for fun during my break:
test-fs-writer.js: [You will not need this, since you write the file from your Java process]
var fs = require('fs'),
    lineno = 0;

var stream = fs.createWriteStream('test-read-write.txt', {flags: 'a'});

stream.on('open', function() {
    console.log('Stream opened, will start writing in 2 secs');
    setInterval(function() { stream.write((++lineno) + ' oi!\n'); }, 2000);
});
test-fs-reader.js: [Take care, this is just a demonstration; check the err objects!]
var fs = require('fs'),
    bite_size = 256,
    readbytes = 0,
    file;

fs.open('test-read-write.txt', 'r', function(err, fd) { file = fd; readsome(); });

function readsome() {
    var stats = fs.fstatSync(file); // yes, sometimes async does not make sense!
    if (stats.size < readbytes + 1) {
        console.log('Hehe I am much faster than your writer..! I will sleep for a while, I deserve it!');
        setTimeout(readsome, 3000);
    }
    else {
        fs.read(file, new Buffer(bite_size), 0, bite_size, readbytes, processsome);
    }
}

function processsome(err, bytecount, buff) {
    console.log('Read', bytecount, 'and will process it now.');
    // Here we will process our incoming data:
    // Do whatever you need. Just be careful about not using beyond the bytecount in buff.
    console.log(buff.toString('utf-8', 0, bytecount));
    // So we continue reading from where we left off:
    readbytes += bytecount;
    process.nextTick(readsome);
}
You can safely avoid using nextTick and call readsome() directly instead. Since we are still working sync here, it is not necessary in any sense. I just like it. :p
EDIT by Oliver Lloyd
Taking the example above but extending it to read CSV data gives:
var lastLineFeed,
    lineArray,
    valueArray = []; // rows collected for use elsewhere

function processsome(err, bytecount, buff) {
    lastLineFeed = buff.toString('utf-8', 0, bytecount).lastIndexOf('\n');
    if (lastLineFeed > -1) {
        // Split the buffer by line
        lineArray = buff.toString('utf-8', 0, bytecount).slice(0, lastLineFeed).split('\n');
        // Then split each line by comma
        for (var i = 0; i < lineArray.length; i++) {
            // Add read rows to an array for use elsewhere
            valueArray.push(lineArray[i].split(','));
        }
        // Set a new position to read from
        readbytes += lastLineFeed + 1;
    } else {
        // No complete lines were read
        readbytes += bytecount;
    }
    process.nextTick(readsome); // continue the reader loop from the example above
}

Why do you think tail -f is a hack?
While figuring this out I found a good example; I would do something similar.
Real time online activity monitor example with node.js and WebSocket:
http://blog.new-bamboo.co.uk/2009/12/7/real-time-online-activity-monitor-example-with-node-js-and-websocket
Just to make this answer complete, I wrote you some example code that runs under 0.8.0 (the HTTP server is maybe a bit of a hack).
A child process is spawned running tail, and since a child process is an EventEmitter with three streams (we use stdout in our case), you can just add a listener with on.
filename: tailServer.js
usage: node tailServer /var/log/filename.log
var http = require("http");
var filename = process.argv[2];
if (!filename)
return console.log("Usage: node tailServer filename");
var spawn = require('child_process').spawn;
var tail = spawn('tail', ['-f', filename]);
http.createServer(function (request, response) {
console.log('request starting...');
response.writeHead(200, {'Content-Type': 'text/plain' });
tail.stdout.on('data', function (data) {
response.write('' + data);
});
}).listen(8088);
console.log('Server running at http://127.0.0.1:8088/');

This module is an implementation of the principle @hasanyasin suggests:
https://github.com/felixge/node-growing-file
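If I remember its README correctly, usage looks roughly like this; treat the exact API and option names as assumptions and verify them against the module's documentation:
// Rough usage sketch for node-growing-file, written from memory - verify against the README.
var GrowingFile = require('growing-file');

var file = GrowingFile.open('/var/log/results.txt'); // hypothetical path
file.on('data', function (chunk) {
    // chunk is data appended to the file since the last read
    console.log(chunk.toString());
});
file.on('end', function () {
    console.log('file stopped growing');
});
file.on('error', function (err) {
    console.error(err);
});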

I took the answer from @hasanyasin and wrapped it up in a module that returns a promise. The basic idea is that you pass a file and a handler function that does something with the stringified buffer that is read from the file. If the handler function returns true, then the file will stop being read. You can also set a timeout that will kill reading if the handler doesn't return true fast enough.
The promise will resolve to true if resolve() was called due to the timeout; otherwise it will resolve to false.
See the bottom for a usage example.
// https://stackoverflow.com/a/11233045
var fs = require('fs');
var Promise = require('promise');
class liveReaderPromiseMe {
    constructor(file, buffStringHandler, opts) {
        /*
        var opts = {
            starting_position: 0,
            byte_size: 256,
            check_for_bytes_every_ms: 3000,
            no_handler_resolution_timeout_ms: null
        };
        */

        if (file == null) {
            throw new Error("file arg must be present");
        } else {
            this.file = file;
        }

        if (buffStringHandler == null) {
            throw new Error("buffStringHandler arg must be present");
        } else {
            this.buffStringHandler = buffStringHandler;
        }

        if (opts == null) {
            opts = {};
        }

        if (opts.starting_position == null) {
            this.current_position = 0;
        } else {
            this.current_position = opts.starting_position;
        }

        if (opts.byte_size == null) {
            this.byte_size = 256;
        } else {
            this.byte_size = opts.byte_size;
        }

        if (opts.check_for_bytes_every_ms == null) {
            this.check_for_bytes_every_ms = 3000;
        } else {
            this.check_for_bytes_every_ms = opts.check_for_bytes_every_ms;
        }

        if (opts.no_handler_resolution_timeout_ms == null) {
            this.no_handler_resolution_timeout_ms = null;
        } else {
            this.no_handler_resolution_timeout_ms = opts.no_handler_resolution_timeout_ms;
        }
    }

    startHandlerTimeout() {
        if (this.no_handler_resolution_timeout_ms && (this._handlerTimer == null)) {
            var that = this;
            this._handlerTimer = setTimeout(
                function() {
                    that._is_handler_timed_out = true;
                },
                this.no_handler_resolution_timeout_ms
            );
        }
    }

    clearHandlerTimeout() {
        if (this._handlerTimer != null) {
            clearTimeout(this._handlerTimer);
            this._handlerTimer = null;
        }
        this._is_handler_timed_out = false;
    }

    isHandlerTimedOut() {
        return !!this._is_handler_timed_out;
    }

    fsReadCallback(err, bytecount, buff) {
        try {
            if (err) {
                throw err;
            } else {
                this.current_position += bytecount;
                var buff_str = buff.toString('utf-8', 0, bytecount);

                var that = this;

                Promise.resolve().then(function() {
                    return that.buffStringHandler(buff_str);
                }).then(function(is_handler_resolved) {
                    if (is_handler_resolved) {
                        that.resolve(false);
                    } else {
                        process.nextTick(that.doReading.bind(that));
                    }
                }).catch(function(err) {
                    that.reject(err);
                });
            }
        } catch(err) {
            this.reject(err);
        }
    }

    fsRead(bytecount) {
        fs.read(
            this.file,
            new Buffer(bytecount),
            0,
            bytecount,
            this.current_position,
            this.fsReadCallback.bind(this)
        );
    }

    doReading() {
        if (this.isHandlerTimedOut()) {
            return this.resolve(true);
        }

        var max_next_bytes = fs.fstatSync(this.file).size - this.current_position;
        if (max_next_bytes) {
            this.fsRead((this.byte_size > max_next_bytes) ? max_next_bytes : this.byte_size);
        } else {
            setTimeout(this.doReading.bind(this), this.check_for_bytes_every_ms);
        }
    }

    promiser() {
        var that = this;
        return new Promise(function(resolve, reject) {
            that.resolve = resolve;
            that.reject = reject;
            that.doReading();
            that.startHandlerTimeout();
        }).then(function(was_resolved_by_timeout) {
            that.clearHandlerTimeout();
            return was_resolved_by_timeout;
        });
    }
}

module.exports = function(file, buffStringHandler, opts) {
    try {
        var live_reader = new liveReaderPromiseMe(file, buffStringHandler, opts);
        return live_reader.promiser();
    } catch(err) {
        return Promise.reject(err);
    }
};
Then use the above code like this:
var fs = require('fs');
var path = require('path');
var Promise = require('promise');
var liveReadAppendingFilePromiser = require('./path/to/liveReadAppendingFilePromiser');
var ending_str = '_THIS_IS_THE_END_';
var test_path = path.join('E:/tmp/test.txt');
var s_list = [];
var buffStringHandler = function(s) {
    s_list.push(s);
    var tmp = s_list.join('');
    if (-1 !== tmp.indexOf(ending_str)) {
        // if this return never occurs, then the file will be read until no_handler_resolution_timeout_ms
        // by default, no_handler_resolution_timeout_ms is null, so read will continue forever until this function returns something that evaluates to true
        return true;
        // you can also return a promise:
        // return Promise.resolve().then(function() { return true; } );
    }
};

var appender = fs.openSync(test_path, 'a');
try {
    var reader = fs.openSync(test_path, 'r');
    try {
        var options = {
            starting_position: 0,
            byte_size: 256,
            check_for_bytes_every_ms: 3000,
            no_handler_resolution_timeout_ms: 10000,
        };

        liveReadAppendingFilePromiser(reader, buffStringHandler, options)
            .then(function(did_reader_time_out) {
                console.log('reader timed out: ', did_reader_time_out);
                console.log(s_list.join(''));
            }).catch(function(err) {
                console.error('bad stuff: ', err);
            }).then(function() {
                fs.closeSync(appender);
                fs.closeSync(reader);
            });

        fs.write(appender, '\ncheck it out, I am a string');
        fs.write(appender, '\nwho killed kenny');
        //fs.write(appender, ending_str);
    } catch(err) {
        fs.closeSync(reader);
        console.log('err1');
        throw err;
    }
} catch(err) {
    fs.closeSync(appender);
    console.log('err2');
    throw err;
}

Adding [0] causes callback to be called twice?

I'm writing a utility for some Minecraft stuff, whatever... So, first of all, I have code that can extract specified files from an archive and give their content to a callback:
const unzip = require("unzip-stream");
const Volume = require("memfs").Volume;
const mfs = new Volume();
const fs = require("fs");

function getFile(archive, path, cb) {
    let called = false;
    fs.createReadStream(archive)
        .pipe(unzip.Parse())
        .on("entry", function(entity) {
            if (path.includes(entity.path)) {
                entity.pipe(mfs.createWriteStream("/" + path))
                    .on("close", function() {
                        mfs.readFile("/" + path, function(err, content) {
                            if (!called) cb(content);
                            called = true;
                            mfs.reset();
                        });
                    }).on("err", () => {});
            } else {
                entity.autodrain();
            }
        });
}

module.exports = { getFile };
It works perfectly when I test it in the interactive console:
require("./zip").getFile("minecraft-mod.jar", ["mcmod.info", "cccmod.info"], console.log); // <= Works fine! Calls callback ONCE!
When I started to develop the utility using this code, I discovered a VERY strange thing.
So I have filenames in a files array.
I'm using async/eachSeries to iterate over it. I have no final callback function - only the iterator one.
I have this code to parse .json files in mods:
let modinfo = Object.create(JSON.parse(content.replace(/(\r\n|\n|\r)/gm,"")));
It also works fine. But here comes magic...
So, .json files can contain an array or an object. If it's an array, we need to take its first element:
if (modinfo[0]) modinfo = modinfo[0];
It works.
But if it's an object, we need to take the first element of its modlist property:
else modinfo = modinfo.modlist[0];
And if modinfo was an object, boom - the callback now fires TWICE! WHAT?
But if I remove the [0] from the else branch:
else modinfo = modinfo.modlist; // <= No [0]
Callback will be called ONCE! ???
If I try to do something like this:
if (modinfo[0]) modinfo = modinfo[0];
else {
    const x = modinfo.modlist;
    modinfo = x[0];
}
Same thing happens...
Also, it's called without arguments.
I tried to investigate where the callback is called twice. I read the zip extractor code again... It has these lines:
This:
let called = false;
And those:
if (!called) cb(content);
called = true;
So, even if for some reason this condition fires two times:
if (path.includes(entity.path)) {
It should not call the callback a second time, right? No! Not only that, but if I try to
console.log(called);
It will log false two times!
NodeJS version: v8.0.0
Full code:
function startSignCheck() {
    clear();
    const files = fs.readdirSync("../mods");
    async.eachSeries(files, function(file, cb) {
        console.log("[>]", file);
        zip.getFile("../mods/" + file, ["mcmod.info", "cccmod.info"], function(content) {
            console.log(content);
            console.log(Buffer.isBuffer(content));
            if (content != undefined) content = content.toString();
            if (!content) return cb();
            let modinfo = Object.create(JSON.parse(content.replace(/(\r\n|\n|\r)/gm, "")));
            if (modinfo[0]) modinfo = modinfo[0];
            else modinfo = modinfo.modlist[0];
            //if (!modinfo.name) return cb();
            /*curse.searchMod(modinfo.name, modinfo.version, curse.versions[modinfo.mcversion], function(link) {
                if (!link) return cb();
                signature.generateMD5("../mods/" + file, function(localSignature) {
                    signature.URLgenerateMD5(link, function(curseSignature) {
                        if (localSignature === curseSignature) {
                            console.log(file, "- Signature is valid".green);
                        } else {
                            console.log(file.bgWhite.red + " - Signature is invalid".bgWhite.red);
                        }
                        cb();
                    });
                });
            });*/
        });
    });
}
Example contents of mcmod.info is:
{
    "modListVersion": 2,
    "modList": [{
        "modid": "journeymap",
        "name": "JourneyMap",
        "description": "JourneyMap Unlimited Edition: Real-time map in-game or in a web browser as you explore.",
        "version": "1.7.10-5.1.4p2",
        "mcversion": "1.7.10",
        "url": "http://journeymap.info",
        "updateUrl": "",
        "authorList": ["techbrew", "mysticdrew"],
        "logoFile": "assets/journeymap/web/img/ico/journeymap144.png",
        "screenshots": [],
        "dependants": [],
        "dependencies": ["Forge#[10.13.4.1558,)"],
        "requiredMods": ["Forge#[10.13.4.1558,)"],
        "useDependencyInformation": true
    }]
}
The problem was that I was using modlist instead of modList. Now it works! Thanks for the solution, Barmar.
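In other words, the object branch just needed the property name's case to match mcmod.info (a fragment of the code above):
if (modinfo[0]) modinfo = modinfo[0];
else modinfo = modinfo.modList[0]; // "modList" with a capital L, not "modlist"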

Alfresco JavaScript/Rhino multi-thread processing and concurrency

Let's consider two separate Alfresco Rhino-JavaScript tasks that compete to create the same folder:
var shared = companyhome.childByNamePath("shared");
var newFolderName = "folder-x";
var newFolder = shared.childByNamePath(newFolderName);
if (newFolder == null) {
    java.lang.Thread.sleep(10000); // remove this line in second thread
    newFolder = shared.createFolder(newFolderName);
    if (newFolder == null) {
        logger.error("error: " + newFolderName);
    } else {
        logger.info("success: " + newFolderName);
    }
} else {
    logger.info("already exists: " + newFolderName);
}
If we run the first script with sleep (10 sec) and the second script without sleep then:
the second script will create the folder "folder-x"
the first script will raise a "File or folder folder-x already exists" exception
Let's imagine a lot of competing threads trying to create random folders.
Is there something like semaphores or an atomic operation that blocks only the creation of the specified folder (non-blocking for other folders)?
Sorry, I get it, it was trivial...
The method createFolder() is atomic, and we just need to handle the exception if the folder was already created:
var getOrCreateFolder = function(parent, newFolderName) {
    var newFolder = parent.childByNamePath(newFolderName);
    if (newFolder == null) {
        try {
            java.lang.Thread.sleep(10000); // remove this line in second thread
            newFolder = parent.createFolder(newFolderName);
            return {folder: newFolder, isNew: true};
        } catch (e) {
            newFolder = parent.childByNamePath(newFolderName);
            if (newFolder != null) {
                return {folder: newFolder, isNew: false};
            } else {
                throw e;
            }
        }
    } else {
        return {folder: newFolder, isNew: false};
    }
};

var shared = companyhome.childByNamePath("shared");
var newFolderName = "folder-x";
var folderDto = getOrCreateFolder(shared, newFolderName);
if (folderDto.folder == null) {
    logger.error("error: " + newFolderName);
} else {
    logger.info("done: " + newFolderName + ", new: " + folderDto.isNew);
}

IndexedDB via Lawnchair gets larger with every save

I am using Lawnchair.js on a mobile app I am building at work, targeting iOS, Android, and Windows phones. My question is about a relatively simple function (see below) that reads data from an object and saves it in the IndexedDB database. It's about 4MB of data, and on the first go-round, when I inspect in Internet Explorer (via Internet Options), I can see the database is about 7MB. If I reload the page and re-run the same function with the same data, it increases to 14MB and then 20MB. I'm using the same keys, so my understanding is that this should just update the records, but it's almost as if it's inserting all new records every time. I have also seen similar behavior using Lawnchair on mobile Safari with the WebSQL adapter. Has anyone seen this before, or have any suggestions as to why this might be?
The following code is from a function I am using to populate the database.
populateDatabase: function(database, callback) {
    'use strict';
    var key;
    try {
        for (key in MasterData) {
            if (MasterData.hasOwnProperty(key)) {
                var itemInfo = DataConfig.checkForDataUpdates[DataConfig.keyMap[key]];
                database.save({key: itemInfo["name"], hash: itemInfo["version"], url: itemInfo["url"], data: MasterData[key]});
            }
        }
        callback(true);
    } catch(e) {
        callback(false);
    }
}
MasterData is the large data file, and itemInfo contains the key name, a hash that is later used to check an API for updates, and the relative URL of where to update from. After I create the database I pass it into this function, which then passes back true if the inserts are successful and false otherwise.
As previously mentioned, I have seen similar issues on iOS where calling database.save() was allocating a lot of memory but not releasing it, eventually causing a crash if it populated the database and then tried to update some records. Removing Lawnchair from the equation has kept it from crashing, but it is still allocating a lot of memory when saving data. I'm not sure if this is normal for persistent storage on mobile devices, a bug in Lawnchair, or me being a noob and doing something terribly wrong, but I could use some pointers on this, as well as on why IndexedDB just keeps getting larger and larger on every save (at least during initial testing in IE10).
EDIT: Source Code for indexed-db adapter is here:
https://github.com/brianleroux/lawnchair/blob/master/src/adapters/indexed-db.js
and here is the code for the save function I am using:
save: function(obj, callback) {
    var self = this;

    if (!this.store) {
        this.waiting.push(function() {
            this.save(obj, callback);
        });
        return;
    }

    var objs = (this.isArray(obj) ? obj : [obj]).map(function(o) { if (!o.key) { o.key = self.uuid() } return o })

    var win = function (e) {
        if (callback) { self.lambda(callback).call(self, self.isArray(obj) ? objs : objs[0]) }
    };

    var trans = this.db.transaction(this.record, READ_WRITE);
    var store = trans.objectStore(this.record);

    for (var i = 0; i < objs.length; i++) {
        var o = objs[i];
        store.put(o, o.key);
    }

    store.transaction.oncomplete = win;
    store.transaction.onabort = fail;

    return this;
},
When creating a new instance, Lawnchair uses the init function from the indexed-db adapter, which is the following.
init: function(options, callback) {
    this.idb = getIDB();
    this.waiting = [];
    this.useAutoIncrement = useAutoIncrement();
    var request = this.idb.open(this.name, STORE_VERSION);
    var self = this;
    var cb = self.fn(self.name, callback);
    if (cb && typeof cb != 'function') throw 'callback not valid';

    var win = function() {
        // manually clean up event handlers on request; this helps on chrome
        request.onupgradeneeded = request.onsuccess = request.error = null;
        if (cb) return cb.call(self, self);
    };

    var upgrade = function(from, to) {
        // don't try to migrate dbs, just recreate
        try {
            self.db.deleteObjectStore('teststore'); // old adapter
        } catch (e1) { /* ignore */ }
        try {
            self.db.deleteObjectStore(self.record);
        } catch (e2) { /* ignore */ }

        // ok, create object store.
        var params = {};
        if (self.useAutoIncrement) { params.autoIncrement = true; }
        self.db.createObjectStore(self.record, params);
        self.store = true;
    };

    request.onupgradeneeded = function(event) {
        self.db = request.result;
        self.transaction = request.transaction;
        upgrade(event.oldVersion, event.newVersion);
        // will end up in onsuccess callback
    };

    request.onsuccess = function(event) {
        self.db = event.target.result;

        if (self.db.version != ('' + STORE_VERSION)) {
            // DEPRECATED API: modern implementations will fire the
            // upgradeneeded event instead.
            var oldVersion = self.db.version;
            var setVrequest = self.db.setVersion('' + STORE_VERSION);
            // onsuccess is the only place we can create Object Stores
            setVrequest.onsuccess = function(event) {
                var transaction = setVrequest.result;
                setVrequest.onsuccess = setVrequest.onerror = null;
                // can't upgrade w/o versionchange transaction.
                upgrade(oldVersion, STORE_VERSION);
                transaction.oncomplete = function() {
                    for (var i = 0; i < self.waiting.length; i++) {
                        self.waiting[i].call(self);
                    }
                    self.waiting = [];
                    win();
                };
            };
            setVrequest.onerror = function(e) {
                setVrequest.onsuccess = setVrequest.onerror = null;
                console.error("Failed to create objectstore " + e);
                fail(e);
            };
        } else {
            self.store = true;
            for (var i = 0; i < self.waiting.length; i++) {
                self.waiting[i].call(self);
            }
            self.waiting = [];
            win();
        }
    }

    request.onerror = function(ev) {
        if (request.errorCode === getIDBDatabaseException().VERSION_ERR) {
            // xxx blow it away
            self.idb.deleteDatabase(self.name);
            // try it again.
            return self.init(options, callback);
        }
        console.error('Failed to open database');
    };
},
I think you keep adding data instead of updating the existing data.
Can you provide some more information about the configuration of the store? Are you using an inline or external key? If it's inline, what is the keyPath?
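To illustrate the suspicion: with an out-of-line autoIncrement key, put() without an explicit key always appends a new record, and it only overwrites when the same key is supplied. The database and store names below are made up for the example:
// Hypothetical illustration of put() semantics; names are invented for the demo.
var open = indexedDB.open('demo-db', 1);
open.onupgradeneeded = function () {
    // autoIncrement store with out-of-line keys
    open.result.createObjectStore('items', { autoIncrement: true });
};
open.onsuccess = function () {
    var db = open.result;
    var store = db.transaction('items', 'readwrite').objectStore('items');
    store.put({ name: 'master-data' });           // no key given: a new record on every call
    store.put({ name: 'master-data' }, 'master'); // same explicit key: overwrites in place
};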

Uglify JS - compressing unused variables

Uglify has a "compression" option that can remove unused variables...
However, if I stored some functions in an object like this....
helpers = {
    doSomething: function () { ... },
    doSomethingElse: function () { ... }
}
... is there a way to remove helpers.doSomething() if it's never accessed?
Guess I want to give the compressor permission to change my object.
Any ideas if it's possible? Or any other tools that can help?
Using a static analyzer like Uglify2 or Esprima to accomplish this task is somewhat nontrivial, because there are lots of ways a function can end up being called that are difficult to determine statically. To show the complexity, there's this website:
http://sevinf.github.io/blog/2012/09/29/esprima-tutorial/
It attempts to at least identify unused functions. However, the code as provided on that website will not work against your example, because it is looking for FunctionDeclarations and not FunctionExpressions. It is also looking for CallExpressions whose callees are Identifiers while ignoring CallExpressions whose callees are MemberExpressions, as your example uses. There's also a problem of scope: it doesn't take into account functions in different scopes with the same name - perfectly legal JavaScript - so you lose fidelity using that code, as it'll miss some unused functions by thinking they were called when they were not.
To handle the scope problem, you might be able to employ ESTR (https://github.com/clausreinke/estr), to help figure out the scope of the variables and from there the unused functions. Then you'll need to use something like escodegen to remove the unused functions.
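To illustrate that last step, here is a rough sketch (not from the linked article) of how escodegen can regenerate source after a property is dropped from the AST; the removal of doSomething is hard-coded here, standing in for whatever the call analysis decides:
// Sketch: parse, drop one property from the object literal, regenerate the source.
var esprima = require('esprima');
var escodegen = require('escodegen');

var src = 'helpers = { doSomething: function () {}, doSomethingElse: function () {} };';
var ast = esprima.parse(src);

// The object literal lives at: ExpressionStatement > AssignmentExpression > right
var objExpr = ast.body[0].expression.right;
objExpr.properties = objExpr.properties.filter(function (prop) {
    return prop.key.name !== 'doSomething'; // pretend the analysis flagged this one as unused
});

console.log(escodegen.generate(ast)); // prints the helpers object with doSomething removed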
As a starting point for you, I've adapted the code from that website to work for the very specific situation you provided, but be forewarned: it will have the scope issue.
This is written for Node.js, so you'll need to get esprima with npm to use the example as provided, and of course execute it with node.
var fs = require('fs');
var esprima = require('esprima');

if (process.argv.length < 3) {
    console.log('Usage: node ' + process.argv[1] + ' <filename>');
    process.exit(1);
}

notifydeadcode = function(data) {

    function traverse(node, func) {
        func(node);
        for (var key in node) {
            if (node.hasOwnProperty(key)) {
                var child = node[key];
                if (typeof child === 'object' && child !== null) {
                    if (Array.isArray(child)) {
                        child.forEach(function(node) {
                            traverse(node, func);
                        });
                    } else {
                        traverse(child, func);
                    }
                }
            }
        }
    }

    function analyzeCode(code) {
        var ast = esprima.parse(code);
        var functionsStats = {};
        var addStatsEntry = function(funcName) {
            if (!functionsStats[funcName]) {
                functionsStats[funcName] = {calls: 0, declarations: 0};
            }
        };
        var pnode = null;
        traverse(ast, function(node) {
            if (node.type === 'FunctionExpression') {
                if (pnode.type == 'Identifier') {
                    var expr = pnode.name;
                    addStatsEntry(expr);
                    functionsStats[expr].declarations++;
                }
            } else if (node.type === 'FunctionDeclaration') {
                addStatsEntry(node.id.name);
                functionsStats[node.id.name].declarations++;
            } else if (node.type === 'CallExpression' && node.callee.type === 'Identifier') {
                addStatsEntry(node.callee.name);
                functionsStats[node.callee.name].calls++;
            } else if (node.type === 'CallExpression' && node.callee.type === 'MemberExpression') {
                var lexpr = node.callee.property.name;
                addStatsEntry(lexpr);
                functionsStats[lexpr].calls++;
            }
            pnode = node;
        });
        processResults(functionsStats);
    }

    function processResults(results) {
        //console.log(JSON.stringify(results));
        for (var name in results) {
            if (results.hasOwnProperty(name)) {
                var stats = results[name];
                if (stats.declarations === 0) {
                    console.log('Function', name, 'undeclared');
                } else if (stats.declarations > 1) {
                    console.log('Function', name, 'declared multiple times');
                } else if (stats.calls === 0) {
                    console.log('Function', name, 'declared but not called');
                }
            }
        }
    }

    analyzeCode(data);
}

// Read the file and print its contents.
var filename = process.argv[2];
fs.readFile(filename, 'utf8', function(err, data) {
    if (err) throw err;
    console.log('OK: ' + filename);
    notifydeadcode(data);
});
So if you plop that in a file like deadfunc.js and then call it like so:
node deadfunc.js test.js
where test.js contains:
helpers = {
    doSomething: function() { },
    doSomethingElse: function() { }
};
helpers.doSomethingElse();
You will get the output:
OK: test.js
Function doSomething declared but not called
One last thing to note: attempting to find unused variables and functions can be a rabbit hole, because you have situations like eval and Functions created from strings. You also have to think about apply and call, etc. Which is why, I assume, we don't have this capability in the static analyzers today.
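For example, given the helpers object above, none of these invocations show up as a plain named call to doSomething in the AST:
helpers["doSome" + "thing"]();             // computed member access
eval("helpers.doSomething()");             // call assembled from a string
var f = helpers.doSomething; f.call(null); // aliased, then invoked via call()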

Create Firefox Addon to Watch and modify XHR requests & reponses

Update: I guess the subject gave the wrong notion that I'm looking for an existing add-on. This is a custom problem and I do NOT want an existing solution.
I wish to WRITE (or, more appropriately, modify an existing) add-on.
Here's my requirement:
I want my addon to work for a particular site only
The data on the pages are encoded using a 2 way hash
A good deal of info is loaded by XHR requests, and sometimes displayed in animated bubbles etc.
The current version of my addon parses the page via XPath expressions, decodes the data, and replaces them
The issue comes in with those bubblified boxes that are displayed on mouse-over event
Thus, I realized that it might be a good idea to create an XHR bridge that could listen to all the data and decode/encode on the fly
After a couple of searches, I came across nsITraceableInterface [1][2][3]
Just wanted to know if I am on the correct path. If "yes", then kindly provide any extra pointers and suggestions that may be appropriate; and if "No", then.. well, please help with correct pointers :)
Thanks,
Bipin.
[1]. https://developer.mozilla.org/en/NsITraceableChannel
[2]. http://www.softwareishard.com/blog/firebug/nsitraceablechannel-intercept-http-traffic/
[3]. http://www.ashita.org/howto-xhr-listening-by-a-firefox-addon/
nsITraceableChannel is indeed the way to go here. The blog posts by Jan Odvarko (softwareishard.com) and myself (ashita.org) show how to do this. You may also want to see http://www.ashita.org/implementing-an-xpcom-firefox-interface-and-creating-observers/, however it isn't really necessary to do this in an XPCOM component.
The steps are basically:
Create Object prototype implementing nsITraceableChannel; and create observer to listen to http-on-modify-request and http-on-examine-response
register observer
the observer listening to the two request types adds our nsITraceableChannel object into the chain of listeners and makes sure that our nsITC knows who is next in the chain
nsITC object provides three callbacks and each will be called at the appropriate stage: onStartRequest, onDataAvailable, and onStopRequest
in each of the callbacks above, our nsITC object must pass on the data to the next item in the chain
Below is actual code from a site-specific add-on I wrote that behaves very similarly to yours from what I can tell.
function TracingListener() {
    //this.receivedData = [];
}

TracingListener.prototype =
{
    originalListener: null,
    receivedData: null, // array for incoming data.

    onDataAvailable: function(request, context, inputStream, offset, count)
    {
        var binaryInputStream = CCIN("@mozilla.org/binaryinputstream;1", "nsIBinaryInputStream");
        var storageStream = CCIN("@mozilla.org/storagestream;1", "nsIStorageStream");
        binaryInputStream.setInputStream(inputStream);
        storageStream.init(8192, count, null);

        var binaryOutputStream = CCIN("@mozilla.org/binaryoutputstream;1",
            "nsIBinaryOutputStream");
        binaryOutputStream.setOutputStream(storageStream.getOutputStream(0));

        // Copy received data as they come.
        var data = binaryInputStream.readBytes(count);
        //var data = inputStream.readBytes(count);
        this.receivedData.push(data);

        binaryOutputStream.writeBytes(data, count);

        this.originalListener.onDataAvailable(request, context, storageStream.newInputStream(0), offset, count);
    },

    onStartRequest: function(request, context) {
        this.receivedData = [];
        this.originalListener.onStartRequest(request, context);
    },

    onStopRequest: function(request, context, statusCode)
    {
        try
        {
            request.QueryInterface(Ci.nsIHttpChannel);

            if (request.originalURI && piratequesting.baseURL == request.originalURI.prePath && request.originalURI.path.indexOf("/index.php?ajax=") == 0)
            {
                var data = null;
                if (request.requestMethod.toLowerCase() == "post")
                {
                    var postText = this.readPostTextFromRequest(request, context);
                    if (postText)
                        data = ((String)(postText)).parseQuery();
                }

                var date = Date.parse(request.getResponseHeader("Date"));
                var responseSource = this.receivedData.join('');

                //fix leading spaces bug
                responseSource = responseSource.replace(/^\s+(\S[\s\S]+)/, "$1");

                piratequesting.ProcessRawResponse(request.originalURI.spec, responseSource, date, data);
            }
        }
        catch (e)
        {
            dumpError(e);
        }

        this.originalListener.onStopRequest(request, context, statusCode);
    },

    QueryInterface: function (aIID) {
        if (aIID.equals(Ci.nsIStreamListener) ||
            aIID.equals(Ci.nsISupports)) {
            return this;
        }
        throw Components.results.NS_NOINTERFACE;
    },

    readPostTextFromRequest : function(request, context) {
        try
        {
            var is = request.QueryInterface(Ci.nsIUploadChannel).uploadStream;
            if (is)
            {
                var ss = is.QueryInterface(Ci.nsISeekableStream);
                var prevOffset;
                if (ss)
                {
                    prevOffset = ss.tell();
                    ss.seek(Ci.nsISeekableStream.NS_SEEK_SET, 0);
                }

                // Read data from the stream..
                var charset = "UTF-8";
                var text = this.readFromStream(is, charset, true);

                // Seek locks the file, so seek to the beginning only if necko hasn't read it yet,
                // since necko doesn't seek to 0 before reading (at least not till 459384 is fixed).
                if (ss && prevOffset == 0)
                    ss.seek(Ci.nsISeekableStream.NS_SEEK_SET, 0);

                return text;
            }
            else {
                dump("Failed to Query Interface for upload stream.\n");
            }
        }
        catch(exc)
        {
            dumpError(exc);
        }

        return null;
    },

    readFromStream : function(stream, charset, noClose) {
        var sis = CCSV("@mozilla.org/binaryinputstream;1", "nsIBinaryInputStream");
        sis.setInputStream(stream);

        var segments = [];
        for (var count = stream.available(); count; count = stream.available())
            segments.push(sis.readBytes(count));

        if (!noClose)
            sis.close();

        var text = segments.join("");
        return text;
    }
}

hRO = {

    observe: function(request, aTopic, aData) {
        try {
            if (typeof Cc == "undefined") {
                var Cc = Components.classes;
            }
            if (typeof Ci == "undefined") {
                var Ci = Components.interfaces;
            }

            if (aTopic == "http-on-examine-response") {
                request.QueryInterface(Ci.nsIHttpChannel);

                if (request.originalURI && piratequesting.baseURL == request.originalURI.prePath && request.originalURI.path.indexOf("/index.php?ajax=") == 0) {
                    var newListener = new TracingListener();
                    request.QueryInterface(Ci.nsITraceableChannel);
                    newListener.originalListener = request.setNewListener(newListener);
                }
            }
        } catch (e) {
            dump("\nhRO error: \n\tMessage: " + e.message + "\n\tFile: " + e.fileName + " line: " + e.lineNumber + "\n");
        }
    },

    QueryInterface: function(aIID) {
        if (typeof Cc == "undefined") {
            var Cc = Components.classes;
        }
        if (typeof Ci == "undefined") {
            var Ci = Components.interfaces;
        }

        if (aIID.equals(Ci.nsIObserver) ||
            aIID.equals(Ci.nsISupports)) {
            return this;
        }
        throw Components.results.NS_NOINTERFACE;
    },
};

var observerService = Cc["@mozilla.org/observer-service;1"]
    .getService(Ci.nsIObserverService);

observerService.addObserver(hRO,
    "http-on-examine-response", false);
In the above code, originalListener is the listener we are inserting ourselves before in the chain. It is vital that you keep that reference when creating the TracingListener and pass on the data in all three callbacks; otherwise nothing will work (pages won't even load - Firefox itself is last in the chain).
Note: there are some functions called in the code above which are part of the piratequesting add-on, e.g.: parseQuery() and dumpError()
Tamper Data Add-on. See also the How to Use it page
You could try making a Greasemonkey script and overwriting the XMLHttpRequest.
The code would look something like:
function request() {
};

request.prototype.open = function (type, path, block) {
    GM_xmlhttpRequest({
        method: type,
        url: path,
        onload: function (response) {
            // some code here
        }
    });
};

unsafeWindow.XMLHttpRequest = request;
Also note that you can turn a GM script into an addon for Firefox.
