Node.js : scraping with Request and Cheerio issue - javascript

First off: I'm new to Node, and a relative programming beginner.
I'm trying to create a small web app with Express, whose only goal is to fetch and reformat data from a website that doesn't have an open API.
To do so, I've decided to learn about scraping, and that brought me to Cheerio and Request.
I'm using reddit as an example to learn on. The end goal in this example is to gather the name and href of the posts on the front page, as well as the URL leading to the comments, and then to visit that page to scrape the number of comments.
What follows is the route that is called on a GET request to / (please excuse the variable names and the comments/console.logs, I got frustrated):
/*
 * GET home page.
 */
exports.index = function(req, res){
    var request = require('request')
      , cheerio = require('cheerio')
      , mainArr = []
      , test = "test"
      , uI
      , commentURL;
    function first() {
        request("http://www.reddit.com", function(err, resp, body) {
            if (!err && resp.statusCode == 200) {
                var $ = cheerio.load(body);
                $('.thing', '#siteTable').each(function(){
                    var url = $('a.title', this).attr('href')
                      , title = $('a.title', this).html()
                      , commentsLink = $('a.comments', this).attr('href')
                      , arr = [];
                    arr.push(title);
                    arr.push(url);
                    arr.push(commentsLink);
                    mainArr.push(arr);
                });
                second();
            };
        });
    }
    function second() {
        for (i = mainArr.length - 1; i >= 0; i--) {
            uI = mainArr[i].length - 1;
            commentURL = mainArr[i][uI];
            console.log(commentURL + ", " + uI + ", " + i);
            var foo = commentURL;
            request(foo, function(err, resp, body) {
                console.log("what the shit");
                // var $ = cheerio.load(body);
                // console.log(mainArr.length + ", " + commentURL + ", " + i + ", " + uI);
                // var test = $('span.title', 'div.content').html();
                console.log(test + ", "+ foo + ", " + commentURL + ", " + i + ", " + uI);
                // mainArr[1][2] = test;
            });
        };
        if (i<=0) {
            res.render('index', {title: test});
        };
    }
    first();
};
The function first(); works as intended. It puts the title, the href and url to the comments in an array, then pushes that array in a master array containing those data points for all of the posts on the front page. It then calls the function second();
Said function's goal is to loop through the master array (mainArr[]), then select all of the urls leading to comments (mainArr[i][uI]) and launch a request() with that url as first parameter.
The loop works, but during the second call of request() inside the second() function, everything breaks down. The variable i gets set permanently to -1, and commentURL (the variable that is set to the URL of the comments of the current post) is permanently defined as the first URL in mainArr[]. There is also weird behavior with mainArr.length: depending on where I place it, it tells me that mainArr is undefined.
I have a feeling that I'm missing something obvious (probably to do with asynchronicity), but for the life of me, I can't find it.
I would be really grateful for any suggestions!

Your guess is correct: it's the infamous "JavaScript loop gotcha". The callbacks passed to request() only run after the for loop has finished, so every one of them sees the final values of i, uI, commentURL and foo. See here, for example, for an explanation:
Javascript infamous Loop issue?
Besides that, it seems that only your debug prints are affected. The commented-out code regarding var test ought to work.
Finally, that kind of language is frowned upon on SO; you would do well to take two minutes and change your variable names in this post.
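To make the loop gotcha concrete, here is a minimal sketch that needs no network access. fakeRequest is a hypothetical stand-in for request(): it only queues its callback, much as a real HTTP callback stays queued until the loop has finished. The URLs are made up.

```javascript
// Hypothetical stand-in for request(): it queues the callback instead of
// calling it right away, just as a real HTTP callback fires later.
var pending = [];
function fakeRequest(url, callback) {
  pending.push(callback);
}

var urls = ["/comments/a", "/comments/b", "/comments/c"];

// Broken: every callback shares the same `i` and `url` variables.
var broken = [];
for (var i = urls.length - 1; i >= 0; i--) {
  var url = urls[i];
  fakeRequest(url, function () {
    broken.push(i + ":" + url);
  });
}

// Fixed: forEach calls its function once per element, so each callback
// closes over its own `index` and `fixedUrl`.
var fixed = [];
urls.forEach(function (fixedUrl, index) {
  fakeRequest(fixedUrl, function () {
    fixed.push(index + ":" + fixedUrl);
  });
});

// Only now, after both loops have finished, do the callbacks run.
pending.forEach(function (callback) { callback(); });

console.log(broken); // three copies of "-1:/comments/a"
console.log(fixed);  // "0:/comments/a", "1:/comments/b", "2:/comments/c"
```

The broken half reproduces the symptoms from the question: i is stuck at -1 and the URL is always the first entry. Note that in the question's code res.render() also sits directly after the loop, so it probably runs before any comment page has been fetched; counting completed callbacks before rendering would be one way to handle that.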

Related

How to pass sql query with . in name - Javascript/Node.JS - Alexa App

Good day,
I am trying to have Alexa say the results of a SOQL query, but I receive an error every time I try to include owner.name in the output:
this.t("CASEINFO",resp.records[0]._fields.casenumber, resp.records[0]._fields.subject,resp.records[0]._fields.priority,resp.records[0]._fields.owner.name);
I believe this is because the field has a "." in its name, but I am not sure how to escape the period so that it is read correctly.
Note that if I leave "resp.records[0]._fields.owner.name" out of the script, everything works without issue, so I know this is the cause of the error.
This is what I have tried:
1) this.t("CASEINFO",resp.records[0]._fields.casenumber, resp.records[0]._fields.subject,resp.records[0]._fields.priority,resp.records[0]._fields.[owner.name]);
2) this.t("CASEINFO",resp.records[0]._fields.casenumber, resp.records[0]._fields.subject,resp.records[0]._fields.priority,resp.records[0]._fields.owner//.name);
3) Putting it into a var (var casenumber = owner.name) and then using casenumber in the query.
Any help would be appreciated.
'CaseInformation': function () {
    console.log("CaseInformation function");
    if (preFunctions.call(this)) {
        //const OwnerName = getSlotValue(this.event.request.intent.slots.caseowner_name.value);
        var CaseInfo = this.event.request.intent.slots.case_info.value;
        console.log(`CaseInfo: ${CaseInfo}`);
        const accessToken = this.event.session.user.accessToken;
        sf.query("select casenumber,subject, owner.name, status, priority, account.name, lastmodifieddate from case where casenumber='" + CaseInfo + "'", accessToken, (err, resp) => {
            if (resp.records!="") {
                if (resp.records) {
                    const output = this.t("CASEINFO",resp.records[0]._fields.casenumber, resp.records[0]._fields.subject,resp.records[0]._fields.priority,resp.records[0]._fields.owner.name);
                    this.emit(":ask", output, this.t("PROMPT"));
Square bracket notation:
this.t("CASEINFO",resp.records[0]._fields.casenumber, resp.records[0]._fields.subject,resp.records[0]._fields.priority,resp.records[0]._fields["owner.name"]);
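A minimal sketch of why the bracket form is needed, assuming the record stores the value under a single key that is literally named "owner.name". The fields object below is made-up sample data, not the actual response shape.

```javascript
// Made-up sample data: the key is the literal string "owner.name".
var fields = {
  casenumber: "00001234",
  "owner.name": "Jane Doe"
};

// Bracket notation looks up the key "owner.name" as one string.
var viaBrackets = fields["owner.name"];

// Dot notation first reads `fields.owner`, which does not exist here,
// so reading `.name` from it throws a TypeError.
var dotError = null;
try {
  fields.owner.name;
} catch (e) {
  dotError = e;
}

console.log(viaBrackets);                   // Jane Doe
console.log(dotError instanceof TypeError); // true
```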

eventDrop, oldResource and newResource in fullcalendar scheduler v4

I'm trying to eventually create a php script to update mysql. As I go step by step and try to set the variables and have js behave correctly, I'm having difficulty with changing the 'resource' rooms in the scheduler.
If I change the location of the event and change the room, I get a JS alert saying "Title was dropped on to [new time] from [old time]. Old room [oldresource]. New Room [newresource]". That's working well.
However, if I move the event to a location on the same day, I get errors - because info.oldResource and info.newResource are only available IF the event has moved to a NEW resource. If they're moving within the same resource, the object doesn't exist.
I tried wrapping the code in an if statement. The console log shows null, but my if statement does not stop the rest of the code from running. This will eventually (I think) result in the commands not being run correctly once I pass them to PHP.
I plan on having an 'else' branch that leaves out the oldResource and newResource dialogue, to process changes that stay within the same resource.
My code is as follows:
eventDrop: function(info) {
    if (typeof info.oldResource !=="null") {
        var oll = console.log(info.oldResource);
        var nww = console.log(info.newResource);
        var oldroom = (info.oldResource.id);
        var newroom = (info.newResource.id);
        var newtitle = info.event.title;
        /*if (oldroom = newroom) {alert ("Same Room");}*/
        alert(newtitle + " was dropped on " + info.event.start.toString() + ".Old Time: " + info.oldEvent.start.toString() + ". From: " + oldroom + ". To: " + newroom );
        if (!confirm("Are you sure about this change?")) {
            info.revert();
        }
    }
},
You don't need typeof here. Just write
if (info.oldResource != null)
instead.
P.S. typeof applied to a value that is null returns "object", not "null", so the comparison typeof info.oldResource !== "null" is always true and the block always runs. The != null check is also false when the property is undefined, which covers the case where it is missing entirely.
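The difference can be checked with a small sketch. The two info objects are hypothetical stand-ins for the argument FullCalendar passes to eventDrop, and describeDrop is a made-up helper name.

```javascript
// typeof never returns "null"; for a null value it returns "object".
var typeOfNull = typeof null;

// Hypothetical stand-ins for the `info` argument of eventDrop.
var sameResourceDrop = { oldResource: null, newResource: null };
var movedDrop = { oldResource: { id: "roomA" }, newResource: { id: "roomB" } };

function describeDrop(info) {
  // `!= null` is false for both null and undefined.
  if (info.oldResource != null) {
    return "moved from " + info.oldResource.id + " to " + info.newResource.id;
  }
  return "stayed in the same resource";
}

console.log(typeOfNull);                     // object
console.log(describeDrop(sameResourceDrop)); // stayed in the same resource
console.log(describeDrop(movedDrop));        // moved from roomA to roomB
```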

I can't read sqlite data in coffescript

Hello, I am trying to read something from my SQLite database in CoffeeScript. When I wrote it in JS it worked well, but now I have a problem.
I am new to CoffeeScript and I am wondering what I am doing wrong. Any tips, guys? :)
CoffeeScript:
app.get('/indeks',
(req, res)->
tab = []
i = 0
db = new sqlite3.Database("xxx.sqlite3")
tab = []
i=0
console.log("I am before dbHandler")
db.each("SELECT yyy FROM zzz", #dbHandler, #dbFinal
dbHandler:(err, row)->
console.log("I am in handler dbHandler")
if err
console.log("Error: " + err)
else
tab.push(row)
console.log(row)
dbFinal:()->
console.log("I am in dbFinal")
console.log("Final: " + tab)
console.log("Response")
res.send(tab)
db.close()
)
)
Now code in JS:
app.get('/indeks', function (req, res, next) {
    var db = new sqlite3.Database("xxx");
    var tab = new Array();
    var i=0;
    function dbHandler(err, row){
        if (err) {
            console.log("Error: " + err);
        } else {
            tab.push(row);
            console.log(row);
        }
    }
    function dbFinal(){
        console.log("Final: " + tab);
        console.log("Response");
        res.send(tab);
    }
    db.each("SELECT zzz FROM yyy", dbHandler, dbFinal);
    db.close();
});
Did you look at the transpiled CoffeeScript code? Writing something like dbHandler:(err, row)-> generates an object literal with a property named dbHandler. This is why you cannot pass dbHandler and dbFinal to the db.each call this way; that syntax only works when defining a class.
Additionally, you have an unmatched bracket on line 10 and one bracket too many in the last two lines.
You should always check the compiled code (or at least whether it compiles at all). Here is a helpful site for this. There, you can even convert your JS code to CoffeeScript.
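As a rough JavaScript sketch of the point about object literals (this is not actual compiler output): the name: (args)-> syntax inside a call builds one object with two properties, so the callee receives a single object instead of two callbacks. The each function below is a hypothetical stand-in that only reports what it was given; it is not the sqlite3 API.

```javascript
// Hypothetical stand-in for db.each: reports the type of each argument
// that follows the SQL string.
function each(sql) {
  return Array.prototype.slice.call(arguments, 1).map(function (arg) {
    return typeof arg;
  });
}

function dbHandler(err, row) {}
function dbFinal() {}

// Roughly what the CoffeeScript in the question amounts to:
// one object argument holding both functions.
var asObject = each("SELECT yyy FROM zzz", { dbHandler: dbHandler, dbFinal: dbFinal });

// What the working JavaScript version passes: two function arguments.
var asFunctions = each("SELECT yyy FROM zzz", dbHandler, dbFinal);

console.log(asObject);    // [ 'object' ]
console.log(asFunctions); // [ 'function', 'function' ]
```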

log object in log4javascript

I want to log objects using log4javascript. For example consider the following code:
function LogObject() {
    var blah = {
        one: 42,
        two: "486"
    };
    logger.info(blah);
}
Assuming that logger is an instance of a log4javascript logger that is properly set up:
var logger = log4javascript.getLogger("InternalLogger");
var ajaxAppender = new log4javascript.AjaxAppender(url),
jsonLayout = new log4javascript.JsonLayout(false, false);
ajaxAppender.setLayout(jsonLayout);
ajaxAppender.addHeader("Content-Type", "application/json");
logger.addAppender(ajaxAppender);
I expect the following result: the request payload contains an array of messages, the first of which is my object serialized as JSON. What I see instead is an array of messages whose first entry is the string "[object Object]" (as if the toString() method had been invoked). How can I get the object itself serialized?
JsonLayout formats the logging event (which includes log level, timestamp and logger name in addition to the log message(s)) as JSON rather than the log message, which is pretty much assumed to be a string. The reason for this is to avoid a dependency on a JSON library for older browsers; generating JSON for the simple, known data that JsonLayout deals with is no problem without a JSON library but handling arbitrary objects definitely requires one.
The workaround I'd suggest is simply to format the message before you pass it to the logging call:
logger.info( JSON.stringify(blah) );
We were following @Tim Down's suggestion
logger.info( JSON.stringify(blah) );
But we had performance issues since the JSON.stringify happens before logger.info is called, therefore it will always happen even if the logging level is set to ignore this log.
To work around this, I wrote a new lazy layout so that the stringification only happens if the log is actually output. To be more flexible, it also allows passing a function, in which case it outputs the result of running that function.
Usage:
logger.trace("Received ", widget, " which has ", () => countFrimbles(widget), ' frimbles');
Implementation:
function LazyFormatLayout() { }
LazyFormatLayout.prototype = new log4javascript.Layout();
LazyFormatLayout.prototype.format = function (loggingEvent) {
    var time = loggingEvent.timeStamp.toTimeString().split(/\s/)[0];
    var head = time + ' ' + loggingEvent.logger.name + ' [' + loggingEvent.level.name + '] - ';
    var body = loggingEvent.messages.map(function (arg) {
        try {
            switch (typeof (arg)) {
                case 'function':
                    return arg();
                case 'object':
                    return JSON.stringify(arg);
            }
        }
        catch (e) {
            return '<<error while logging: ' + e.stack + '>>';
        }
        return arg;
    }).join('');
    if (!loggingEvent.exception)
        return head + body;
    return head + body + ' ==> Exception: ' + loggingEvent.exception.stack;
};
LazyFormatLayout.prototype.ignoresThrowable = function () { return false; };
LazyFormatLayout.prototype.toString = function () { return "LazyFormatLayout"; };
The question is somewhat dated, but a simple Google search turned it up, and there seems to be a built-in way to log objects:
var log = log4javascript.getDefaultLogger();
log.info("log following object",{ data:5, text:"bla" });
output
12:49:43 INFO - log following object {
data: 5,
text: bla
}

Running a javascript function based on statechange

Following the answer to this Stack Overflow question, I am trying to run the following code. However, myfunction takes only one Google Visualization event. Is the following code valid? If not, how can I handle multiple statechange events from Google Visualization controls in a single function?
var categoryPicker1, categoryPicker2;
function drawVisualization() {
    // etc.
    categoryPicker1 = // etc...
    categoryPicker2 = // etc...
    // Register to hear state changes.
    google.visualization.events.addListener(categoryPicker1, 'statechange', myfunction);
    google.visualization.events.addListener(categoryPicker2, 'statechange', myfunction);
    // etc.
}
function myfunction() {
    var whereClauses = [];
    if (categorypicker1) {
        whereClauses.push("something1 = '" + document.getElementsByClassName('goog-inline-block goog-menu-button-caption')[0].innerHTML + "'")
    }
    if (categorypicker2) {
        whereClauses.push("something2 = '" + document.getElementsByClassName('goog-inline-block goog-menu-button-caption')[1].innerHTML + "'")
    }
    whereClause = whereClauses.join(" AND ");
    // do something....
}
Not really clear from your question, but I assume you're building the SQL query to your database from the selected items in the CategoryPicker. Despite being an EXTREMELY bad/dangerous thing to do (building SQL client side, and sending it to a server), this should be possible by just grabbing the selectedItems from your CategoryPicker, and joining them with " AND ". Like:
values = categoryPicker1.getState().selectedValues;
values = values.concat(categoryPicker2.getState().selectedValues);
var args = values.map(function(_) { return "'" + _ + "'"; });
console.log(args.join(" AND "));
I wouldn't do this if I were you. I would pass the arguments up to the server, and remap them there (after appropriately filtering them, etc). Again this is very dangerous.
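One way to follow this advice, sketched under assumptions: send only the raw selected values and let the server decide how they map onto the query. The parameter names something1 and something2 are borrowed from the question, the /data path is a placeholder, and the two arrays stand in for what getState().selectedValues would return.

```javascript
// Stand-ins for categoryPicker1.getState().selectedValues and
// categoryPicker2.getState().selectedValues.
var selected1 = ["Books"];
var selected2 = ["O'Reilly & Sons"];

// Build a query string of plain values, with no SQL in it.
// URLSearchParams encodes quotes, ampersands and spaces for us.
function buildQuery(values1, values2) {
  var params = new URLSearchParams();
  values1.forEach(function (v) { params.append("something1", v); });
  values2.forEach(function (v) { params.append("something2", v); });
  return params.toString();
}

var query = buildQuery(selected1, selected2);
console.log("/data?" + query);
```

The server would then read something1 and something2 as ordinary request parameters, validate them, and bind them into a parameterized query rather than concatenating them into SQL text.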
