Dynamic web page scraping with npm cheerio and request - javascript

I'm trying to scrape data from a site which has a base url and then dynamic routes. This particular site simply uses numbers, so I have this code to get the data:
for (var i = 1; i <= total; i++) {
var temp = base_url + i;
var result = "";
request(temp, function(error, response, body) {
var $ = cheerio.load(body);
var address_string = 'http://maps.google.com/?q=' + $('title').text();
//firebase
database.ref('events/' + i).set({
"address": address_string
});
});
}
However, the above code doesn't work, and doesn't add anything to the database. Does anyone know what's wrong?

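I'm not sure this is the whole reason, but one thing that will behave strangely in the code you wrote is that the variable i is declared with var, so it is shared by every request callback rather than bound per iteration, and the for loop finishes before any callback is called.
If this is the problem, every callback sees the final value of i (total + 1), so at most one db entry should appear, under that key.
This can be solved by declaring i with let, or by iterating with Array.forEach so that each callback gets its own copy of the index.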
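The difference between var and let in such a loop can be seen without any network calls. The following is a minimal sketch, not the asker's code: runLater is a hypothetical stand-in for request that queues callbacks and runs them only after the loop has finished, and the two collect functions are illustrative names.

```javascript
// Hypothetical stand-in for an async API such as request():
// callbacks are queued and only run after the loop has finished.
function runLater(queue, callback) {
  queue.push(callback);
}

// With var, every callback shares one i and sees its final value.
function collectWithVar(total) {
  var seen = [];
  var queue = [];
  for (var i = 1; i <= total; i++) {
    runLater(queue, function () {
      seen.push(i);
    });
  }
  queue.forEach(function (callback) { callback(); });
  return seen;
}

// With let, each iteration gets its own binding of i.
function collectWithLet(total) {
  var seen = [];
  var queue = [];
  for (let i = 1; i <= total; i++) {
    runLater(queue, function () {
      seen.push(i);
    });
  }
  queue.forEach(function (callback) { callback(); });
  return seen;
}

console.log(collectWithVar(3)); // [ 4, 4, 4 ]
console.log(collectWithLet(3)); // [ 1, 2, 3 ]
```

In the scraping loop, the same change (let instead of var for i and temp) should make each Firebase write use the index of the page it was scraped from.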


Javascript custom library issue [duplicate]

This question already has answers here:
How do I return the response from an asynchronous call?
(41 answers)
Closed 5 years ago.
I have looked and looked and I am still scratching my head. If I have missed something obvious, I apologise. I have tried to create a custom library of functions that I have written myself (thanks stackoverflow for helping me work that one out....). I then have a javascript file that loads when the web page is called, which in turn calls said functions in my custom library.
I have a function called getConfig() that does the obvious. It gets a JSON file with the configuration details for my server that hosts all of my RESTful web services. When I step through the code, the configuration details are returning as I would expect, however, when I load the web page at full speed, the configuration object comes back as undefined. I thought it might be a timing thing, so I wrapped everything in a $(document).ready(function()) block, but no go. I even tried a window.onload = function(){} block to make sure everything is loaded before the custom libraries are called. No luck! It's doing my head in as I cannot for the life of me work out what is going on.
My custom library file looks like this with filename xtendLibs.js
var xtendLibs = {
getConfig : function(){
var CONFIG;
$.getJSON("/js/config.json", function(json){
CONFIG = json;
});
return CONFIG;
},
getObjects : function(config, medicareno, medicarelineno, objectType){
var object;
var urlString = config.scheme + config.muleHost + config.mulePort + ":/patients/";
switch(objectType){
case ("details") :
urlString = urlString + "details/" + medicareno + "/" + medicarelineno ;
break;
case ("appointments") :
urlString = urlString + "appointments/" + medicareno +"/" + medicarelineno;
break;
}
$.ajax({
type : 'GET',
url : urlString,
success : function(data){
object = data;
},
failure : function(){
alert("Failure");
}
});
return object;
},
getUrlParameters : function(){
var paramsArray = window.location.search.substring(1).split("&");
var obj = [];
var tempArray;
var paramName,paramValue;
for(var i = 0; i < paramsArray.length; i++){
tempArray = paramsArray[i].split("=");
paramName = tempArray[0];
paramValue = tempArray[1];
obj[paramName] = paramValue;
}
return obj;
}
};
The javascript file that calls the various functions in the above file looks like this appts.js
window.onload = function(){
var config, params, appointments;
params = xtendLibs.getUrlParameters(); // This line works - and params is returned
config = xtendLibs.getConfig(); // This line fails but will work if I step through the code
appointments = xtendLibs.getObjects( config,
params["medicareno"],
params["medicarelineno"],
"appointments");
console.log(params);
}
I am truly stumped. Any help would be greatly appreciated.
Ajax is an asynchronous process, so calling $.getJSON does not block execution of the next statement.
getConfig : function(){
var CONFIG;
$.getJSON("/js/config.json", function(json){
CONFIG = json;
});
return CONFIG;
}
$.getJSON only starts the request and returns immediately; no new thread is involved, and the callback runs later, once the response has arrived. The next statement, which in this case is "return CONFIG;", therefore executes first. CONFIG has not been assigned yet, so the function returns undefined.
How could you solve this problem?
You cannot solve it with this code design. You could make the ajax call synchronous, but that will freeze the page while the request is in flight.
You could set a global variable "config" when "getConfig" is called and check whether it is defined before executing any function that depends on it, but the best approach is to pass getConfig a function containing all the statements to be executed once config has finished loading, and call it when "/js/config.json" has loaded.
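The callback approach described above might look like the sketch below. It is not the asker's library: loadJson is a hypothetical parameter standing in for $.getJSON so the example runs outside a browser, and createGetConfig and createFakeLoader are illustrative names.

```javascript
// Builds a getConfig function around a loader with the shape
// loadJson(url, onSuccess). In the browser this could be
// function (url, onSuccess) { $.getJSON(url, onSuccess); }.
function createGetConfig(loadJson) {
  return function getConfig(onReady) {
    loadJson("/js/config.json", function (json) {
      // Runs later, once the response has arrived.
      onReady(json);
    });
    // Nothing useful can be returned here: the data is not loaded yet.
  };
}

// Fake loader for demonstration: it holds the callback until flush() is
// called, imitating a response that arrives after getConfig has returned.
function createFakeLoader(payload) {
  var pending = [];
  function loadJson(url, onSuccess) {
    pending.push(onSuccess);
  }
  loadJson.flush = function () {
    pending.splice(0).forEach(function (onSuccess) { onSuccess(payload); });
  };
  return loadJson;
}

var loader = createFakeLoader({ scheme: "http://", muleHost: "localhost" });
var getConfig = createGetConfig(loader);

var received; // stays undefined until the "response" arrives
getConfig(function (config) {
  received = config;
  // Code that depends on config (e.g. xtendLibs.getObjects) belongs here.
});

console.log(received); // undefined
loader.flush();
console.log(received.muleHost); // localhost
```

The same reasoning applies to getObjects: anything that needs the ajax result has to run inside the success callback (or a function called from it), not after the call returns.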

Use JSON output from Flickr to display images from search

I need to display the images on my site from a JSON request.
I have the JSON:
https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=6a970fbb976a06193676f88ef2722cc8&text=sampletext&sort=relevance&privacy_filter=1&safe_search=1&per_page=5&page=1&format=json&nojsoncallback=1
And I have the format I need to put the photo URL in:
https://www.flickr.com/services/api/misc.urls.html
But I don't know how I would loop through that, I found some examples similar, but I am still having trouble seeing what I need.
I am using JavaScript/jQuery to pull the info.
I figure I would have this in a loop.
CurrentPhotoUrl = 'https://farm'+CurrentPhotoFarm+'.staticflickr.com/'+CurrentPhotoServer+'/'+CurrentPhotoId+'_'+CurrentPhotoSecret+'_n.jpg'
But each of those variables would need to be populated with an value from the element. I would need to loop through all 5 elements that are in the JSON.
Any help on how to create this loop would be greatly appreciated.
Try this code
var n = JSON.parse(x) //x is the json returned from the url.
var _s = n.photos.photo;
for(var z = 0 ; z < n.photos.photo.length ; z++)
{
var CurrentPhotoUrl = 'https://farm'+_s[z]['farm']+'.staticflickr.com/'+_s[z]['server']+'/'+_s[z]['id']+'_'+_s[z]['secret']+'_n.jpg'
console.log(CurrentPhotoUrl);
}
Edit ( With actual JQUERY AJAX call )
var n ='';
$.ajax({url: "https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=6a970fbb976a06193676f88ef2722cc8&text=sampletext&sort=relevance&privacy_filter=1&safe_search=1&per_page=5&page=1&format=json&nojsoncallback=1", success: function(result){
console.log(result);
n = result;
var _s = n.photos.photo;
for(var z = 0 ; z < n.photos.photo.length ; z++)
{
var CurrentPhotoUrl = 'https://farm'+_s[z]['farm']+'.staticflickr.com/'+_s[z]['server']+'/'+_s[z]['id']+'_'+_s[z]['secret']+'_n.jpg'
console.log(CurrentPhotoUrl);
}
}});
Output:
https://farm8.staticflickr.com/7198/6847644027_ed69abc879_n.jpg
https://farm3.staticflickr.com/2517/3905485164_84cb437a29_n.jpg
https://farm1.staticflickr.com/292/32625991395_58d1f16cea_n.jpg
https://farm9.staticflickr.com/8181/7909857670_a64e1dd2b2_n.jpg
https://farm9.staticflickr.com/8143/7682898986_ec78701508_n.jpg
This answer assumes your json data will not change. So inside a .js file, set your json to a variable.
var json = <paste json here>;
// the photos are inside an array, so use forEach to iterate through them
json.photos.photo.forEach((photoObj) => {
// the photo will render via an img dom node
var img = document.createElement('img');
// pull the url parts from the photo object
var farmId = photoObj.farm;
var serverId = photoObj.server;
var id = photoObj.id;
var secret = photoObj.secret;
img.src = `https://farm${farmId}.staticflickr.com/${serverId}/${id}_${secret}.jpg`;
// add the image to the dom
document.body.appendChild(img);
});
Inside your html file that contains a basic html template, load this javascript file via a script tag, or just paste it inside a script tag.
If you want to get the json from the web page and assuming you have the jquery script loaded...
$.ajax({
type: 'GET',
url: <flicker_url_for_json>,
success: (response) => {
// iterate through json here
},
error: (error) => {
console.log(error);
}
});
I'm not sure if this is the best solution, but it is something someone suggested and it worked.
const requestURL = 'https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=6a970fbb976a06193676f88ef2722cc8&text=sampletext&sort=relevance&privacy_filter=1&safe_search=1&per_page=5&page=1&format=json&nojsoncallback=1'
$.ajax(requestURL)
.done(function (data) {
data.photos.photo.forEach(function (currentPhoto) {
console.log('https://farm'+currentPhoto.farm+'.staticflickr.com/'+currentPhoto.server+'/'+currentPhoto.id+'_'+currentPhoto.secret+'_n.jpg')
})
})
Varun's solution worked for me as well. I don't know which one is better but I thought I would post this as well since it looks like they were done fairly differently.
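The URL construction shared by the answers above can be factored into a small helper. This is only a sketch: buildPhotoUrl and photoUrls are hypothetical names, the field names (farm, server, id, secret) are the ones the answers read from the response, and the sample object is hand-written from the output shown earlier rather than fetched from the API.

```javascript
// Builds a Flickr image URL from one entry of photos.photo.
// sizeSuffix is optional, e.g. "n" as used in the answers above.
function buildPhotoUrl(photo, sizeSuffix) {
  var suffix = sizeSuffix ? "_" + sizeSuffix : "";
  return "https://farm" + photo.farm + ".staticflickr.com/" +
    photo.server + "/" + photo.id + "_" + photo.secret + suffix + ".jpg";
}

// Maps a parsed API response to a list of URLs.
function photoUrls(response, sizeSuffix) {
  return response.photos.photo.map(function (photo) {
    return buildPhotoUrl(photo, sizeSuffix);
  });
}

// Sample data shaped like the response, with values taken from the output above.
var sample = {
  photos: {
    photo: [
      { id: "6847644027", secret: "ed69abc879", server: "7198", farm: 8 }
    ]
  }
};

console.log(photoUrls(sample, "n"));
// [ 'https://farm8.staticflickr.com/7198/6847644027_ed69abc879_n.jpg' ]
```

Inside a $.ajax success or .done callback, photoUrls(data, 'n') would give the list to loop over when creating img elements.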

ServiceNow UI Page GlideAjax

I created a form using UI Page and am trying to have some fields autopopulated onChange. I have a client script that works for the most part, but the issue arises when certain fields need to be dot-walked in order to be autopopulated. I've read that dot-walking will not work in client scripts for scoped applications and that a GlideAjax code will need to be used instead. I'm not familiar with GlideAjax and Script Includes, can someone help me with transitioning my code?
My current client script looks like this:
function beneficiary_1(){
var usr = g_user.userID;
var related = $('family_member_1').value;
var rec = new GlideRecord('hr_beneficiary');
rec.addQuery('employee',usr);
rec.addQuery('sys_id',related);
rec.query(dataReturned);
}
function dataReturned(rec){
//autopopulate the beneficiary fields pending on the user selection
if(rec.next()) {
$('fm1_ssn').value = rec.ssn;
$('fm1_address').value = rec.beneficiary_contact.address;
$('fm1_email').value = rec.beneficiary_contact.email;
$('fm1_phone').value = rec.beneficiary_contact.mobile_phone;
var dob = rec.date_of_birth;
var arr = dob.split("-");
var date = arr[1] + "/"+ arr[2] + "/" + arr[0] ;
$('fm1_date_of_birth').value = date;
}
}
fm1_address, fm1_email, and fm1_phone do not auto populate because the value is dot walking from the HR_Beneficiary table to the HR_Emergency_Contact table.
How can I transform the above code to GlideAjax format?
I haven't tested this code, so you may need to debug it, but hopefully it gets you on the right track. There are three steps:
1. Create a script include that pulls the data and sends a response to an ajax call.
2. Call this script include from a client script using GlideAjax.
3. Handle the AJAX response and populate the form. This is part of the client script in #2.
A couple of good resources to look at for this:
GlideAjax documentation for reference
Returning multiple values with GlideAjax
1. Script Include - Here you will create your method to pull the data and respond to an ajax call.
This script include object has the following details
Name: BeneficiaryContact
Parameters:
sysparm_my_userid - user ID of the employee
sysparm_my_relativeid - relative sys_id
Make certain to check "Client callable" in the script include options.
var BeneficiaryContact = Class.create();
BeneficiaryContact.prototype = Object.extendsObject(AbstractAjaxProcessor, {
getContact : function() {
// parameters
var userID = this.getParameter('sysparm_my_userid');
var relativeID = this.getParameter('sysparm_my_relativeid');
// query
var rec = new GlideRecord('hr_beneficiary');
rec.addQuery('employee', userID);
rec.addQuery('sys_id', relativeID);
rec.query();
// build object
var obj = {};
obj.has_value = rec.hasNext(); // set if a record was found
// populate object
if(rec.next()) {
obj.ssn = rec.ssn.toString(); // store the string value, not the field object
obj.date_of_birth = rec.date_of_birth.toString();
obj.address = rec.beneficiary_contact.address.toString();
obj.email = rec.beneficiary_contact.email.toString();
obj.mobile_phone = rec.beneficiary_contact.mobile_phone.toString();
}
// encode to json with the native JSON object, which should also work in a scoped application
return JSON.stringify(obj);
},
type : "BeneficiaryContact"
});
2. Client Script - Here you will call BeneficiaryContact from #1 with a client script
function onChange(control, oldValue, newValue, isLoading, isTemplate) {
if (isLoading || newValue === '') {
return;
}
var usr = g_user.userID;
var related = $('family_member_1').value;
var ga = new GlideAjax('BeneficiaryContact'); // call the object
ga.addParam('sysparm_name', 'getContact'); // call the function
ga.addParam('sysparm_my_userid', usr); // pass in userID
ga.addParam('sysparm_my_relativeid', related); // pass in relative sys_id
ga.getXML(populateBeneficiary);
}
3. Handle AJAX response - Deal with the response from #2
This is part of your client script
Here I put in the answer.has_value check as an example, but you may want to remove that until this works and you're done debugging.
function populateBeneficiary(response) {
var answer = response.responseXML.documentElement.getAttribute("answer");
answer = JSON.parse(answer); // convert the json string into an object
// check if a value was found
if (answer.has_value) {
var dob = answer.date_of_birth;
var arr = dob.split("-");
var date = arr[1] + "/"+ arr[2] + "/" + arr[0];
$('fm1_ssn').value = answer.ssn;
$('fm1_address').value = answer.address;
$('fm1_email').value = answer.email;
$('fm1_phone').value = answer.mobile_phone;
$('fm1_date_of_birth').value = date;
}
else {
g_form.addErrorMessage('A beneficiary was not found.');
}
}
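Both the original client script and the response handler rebuild the date of birth by splitting on "-". A small helper keeps that in one place and avoids an error when the field is empty. This is a sketch under the assumption, shared by both scripts, that the value arrives as YYYY-MM-DD; toUsDate is a hypothetical name.

```javascript
// Converts "YYYY-MM-DD" to "MM/DD/YYYY"; returns "" for anything else.
function toUsDate(isoDate) {
  var match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(String(isoDate || ""));
  if (!match) {
    return "";
  }
  return match[2] + "/" + match[3] + "/" + match[1];
}

console.log(toUsDate("1985-07-04")); // 07/04/1985
console.log(toUsDate(""));           // (empty string)
```

In populateBeneficiary, the three date lines could then become a single assignment of toUsDate(answer.date_of_birth) to the fm1_date_of_birth field.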

Html templates loaded asynch (JQuery/JavaScript asynch)

So I'm making a webpage with some code snippets loaded in from txt files. The paths and locations of the txt files are stored in a json file. First I'm loading the json file, which looks like this:
[
{"root":"name of package", "html":"htmlfile.txt", "c":"c#file.txt", "bridge":"bridgefile"},
{"root":"name of package", "html":"htmlfile.txt", "c":"c#file.txt", "bridge":"bridgefile"}
]
Once it has loaded, I clone templates from my index.html file and insert them. My problem is that this happens asynchronously, so the page never looks the same twice.
Here is what my jquery code for loading and inserting looks like:
$(document).ready(function () {
var fullJson;
$.when(
$.get('/data/testHtml/data.json', function (json) {
fullJson=json;
})
).then(function(){
for(var i=0; i<fullJson.length; i++){
templefy(fullJson[i],i);
}
})
var templefy = function (data, number) {
//Fetches template from doc.
var tmpl = document.getElementById('schemeTemplate').content.cloneNode(true);
//Destination for template inserts
var place = document.getElementsByName('examples');
//Set name
tmpl.querySelector('.name').innerText = data.root;
//Next section makes sure that each tap pane has a unique name so that the system does not override eventlisteners
var htmlNav = tmpl.getElementById("html");
htmlNav.id += number;
var htmlLink = tmpl.getElementById('htmlToggle');
htmlLink.href += number;
var cNav = tmpl.getElementById('c');
cNav.id += number;
var cLink = tmpl.getElementById('cToggle');
cLink.href += number;
var bridgeNav = tmpl.getElementById('bridge');
bridgeNav.id += number;
var bridgeLink = tmpl.getElementById('bridgeToggle');
bridgeLink.href += number;
//Auto creates the sidebar with links using a link template from doc.
var elementLink = tmpl.getElementById('elementLink');
elementLink.name +=data.root;
var linkTemplate = document.getElementById('linkTemplate').content.cloneNode(true);
var linkPlacement = document.getElementById('linkListWVisuals');
var link = linkTemplate.getElementById('link');
link.href = "#"+data.root;
link.innerText = data.root;
linkPlacement.appendChild(linkTemplate);
//Fetches html, c# and bridge code. Then inserts it into template and finally inserts it to doc
$.get("/data/" + data.root + '/' + data.html, function (html) {
tmpl.querySelector('.preview').innerHTML = html;
tmpl.querySelector('.html-prev').innerHTML = html;
$.get('/data/' + data.root + '/' + data.c, function (c) {
tmpl.querySelector('.c-prev').innerHTML = c;
$.get('/data/' + data.root + '/' + data.bridge, function (bridge) {
console.log(bridge);
tmpl.querySelector('.bridge-prev').innerHTML = bridge;
place[0].appendChild(tmpl);
})
})
})
}
});
So yeah my problem is that it just fires in templates whenever they are ready and not in the order written in the json file.
I'll take whatever help I can get..Thank you :)
To my knowledge, there is no golden method, and I usually apply one of the following options:
1) Preload the files separately, e.g. create a key "body" for each entry in your json and set its value to the contents of the file.
2) Do not display items until they are fully loaded in the DOM, and before you show them, sort them in the DOM by e.g. their position in the json array.
Hope it helps.
My only way out has been to rebuild the whole application in Angular and use a filter to make sure that I get the right result.
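Option 2 above can also be done without a framework by holding back each finished template until every earlier one has been inserted. The following is a minimal sketch of that idea, not code from the thread: createOrderedInserter is a hypothetical helper, and plain strings stand in for the cloned templates.

```javascript
// Returns a deliver(index, value) function. Values may arrive in any order,
// but insert(value, index) is called strictly in index order.
function createOrderedInserter(count, insert) {
  var values = new Array(count);
  var arrived = new Array(count).fill(false);
  var next = 0; // index of the next item allowed to be inserted

  return function deliver(index, value) {
    values[index] = value;
    arrived[index] = true;
    // Flush every consecutive item that is ready.
    while (next < count && arrived[next]) {
      insert(values[next], next);
      next++;
    }
  };
}

// Demonstration with out-of-order arrival.
var inserted = [];
var deliver = createOrderedInserter(3, function (value) {
  inserted.push(value);
});

deliver(2, "third");  // held back: items 0 and 1 are missing
deliver(0, "first");  // inserted immediately
deliver(1, "second"); // inserts "second", then the waiting "third"

console.log(inserted); // [ 'first', 'second', 'third' ]
```

In templefy, the innermost callback would call deliver(number, tmpl) instead of appending directly, with the insert function doing place[0].appendChild(tmpl) and count set to fullJson.length.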

Send parameters to another page

I have a mysterious problem with sending parameters from one page to another.
In one of my ExtJS methods I send a parameter by POST to another page:
autoLoad : {
url : url_servlet+'form.jsp',
params: str,
scripts: true
}
But I don't know how to get this parameter in JavaScript. So I sent the parameter in the url instead:
url : url_servlet+'form.jsp?ss=333'
And in another page:
function param(Name){
var Params = location.search.substring(1).split("&");
var variable = "";
for (var i = 0; i < Params.length; i++){
if(Params[i].split("=")[0] == Name){
if (Params[i].split("=").length > 1)
variable = Params[i].split("=")[1];
return variable;
}
}
return "";
}
var s =param('ss');
alert(s);
And I see an empty alert.
In Firebug I try:
window.location.search
and get " ".
What's wrong? I have read several examples and everywhere I see code like this.
What's likely going on here is that ExtJS loads an entire page from a remote location into the current page.
When this happens, the code that gets run as a result of the load, will execute in the current page (which probably doesn't have the ss=xyz parameter at all).
However, your form.jsp should have access to the query string and can inject that into the page it returns to ExtJS.
Another option is to somehow pass that data from JavaScript once the page is loaded, but I don't know enough about ExtJS to tell you how that could be done.
Can you please try the function below?
function param(Name){
var Params = location.search.substring(1).split("&");
var variable = "";
for (var i = 0; i < Params.length; i++){
if(Params[i].split("=")[0] == Name){
variable = Params[i].split("=")[1];
return variable;
}
}
if(variable=="") return variable;
}
var s =param('ss');
alert(s);
You cannot get POST parameters from JavaScript. POST parameters are sent to the server, and JavaScript runs on the client side.
If it's a GET, then you could use the parseUri library:
var uri = parseUri(window.location.href);
var value = uri.queryKey['param'];
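For the GET case, current browsers also provide the standard URLSearchParams API, which postdates this thread and removes the need for manual splitting. A minimal sketch, with getQueryParam as a hypothetical helper name:

```javascript
// Returns the value of a query parameter, or "" when it is absent.
// search is a query string such as window.location.search.
function getQueryParam(search, name) {
  var value = new URLSearchParams(search).get(name);
  return value === null ? "" : value;
}

console.log(getQueryParam("?ss=333&x=1", "ss")); // 333
console.log(getQueryParam("", "ss"));            // (empty string)
```

This does not change the underlying issue described in the first answer: if the page is loaded into the current document, window.location.search still belongs to the outer page, so the helper will only see parameters that are actually in that page's URL.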
