I am trying to make a website, in which I include images via links. If an image is non-existent, or just 1 pixel wide, the website should display an alternative image. I am using Jade/Pug and JS.
I want to build a list of links before rendering them on the website. That way I can simply iterate through my link list in the .pug file afterwards.
So what I am trying to do is check, using JS only, whether an image has a certain size. If it does, I add the link to my list; if not, I add an alternative link.
This is the important part of my code in the app.js-file:
app.get("/", function (req, res) {
//get all the books
var isbns = [];
var links = [];
dbClient.query("SELECT isbn FROM book WHERE id < 20", function (dbError, dbItemsResponse){
isbns = dbItemsResponse.rows;
var linkk = 0;
for(i=0;i<20;i++){
linkk = "http://covers.openlibrary.org/b/isbn/" + Object.values(isbns[i]) + "-M.jpg";
var wid = getMeta(linkk);
if(wid < 2){
links[i]="https://i.ibb.co/7JnVtcB/NoCover.png";
} else {
links[i]=linkk;
}
}
});
});
function getMeta(url){
var img = new Image(); //or document.createElement("img");
img.onload = function(){
alert(this.width);
};
img.src = url;
}
This gives me a ReferenceError: Image() is not defined. If I try to use document.createElement("img"); it says "document is not defined".
How can I check on the server side whether an image exists? Or how can I use the Image constructor in my app.js file, without using my Jade/Pug/HTML file in any way?
Sorry if it's a dumb question, but I have been trying to figure this out for 20 hours non-stop, and I can't get it to work.
You are mixing up Node.js and browser JavaScript. Your code runs in Node.js and therefore on the server side. window, document and Image are only available in the browser, i.e. on the client side.
To check whether a file exists on the server's own disk (server side only!), you can use fs.access.
var fs = require("fs");
// Check if the file exists in the current directory.
fs.access(file, fs.constants.F_OK, (err) => {
console.log(`${file} ${err ? 'does not exist' : 'exists'}`);
});
Note
There is no such thing as a "dumb question" :)
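Since fs.access only sees local files, a cover hosted on another server has to be checked over HTTP instead. The following is a minimal sketch, not a definitive implementation. It assumes Node 18+ (for the global fetch) and assumes that Open Library answers with a 404 for a missing cover when ?default=false is appended to the URL; please verify that against the Covers API documentation. The names coverExists and pickCoverLinks are made up for this example.

```javascript
const FALLBACK = "https://i.ibb.co/7JnVtcB/NoCover.png";

// Assumption: with ?default=false Open Library responds 404 for a missing
// cover instead of sending a 1x1 placeholder. Needs Node 18+ (global fetch).
async function coverExists(url) {
  try {
    const res = await fetch(url + "?default=false", { method: "HEAD" });
    return res.ok;
  } catch (err) {
    return false; // network errors are treated as "no cover"
  }
}

// Build the list of links, swapping in the fallback where the check fails.
// `exists` is injectable so the logic can be tried without network access.
async function pickCoverLinks(isbns, exists = coverExists) {
  const urls = isbns.map(
    (isbn) => "http://covers.openlibrary.org/b/isbn/" + isbn + "-M.jpg"
  );
  const found = await Promise.all(urls.map((u) => exists(u)));
  return urls.map((u, i) => (found[i] ? u : FALLBACK));
}
```

Inside the route handler, the list would be awaited before rendering, for example by passing the isbn values from dbItemsResponse.rows to pickCoverLinks and handing the result to res.render.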
I'm quite new to this so please bear with me.
I'm currently trying to put together an HTML report building tool.
I have 2 html reports that are being generated by 3rd parties.
I'd like to be able to upload them, parse them, save the specific parse to a variable and update my template which is in a folder on the server.
Currently, I'm using express, and node-html-parser.
I have no issues getting the HTML files uploaded to a directory on the server and parsing those files.
My issue comes in when I try to update the variable I want with the string that I want.
const fs = require('fs');
const htmlparse = require('node-html-parser').parse;
var element1
var element2
function datatoString(){
fs.readFile(__dirname + "/api/upload/" + file1, 'utf8', (err,html)=>{
const root = htmlparse(html);
head = root.querySelector('head');
element1 = head.toString();
console.log("-------------break------------")
console.log(head.toString()); //This works and shows me my parsed info
});
fs.readFile(__dirname + "/api/upload/" + file2, 'utf8', (err,html)=>{
const root = htmlparse(html);
body = root.querySelector('body');
element2 = body.toString();
console.log("-------------break------------")
console.log(body.toString()); //This works and shows me my parsed info
});
};
Now, ideally I'd like to call this function in a GET request and have it update the variables. From there, I would use those strings to modify a template HTML file that's sitting in a folder on my server. I'd like to be able to replace HTML elements in the template with those updated variables. Once updated, I'd send the file as a download in the response.
Every time I try this with a fs.writeFile , it seems to just say the variables 'element1' or 'element2' are empty.
I'm not even sure if I can write a local HTML file and save it the same way you'd normally do it with the DOM.
I'm lost at this point. I would assume I'd need to read and then write the template HTML file, but how I'd go about editing it, I have no clue. Also, the variables being empty is stumping me. I know it's due to the fact that fs.readFile is asynchronous, but then how would I go about reading and writing files in the manner I am looking for?
Any help would be much appreciated!
You have two possibilities. You can use fs.readFileSync, which is easy to use, but since it's synchronous, it blocks your thread (and makes your server unresponsive while the files are being read). The more elegant solution is to use the Promise version and await it.
const promises = require('fs').promises;
const htmlparse = require('node-html-parser').parse;
let element1, element2;
async function datatoString() {
let html = await promises.readFile(__dirname + "/api/upload/" + file1, 'utf8');
let root = htmlparse(html);
const head = root.querySelector('head');
element1 = head.toString();
console.log("-------------break------------")
console.log(element1);
html = await promises.readFile(__dirname + "/api/upload/" + file2, 'utf8');
root = htmlparse(html);
const body = root.querySelector('body');
element2 = body.toString();
console.log("-------------break------------")
console.log(element2);
};
You have two options here. One is to block the thread and wait for each consecutive read to end before ending the function.
function datatoString() {
let element1, element2;
fs.readFileSync(... element1 = 'foo'});
fs.readFileSync(... element2 = 'bar'});
return [element1, element2];
}
app.get('/example', (req, res) => {
...
const [element1, element2] = datatoString();
});
The other would be to use async and read both files at the same time, then return whenever they both finish:
function datatoString() {
return new Promise((resolve, reject) => {
let element1, element2;
fs.readFile(... element1 = 'foo', if (element2) resolve([element1, element2]);});
fs.readFile(... element2 = 'bar', if (element1) resolve([element1, element2]);});
});
}
app.get('/example', async (req, res) => {
...
const [element1, element2] = await datatoString();
});
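The "read both files at the same time" idea can also be sketched with Promise.all and fs.promises, which avoids tracking by hand which read finished first. This is only an illustration under assumptions: readBoth and fillTemplate are hypothetical helper names, and the {{name}} placeholder convention for the template is invented for this sketch, not something the question specifies.

```javascript
const fsp = require("fs").promises;

// Read both uploaded reports concurrently; resolves once both reads finish.
async function readBoth(pathA, pathB) {
  return Promise.all([
    fsp.readFile(pathA, "utf8"),
    fsp.readFile(pathB, "utf8"),
  ]);
}

// Replace {{name}} placeholders in the template with the given strings.
// The {{...}} convention is an assumption made for this sketch; unknown
// placeholders are left untouched.
function fillTemplate(template, values) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match
  );
}
```

In a route handler, the two strings returned by readBoth would be parsed as before, and the resulting head and body strings passed to fillTemplate together with the template text before the result is written or sent.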
I have a repetitive task that I have to do at regular intervals. Basically, I need to open a website, get some values from different tables, and then write them to a spreadsheet. Using these values, I make some calculations, prepare a report, etc.
I would like to create a helper bot because this is a straightforward task.
I can get the information by opening the console (while I am on the relevant page) and fetching the data with the DOM or jQuery.
I would like to take it a step further and create a Node.js application (without opening the website myself, I would send my bot to the page and have it do the same actions that I do in the console).
I started to write something with cheerio. However, at some point my bot needs to click a button (in order to change the table). I searched but couldn't find a way.
My question is: is it possible to click a button on the server side (to change the table) and then fetch data from that table?
If you know a better way to create this kind of bot, please suggest it.
var express = require('express');
var fs = require('fs');
var request = require('request');
var cheerio = require('cheerio');
var app = express();
app.get('/scrape', (req, res) => {
url = 'http://www.imdb.com/title/tt1229340/';
request(url, function(error, response, html){
if(!error){
var $ = cheerio.load(html);
var title, release;
var json = { title : "", release : ""};
$('.header').filter(function () {
var data = $(this);
title = data.children().first().text();
release = data.children().last().children().text();
json.title = title;
json.release = release;
})
// This is not possible
$( "#target" ).click(function() {
alert( "Handler for .click() called." );
});
}
fs.writeFile('output.json', JSON.stringify(json, null, 4), (err) => {
console.log('File successfully written!');
})
res.send('Check your console!')
}) ;
})
app.listen('8080');
edit: The Answer of this question is "Use Zombie"
Now I have another question related to this one.
I am trying to learn and use Zombie. So far I can:
connect to the website
go to the necessary table
print all tds to the console
However, with this method I only get a really messy string. (All tds are printed without any whitespace, so there is no way to clean it up. Basically, I want to put all tds in an array. How can I do that?)
browser.visit(url, () => {
var result = browser.text('table > tbody.bodyName td');
console.log(result);
})
I'd suggest you try using a headless browser such as PhantomJS or Zombie for this purpose. What you're trying to do above is assign a click handler to an element in Cheerio. This won't work, because Cheerio only parses the HTML; it does not run a browser.
You should be able to click a button based on the element selector in Zombie.js.
There's a browser.pressButton command in Zombie.js for this purpose.
Here's some sample code using Zombie.js, in this case clicking a link:
const Browser = require('zombie');
const url = 'http://www.imdb.com/title/tt1229340/';
let browser = new Browser();
browser.visit(url).then(() => {
console.log(`Visited ${url}..`);
browser.clickLink("FULL CAST AND CREW").then(() => {
console.log('Clicked link..');
browser.dump();
});
}).catch(error => {
console.error(`Error occurred visiting ${url}`);
});
As for the next part of the question, we can select elements using zombie.js and get an array of their text content:
const Browser = require('zombie');
const url = 'http://www.imdb.com/title/tt1229340/';
let browser = new Browser();
browser.visit(url).then(() => {
console.log(`Visited ${url}..`);
var result = browser.queryAll('.cast_list td');
var cellTextArray = result.map(r => r.textContent.trim())
.filter(text => text && (text || '').length > 3);
console.log(cellTextArray);
}).catch(error => {
console.error(`Error occurred visiting ${url}`);
});
I'm trying to get a list of all image src URLs in a given webpage using PhantomJS. My understanding is that this should be extremely easy, but for whatever reason, I can't seem to make it work. Here is the code I currently have:
var page = require('webpage').create();
page.open('http://www.walmart.com');
page.onLoadFinished = function(){
var images = page.evaluate(function(){
return document.getElementsByTagName("img");
});
for(thing in a){
console.log(thing.src);
}
phantom.exit();
}
I've also tried this:
var a = page.evaluate(function(){
returnStuff = new Array;
for(stuff in document.images){
returnStuff.push(stuff);
}
return returnStuff;
});
And this:
var page = require('webpage').create();
page.open('http://www.walmart.com', function(status){
var images = page.evaluate(function() {
return document.images;
});
for(image in images){
console.log(image.src);
}
phantom.exit();
});
I've also tried iterating through the images in the evaluate function and getting the .src property that way.
None of them return anything meaningful. If I return the length of document.images, there are 54 images on the page, but trying to iterate through them provides nothing useful.
Also, I've looked at the following other questions and wasn't able to use the information they provided: How to scrape javascript injected image src and alt with phantom.js and How to download images from a site with phantomjs
Again, I just want the source url. I don't need the actual file itself. Thanks for any help.
UPDATE
I tried using
var a = page.evaluate(function(){
returnStuff = new Array;
for(stuff in document.images){
returnStuff.push(stuff.getAttribute('src'));
}
return returnStuff;
});
It threw an error saying that stuff.getAttribute('src') returns undefined. Any idea why that would be?
@MayorMonty was almost there. Indeed, you cannot return an HTMLCollection.
As the docs say:
Note: The arguments and the return value to the evaluate function must be a simple primitive object. The rule of thumb: if it can be serialized via JSON, then it is fine.
Closures, functions, DOM nodes, etc. will not work!
Thus the working script is like this:
var page = require('webpage').create();
page.onLoadFinished = function(){
var urls = page.evaluate(function(){
var image_urls = new Array;
var images = document.getElementsByTagName("img");
for(var q = 0; q < images.length; q++){
image_urls.push(images[q].src);
}
return image_urls;
});
console.log(urls.length);
console.log(urls[0]);
phantom.exit();
}
page.open('http://www.walmart.com');
I am not sure about a direct JavaScript method, but I recently used jQuery to scrape images and other data. After injecting jQuery into the page, you can write a script in the style below:
var data = [];
$('.someclassORselector').each(function(){
data.push($(this).attr('src'));
});
document.images is not an Array of nodes; it's an HTMLCollection, an array-like object. You can see this if you iterate it with for..in:
for (a in document.images) {
console.log(a)
}
Prints:
0
1
2
3
length
item
namedItem
Now, there are several ways to solve this:
ES6 Spread Operator: This turns array-likes and iterables into arrays. Use like so [...document.images]
Regular for loop, like an array. This takes advantage of the fact that the keys are labeled like an array:
for(var i = 0; i < document.images.length; i++) {
document.images[i].src
}
And probably more, as well
Solution 1 lets you use Array functions such as map or reduce on the result, but it has less support (I don't know whether the JavaScript engine in the current PhantomJS version supports it).
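If the spread operator turns out not to be available in the engine you are running, an ES5-only conversion gives the same access to map and friends. A small sketch follows; fakeImages is a plain object standing in for document.images so the snippet runs outside a browser, and toArray is just an illustrative helper name.

```javascript
// A stand-in for document.images: indexed keys plus a length property.
var fakeImages = { 0: { src: "a.png" }, 1: { src: "b.png" }, length: 2 };

// ES5-compatible conversion of an array-like object into a real array.
function toArray(arrayLike) {
  return Array.prototype.slice.call(arrayLike);
}

// Once it is a real array, map works as usual.
var srcs = toArray(fakeImages).map(function (img) {
  return img.src;
});
console.log(srcs); // [ 'a.png', 'b.png' ]
```

Inside page.evaluate the same helper would be applied to document.images, and the resulting array of strings can be returned because it serializes to JSON.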
I used the following code to get all images loaded on the page. The images the browser loaded changed dimensions depending on the viewport, and since I wanted the maximum dimensions, I used the largest viewport to get the actual image sizes.
Get all images on a page using PhantomJS
Download all image URLs on a page using PhantomJS
Even if an image is not in an img tag, the code below retrieves its URL.
Even images referenced from stylesheets such as this one will be retrieved:
@media screen and (max-width:642px) {
.masthead--M4.masthead--textshadow.masthead--gradient.color-reverse {
background-image: url(assets/images/bg_studentcc-750x879-sm.jpg);
}
}
@media screen and (min-width:643px) {
.masthead--M4.masthead--textshadow.masthead--gradient.color-reverse {
background-image: url(assets/images/bg_studentcc-1920x490.jpg);
}
}
var page = require('webpage').create();
var url = "https://......";
page.settings.clearMemoryCaches = true;
page.clearMemoryCache();
page.viewportSize = {width: 1280, height: 1024};
page.open(url, function (status) {
if(status=='success'){
console.log('The entire page is loaded.............################');
}
});
page.onResourceReceived = function(response) {
if(response.stage == "start"){
var respType = response.contentType;
if(respType.indexOf("image")==0){
console.log('Content-Type : ' + response.contentType)
console.log('Status : ' + response.status)
console.log('Image Size in byte : ' + response.bodySize)
console.log('Image Url : ' + response.url)
console.log('\n');
}
}
};
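The filtering step in the handler above can be pulled out into a small function, which also makes it easy to guard against responses that carry no content type. This is a sketch under the assumption that contentType may be missing on some responses; isImageResponse is a hypothetical helper name, and the plain objects in the usage lines only mimic the response objects PhantomJS passes to onResourceReceived.

```javascript
// Returns true when a PhantomJS-style response object describes the start
// of an image response. A missing or non-string contentType yields false.
function isImageResponse(response) {
  if (!response) {
    return false;
  }
  var type = response.contentType;
  return (
    response.stage === "start" &&
    typeof type === "string" &&
    type.indexOf("image") === 0
  );
}

// Usage with stand-in objects:
var samples = [
  { stage: "start", contentType: "image/png", url: "a.png" },
  { stage: "start", contentType: "text/css", url: "site.css" },
  { stage: "start", contentType: null, url: "unknown" },
];
var imageUrls = samples.filter(isImageResponse).map(function (r) {
  return r.url;
});
console.log(imageUrls); // [ 'a.png' ]
```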
I'm in the process of creating a site that preloads several large gifs. Due to the size of the images, I need them all to be loaded before they are displayed to the user. In the past I have done this numerous times using something basic like this:
var image = new Image();
image.onload = function () { document.body.appendChild(image); }
image.src = '/myimage.jpg';
However, I'm loading a group of images from an array, which contains the image source URLs. It should show a loading message and, once they have all loaded, perform a callback and hide the loading message, etc.
The code I've been using is below:
var images = ['image1.gif', 'image2.gif', 'image3.gif'];
function preload_images (target, callback) {
// get feedback container
var feedback = document.getElementById('feedback');
// show feedback (loading message)
feedback.style.display = 'block';
// set target
var target = document.getElementById(target);
// clear html of target incase they refresh (tmp fix)
target.innerHTML = '';
// internal counter var
var counter = 0;
// image containers attach to window
var img = new Array();
// loop images
if (images.length > 0) {
for (var i in images) {
// new image object
img[i] = new Image();
// when ready peform certain actions.
img[i].onload = (function (value) {
// append to container
target.appendChild(img[value]);
// hide all images apart from the first image
if (value > 0) {
hide(img[value]);
}
// increment counter
++counter;
// on counter at correct value use callback!
if (counter == images.length) {
// hide feedback (loading message)
feedback.style.display = 'none';
if (callback) {
callback(); // when ready do callback!
}
}
})(i);
// give image alt name
img[i].alt = 'My Image ' + i;
// give image id
img[i].id = 'my_image_' + i
// preload src
img[i].src = images[i];
}//end loop
}//endif length
}//end preload image
It's really weird, I'm sure it should just work, but it doesn't even show my loading message. It just goes straight to the callback.. I'm sure it must be something simple, I've been busy and looking at it for ages and finding it a tad hard to narrow down.
I've been looking over stackoverflow and people have had similar problems and I've tried the solutions without much luck.
Any help would be greatly appreciated! I'll post more code if needed.
Cheers!
If I'm not totally wrong, the problem is with your assignment to
// when ready perform certain actions.
img[i].onload = (function (value) {...})(i);
Here you call and execute the function immediately and assign its return value, undefined, to the onload attribute, so there is nothing to call when the image is loaded.
To have access to the value of i when the image is loaded, you can try something like the following:
onload = (function(val){
var temp = val;
return function(){
i = temp;
//your code here
}
})(i);
This should store the value in temp and return a callable function that can access this value.
I did not test whether it works, and there may be a better solution, but this one came to my mind :)
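The difference between calling the function immediately and returning a handler from it can be checked outside a browser. In the sketch below, plain objects stand in for Image instances and the load events are fired by hand; this is an illustration of the closure pattern, not the asker's full preloader.

```javascript
// Plain objects stand in for Image instances so this runs outside a browser.
var images = [{}, {}, {}];
var loaded = [];

for (var i = 0; i < images.length; i++) {
  // The outer function runs immediately and RETURNS the real handler,
  // so onload ends up holding a function that remembers its own index.
  images[i].onload = (function (index) {
    return function () {
      loaded.push(index);
    };
  })(i);
}

// Simulate the browser firing the load events out of order.
images[2].onload();
images[0].onload();
console.log(loaded); // [ 2, 0 ]
```

Without the inner return, each onload would be undefined and the calls above would throw, which matches the behaviour described in the answer.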
Try this for your onload callback:
img[i].onload = function(event) {
target.appendChild(this);
if (img.indexOf(this) > 0) {
hide(this);
}
// ...
};
Hope you can get it working! It's bed time for me though.
Edit: You'll probably have to do something about img.indexOf(this)... I just realized you are using img like an associative array (for..in yields string keys). In your original code, I don't think comparing value to 0 is logical in that case, since value is a string. Perhaps you shouldn't use an associative array?
I have a problem with duplicating layers from one document to another. I have this code (.jsx script inside my Photoshop document)
var docRef = app.activeDocument;
app.activeDocument.selection.selectAll();
var calcWidth = app.activeDocument.selection.bounds[2] -app.activeDocument.selection.bounds[0];
var calcHeight = app.activeDocument.selection.bounds[3] - app.activeDocument.selection.bounds[1];
var docResolution = app.activeDocument.resolution;
var document = app.documents.add(calcWidth, calcHeight, docResolution);
app.activeDocument = docRef;
try {
dupObj.artLayers[i].duplicate(document, ElementPlacement.INSIDE);
}
catch(e) {
alert(e)
}
But I am still receiving an error
Error: You can only duplicate layers from the frontmost document.
Have you any ideas how to make it work?
The reason you're getting an error is dupObj is never defined. I think you mean to use docRef, the reference to your source document in line 1. This seems to work fine now:
var docRef = app.activeDocument;
app.activeDocument.selection.selectAll();
var calcWidth = app.activeDocument.selection.bounds[2] -app.activeDocument.selection.bounds[0];
var calcHeight = app.activeDocument.selection.bounds[3] - app.activeDocument.selection.bounds[1];
var docResolution = app.activeDocument.resolution;
var document = app.documents.add(calcWidth, calcHeight, docResolution);
app.activeDocument = docRef;
try {
docRef.artLayers[i].duplicate(document, ElementPlacement.INSIDE); // ** changed to docRef **
}
catch(e) {
alert(e)
}
That being said there might be a few hidden bugs in there you should look at. In this line:
docRef.artLayers[i].duplicate(document, ElementPlacement.INSIDE);
i is never defined, and apparently defaults to 0 without throwing an error. The result is you will only ever duplicate the first layer in the artLayers array.
Also, since you are selecting the entire document using app.activeDocument.selection.selectAll(); there is no need to calculate the size of the selection. It will always be the same size as the original document. You could just use docRef.width and docRef.height as the width and height for the new document. Besides, when you duplicate a layer it will copy the whole layer regardless of the selection, as far as I know.
If you just want to make a new document the same size as the layer you are duplicating try using artLayers[i].bounds instead of selection.bounds
You're not calling the active document: you need a reference to both the active document and the one you're using, hence the error.
var docRef = app.activeDocument;
docRef.selection.selectAll();
var calcWidth = docRef.selection.bounds[2] - docRef.selection.bounds[0];
var calcHeight = docRef.selection.bounds[3] - docRef.selection.bounds[1];
var docResolution = docRef.resolution;
var document = app.documents.add(calcWidth, calcHeight, docResolution);
app.activeDocument = docRef;
try {
dupObj.artLayers[i].duplicate(document, ElementPlacement.INSIDE);
}
catch(e) {
alert(e)
}
I've not used dupObj before as I use CS and script listener code for duplicating documents
And I've not checked the code, but give it a go.
The problem is that you're trying to use a variable called document, which is a predefined global in browser JS.
As Sergey pointed out, document is (amazingly) not a reserved word in JSX because Adobe JSX is not 'regular' JSX
Although it doesn't address the exact syntax error I'll leave this here because it's a quick way to solve the overall problem of copying layers between documents.
// Grab docs
const doc1 = app.activeDocument
const doc2 = app.documents.add(100, 100)
const outputLayer = doc1.layers[0]
const inputLayer = doc2.layers[0]
inputLayer.duplicate(outputLayer, ElementPlacement.INSIDE)