CasperJS returns base64 string as image URL - javascript

I'm crawling a website using CasperJS and I have an interesting problem.
In Casper's evaluate function, I want to extract the src attribute from an <img> element. Here is the code that is executed in my evaluate function:
function crawl(){
var product = {};
try{
product.title = jQuery('#title').html();//Get title
product.price = document.getElementsByClassName("price")[0].innerHTML;//Get price
var imageSrc = jQuery("#next").attr("src")
product.image =imageSrc;
}
catch(e){
}
return JSON.stringify(product);
}
Here is how the evaluate function is handled in Casper:
casper.then(function(){
scrappedProductInfo = this.evaluate(crawl);//Get info
console.log("Page crawled");
utils.dump(scrappedProductInfo);
});
When I execute my CasperJS script, the image attribute of the object returned from evaluate() contains a base64 representation of the image instead of the image link:
\n.....
When I open the same page in Chrome and execute the crawl function in the Chrome console, I get the src as a link and not as a base64 string. When I right-click the element and inspect it, I can clearly see a URL and not a base64-encoded string.
Any suggestions?

Related

Chromium createObjectUrl from Image comes out as net::ERR_FILE_NOT_FOUND

I know that my code works because I have been using it in Firefox. When I switched to Chrome, this snippet stopped working because Chrome is unable to read the URL generated by URL.createObjectURL().
export const image_preview = () => {
$('.js-thumbnail').off()
$('.js-thumbnail').on('change', function(event) {
// add event to each file input
const img = $(this).siblings('img')
const url = URL.createObjectURL(event.target.files[0]) // url to user's image
img.attr('src', url)
URL.revokeObjectURL(url) // free the allocated object
})
}
The URL itself is generated, but Chromium fails to load it in my image tag.
Some posts suggest using the webkitURL API instead of URL, but that didn't work either. Do you know the cause? Is this also a problem in other browsers?
You need to wait until the image has loaded before revoking its URL.
Image resources begin to load in a microtask after their src is set (among other reasons, to allow setting properties such as crossOrigin after the src). By the time the browser reads the src change, URL.revokeObjectURL(url) has already been called and the blob: URL already points nowhere.
So simply do:
const img = $(this).siblings('img')
const url = URL.createObjectURL(event.target.files[0]) // url to user's image
img.attr('src', url)
img.one('load', (e) => { URL.revokeObjectURL(url); });
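A slightly more defensive variant of the same idea, as a sketch: it also revokes the URL when the image fails to load, so the blob is not kept alive after an error. The names `previewFile` and `urlApi` are illustrative assumptions and are not part of jQuery or of the original code.

```javascript
// Sketch: set a blob URL as the image source and release it once the
// image has either loaded or failed. `previewFile` and `urlApi` are
// hypothetical names; `urlApi` defaults to the global URL object.
function previewFile(img, file, urlApi = URL) {
  const url = urlApi.createObjectURL(file);
  const release = () => urlApi.revokeObjectURL(url);
  // Register the handlers before assigning src so no event is missed.
  img.addEventListener('load', release, { once: true });
  img.addEventListener('error', release, { once: true });
  img.src = url;
  return url;
}
```

Inside the change handler this could be called as `previewFile($(this).siblings('img')[0], event.target.files[0])`, passing the raw DOM element rather than the jQuery wrapper.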

Download multiples files using javascript

I am trying to download multiple files from multiple URLs using JavaScript. So far I have tried several options, but each works for only one URL.
I have an array of URLs that should start multiple downloads in the browser.
$(fileUrls).each(function(_index, fileUrl: any) {
let tempElement: any;
tempElement = document.createElement("A");
tempElement.href = fileUrl;
tempElement.download = fileUrl.substr(fileUrl.lastIndexOf("/") + 1);
document.body.appendChild(tempElement);
tempElement.click();
document.body.removeChild(tempElement);
});
Also tried using
$(fileUrls).each(function (_index, fileUrl: any) {
window.location.href = fileUrl;
});
But it works for only one URL; the rest of the calls fail with the warning message below in the browser:
Resource interpreted as Document but transferred with MIME type application/octet-stream
You can achieve this using the code below.
Create an array containing the locations of all the files you want to download, like this:
var files= [
'file1_Link',
'file2_Link',
'file3_Link'
];
Then define a function to download all the files. It creates an anchor tag with the download attribute for each file and clicks it, one file per call:
function downloadAll(files){
if(files.length == 0) return;
var file = files.pop(); // take the next URL; var avoids an implicit global
var theAnchor = $('<a />')
.attr('href', file) // each entry of files is a URL string
.attr('download', file)
// Firefox does not fire click if the link is outside
// the DOM
.appendTo('body');
theAnchor[0].click();
theAnchor.remove();
downloadAll(files); // continue with the remaining files
}
// call the function like below to achieve your goal
downloadAll(files);
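Some browsers block or prompt when a page triggers several downloads at the same instant, so a common workaround is to space the clicks out. The following is a minimal sketch under that assumption. `fileNameFromUrl`, `downloadSequentially`, and the 500 ms default delay are illustrative choices, and the download attribute is generally only honoured for same-origin URLs.

```javascript
// Hypothetical helper: derive a file name from the last path segment,
// ignoring any query string or fragment.
function fileNameFromUrl(url) {
  const path = url.split(/[?#]/)[0];
  return path.substring(path.lastIndexOf('/') + 1);
}

// Sketch: click one temporary anchor per URL, spaced `delayMs` apart.
// `doc` is the DOM document; it is a parameter only so it can be replaced.
function downloadSequentially(urls, delayMs = 500, doc = document) {
  urls.forEach((url, index) => {
    setTimeout(() => {
      const anchor = doc.createElement('a');
      anchor.href = url;
      anchor.download = fileNameFromUrl(url);
      doc.body.appendChild(anchor); // the link must be in the DOM for Firefox
      anchor.click();
      doc.body.removeChild(anchor);
    }, index * delayMs);
  });
}
```

With the array above, the call would be `downloadSequentially(files)`; unlike `downloadAll`, it does not empty the array it is given.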

get data from protocol buffer response into img src

I have code that loads an image via a URL.
The URL now returns a protocol buffer; when I open the link separately in a new tab, it shows the text " ["imagename",[[null,null,"data:image/jpeg;charset;utf-8;base64,#encoded#"]]]".
Since the URL returns a text response, is there a way to get the whole response into a string (10K+ characters) so that I can slice it and put it in the img src?
I just want to put all the code into a single HTML file. Alternatively, is there a way to write the proto schema inside the HTML code and then retrieve data from it? (I have just started programming.)
//Html
<img id="image" Src="#URL">
//javascript
var imgstring=document.getElementById("image");
//when i print this, I get it as "[object HTML ImageElement]"
//if i use the .value it gives the output as "undefined"
Maybe this will help:
var byteArray = new Uint8Array(#Buffer Data#);
var blob = new Blob([byteArray]);
const url = window.URL.createObjectURL(blob);
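The text shown in the question happens to be valid JSON, so one option is to fetch the response as text, parse it, and pick out the data: URI. This is a minimal sketch, assuming the response always has that JSON-like shape. `extractDataUri` and `showImage` are hypothetical names.

```javascript
// Sketch: walk a parsed JSON value and return the first string that
// looks like an image data: URI, or null if there is none.
function extractDataUri(responseText) {
  const stack = [JSON.parse(responseText)];
  while (stack.length > 0) {
    const value = stack.pop();
    if (typeof value === 'string' && value.startsWith('data:image/')) {
      return value;
    }
    if (Array.isArray(value)) {
      stack.push(...value); // descend into nested arrays
    }
  }
  return null;
}

// Hypothetical browser usage: fetch the text and assign it to the <img>.
async function showImage(img, url) {
  const response = await fetch(url);
  const dataUri = extractDataUri(await response.text());
  if (dataUri !== null) {
    img.src = dataUri;
  }
}
```

In the page this might be called as `showImage(document.getElementById("image"), url)`. If the URL is on another origin, the server would also have to allow the request via CORS.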

Chrome blocking PDF views on web redirection to new tab via Top-Frame Navigations

As of Chrome version 60, viewing a PDF through top-frame navigation to a data: URL, for example via
<a href="data:…">
window.open("data:…")
window.location = "data:…"
has been blocked by Google; the discussion can be found on Google Groups. The problem now is how to display the PDF in the browser without forcing it to download. My old code used window.open to view the PDF data:
dataFactory.getPDFData(id, authToken)
.then(function(res) {
window.open("data:application/pdf," + escape(res.data));
},function(error){
//Some Code
}).finally(function(){
//Some Code
});
In the code above I fetch the PDF data from the server and display it. Since window.open with a data: URL is now blocked by Chrome, I followed a suggestion to use an <iframe> to display the PDF data, but it is not working. It always reports "Failed to Load PDF Data".
The updated JS code for the <iframe> looks like this:
dataFactory.getPDFData(id, authToken)
.then(function(res) {
$scope.pdfData = res.data;
},function(error){
//Some Code
}).finally(function(){
//Some Code
});
And the HTML looks as below:
<iframe src="data:application/pdf;base64,pdfData" height="100%" width="100%"></iframe>
How can I proceed and bring back the original PDF view functionality? I searched other Stack Overflow questions but had no luck finding a solution. Maybe I did something wrong or missed something in the iframe code, but it's not working.
After failing to get the desired result, I came up with the approach below to work around the issue.
Instead of opening the PDF in a new page, the PDF file is downloaded automatically as soon as the user clicks the Print button. Below is the source for this.
//File Name
var fileName = "Some File Name Here";
var binaryData = [];
binaryData.push(serverResponse.data); //Normal pdf binary data won't work so needs to push under an array
//To convert the PDF binary data to file so that it gets downloaded
var file = window.URL.createObjectURL(new Blob(binaryData, {type: "application/pdf"}));
var fileURL = document.createElement("a"); //Must be an anchor element for href and download to take effect
fileURL.href = file;
fileURL.download = serverResponse.name || fileName;
document.body.appendChild(fileURL);
fileURL.click();
//To remove the inserted element
window.onfocus = function () {
document.body.removeChild(fileURL)
}
In your old code:
"data:application/pdf," + escape(res.data)
In the new code, your iframe src is "data:application/pdf;base64,pdfData".
Try removing base64 from the src; it seems to be already present in the value of pdfData.
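If the server can return the PDF as a base64 string (an assumption, since the question does not say how res.data is encoded), another way to keep an in-page view is to build a Blob and point the <iframe> at a blob: URL instead of a data: URL. This is a rough sketch, and `base64ToBlob` and `showPdfInFrame` are hypothetical names.

```javascript
// Sketch: decode a base64 string into a Blob of the given MIME type.
function base64ToBlob(base64, type) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type });
}

// Hypothetical browser usage: show the PDF inside an existing <iframe>.
function showPdfInFrame(iframe, base64) {
  const url = URL.createObjectURL(base64ToBlob(base64, 'application/pdf'));
  iframe.src = url;
  return url; // call URL.revokeObjectURL(url) once the viewer is closed
}
```

The returned URL should be revoked only after the frame no longer needs it, for the same reason an image's blob URL must outlive its load.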

Get full URL from a hyperlink using jQuery / JavaScript

How can I retrieve the full URL to which an anchor links using jQuery / JavaScript consistently across all browsers? For example, I want to return http://www.mysite.com/mypage.aspx from <a href="../mypage.aspx"></a>.
I have tried the following:
$(this).attr('href'): The problem is that jQuery returns the exact value of the href (i.e., ../mypage.aspx).
this.getAttribute('href'): This works in Internet Explorer, but in Firefox it behaves the same as above.
What is the alternative? I cannot simply append the current site's path to the href value because that would not work in the above case in which the value of the href escapes the current directory.
You can create an img element and then set the src attribute to the retrieved href value. Then when you retrieve the src attribute it will be fully qualified. Here is an example function that I have used from http://james.padolsey.com/javascript/getting-a-fully-qualified-url/:
function qualifyURL(url){
var img = document.createElement('img');
img.src = url; // set string url
url = img.src; // get qualified url
img.src = null; // no server request
return url;
}
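In browsers that support the standard URL constructor, the same resolution can be sketched without creating an element. `qualifyUrl` is a hypothetical name, and the base URL in the usage note is an assumed example page on the site from the question. The href property of an anchor element (as opposed to its href attribute) also returns the resolved URL in current browsers.

```javascript
// Sketch: resolve `href` against `base` with the URL constructor.
// In a page, `base` would typically be document.baseURI.
function qualifyUrl(href, base) {
  return new URL(href, base).href;
}
```

For example, `qualifyUrl('../mypage.aspx', 'http://www.mysite.com/folder/page.aspx')` resolves to `http://www.mysite.com/mypage.aspx`. In a click handler the call would be `qualifyUrl($(this).attr('href'), document.baseURI)`.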
