Javascript inside of iframe. Scraping with watir

I'm trying to figure out Watir.
Here is the situation: I want to monitor the ads on a few websites, but scraping them is not easy. Each ad sits in an iframe, which loads a second iframe generated with JavaScript, and only then comes the page I actually want.
Here is the code in the main page:
<iframe width="300" height="250" scrolling="no" frameborder="0"
id="adbottomleft" src="/ad/left1" name="adbottomleft"></iframe>
Here is what the iframe contains:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
<style type="text/css">
body {
background-color: black;
margin:0;
padding:0;
}</style>
</head>
<body>
<!-- Rubicon Project Tag -->
<!-- Site: MangaReader Zone: ROS_BTF_LEFT Size: Medium Rectangle -->
<div id="adfooter" style="width:300px;height:250px;"></div>
<script language="JavaScript" type="text/javascript">
function tl(){
var loaded = 0;
try {
loaded = parent.document['adver'];
} catch(e) { loaded = 0; }
if (loaded != 1) {
setTimeout(tl, 25);
} else {
var dest = document.getElementById('adfooter');
var lframe = document.createElement('iframe');
lframe.setAttribute('id','adbleft');
lframe.setAttribute('width','300');
lframe.setAttribute('height','250');
lframe.setAttribute('scrolling','no');
lframe.setAttribute('frameborder', '0');
lframe.setAttribute('src', 'http://ad.mangareader.net/btleft1');
dest.appendChild(lframe);
}
}
(function (){
tl();
}());
</script>
</body>
</html>
That script generates another iframe, which looks like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
<style type="text/css">
* {
margin:0;
padding:0;
}
body {
margin-left: 0px;
margin-top: 0px;
}
</style>
</head>
<body>
<!-- Rubicon Project Tag -->
<!-- Site: MangaReader Zone: ROS_BTF_LEFT Size: Medium Rectangle -->
<script language="JavaScript" type="text/javascript">
var cb = Math.random();
var d = document;
var iframe = "&fr=" + (window != top);
var ref = "";
try {
if (window != top) {
ref = "&rf="+escape(d.referrer);
}
} catch (ignore) { }
d.write("<iframe id='25504.15' name='25504.15' src='' framespacing='0' frameborder='no' scrolling='no' align='middle' width='300' height='250' marginheight='0' marginwidth='0'></iframe>");
d.getElementById('25504.15').src='http://optimized-by.rubiconproject.com/a/8240/13310/25504-15.html?cb='+cb+ref;
</script>
</body>
</html>
Only then comes the final page, which is the one I'm interested in scraping:
<html>
<head>
<meta http-equiv="Pragma" content="no-cache">
<meta http-equiv="expires" content="0">
<style type="text/css"> body {margin:0px; padding:0px;} </style>
<script type="text/javascript">
rubicon_cb = Math.random(); rubicon_rurl = document.referrer; if(top.location==document.location){rubicon_rurl = document.location;} rubicon_rurl = escape(rubicon_rurl);
window.rubicon_ad = "3260765" + "." + "js";
window.rubicon_creative = "3299047" + "." + "js";
</script>
</head>
<body>
<img src="http://assets.rubiconproject.com/campaigns/100/91/16/5/1325630095ap_300.jpg" border="0" alt="AnimePremium.net" /><script defer="defer" type="text/javascript">
{
if (Math.floor(Math.random()*100) < 1)
{
var url;
var iframe = (window != top);
url = "http://tap.rubiconproject.com/stats/iframes?pc=8240/13310&ptc=25504&upn="+iframe;
setTimeout(function(){ new Image().src = url }, 1000);
}
}
</script>
<script>var _comscore = _comscore || []; _comscore.push({ c1: "8", c2: "6135404", c3: "28", c4: "13310", c10: "3299047" }); (function() { var s = document.createElement("script"), el = document.getElementsByTagName("script")[0]; s.async = true; s.src = (document.location.protocol == "https:" ? "https://sb" : "http://b") + ".scorecardresearch.com/beacon.js"; el.parentNode.insertBefore(s, el); })();</script><DIV STYLE="height:0px; width:0px; overflow:hidden"><IFRAME SRC="http://tap2-cdn.rubiconproject.com/partner/scripts/rubicon/emily.html?rtb_ext=1&pc=8240/13310&geo=eu" FRAMEBORDER="0" MARGINWIDTH="0" MARGINHEIGHT="0" SCROLLING="NO" WIDTH="0" HEIGHT="0" style="height:0px; width:0px"></IFRAME></DIV>
</body>
</html>
Is this an impossible task?
Here is what I'm doing:
irb
require "watir-webdriver"
browser = Watir::Browser.new :ff
browser.goto "mangareader.net"
browser.frame(:id, "adbottomleft").html # works
If I try to go one more layer down, I get an error:
irb
require "watir-webdriver"
browser = Watir::Browser.new :ff
browser.goto "mangareader.net"
browser.frame(:id, "adbottomleft").frame(:id, "adleft").html # does not work
Element belongs to a different frame than the current one - switch to it's containing frame to use it.
What should I change in the second snippet so that it reads the next iframe?
I have been searching for days. I started with Selenium, then HtmlUnit with C#, then tried mechanize with Python, but couldn't get the results I wanted.
I keep jumping between tools. I finally thought that I would be able to achieve what I wanted with Watir.
I need some help to get this done. Any tips?

The ID of the frame created by the script is "adbleft", not "adleft". That might be your problem:
browser.frame(:id => "adbottomleft").frame(:id => "adbleft").html
If the id of the final frame is not static, you might have to select it by index:
browser.frame(:id => "adbottomleft").frame(:id => "adbleft").frame(:index => 0)
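The same descent can be pictured from the browser's side. The sketch below is plain JavaScript, not Watir, and resolveFramePath is a hypothetical helper: it follows a list of iframe ids one hop at a time and reports which hop failed, which is how a mistyped id such as "adleft" would surface. It assumes every frame is same-origin and already loaded; for a cross-origin frame, contentDocument is null and the helper throws.

```javascript
// Hypothetical helper: walk nested iframes by id, starting from a document.
function resolveFramePath(rootDocument, frameIds) {
  let doc = rootDocument;
  for (const id of frameIds) {
    const frame = doc.getElementById(id);
    if (!frame) {
      // The id does not exist at this level (for example, a typo).
      throw new Error('No frame with id "' + id + '"');
    }
    doc = frame.contentDocument;
    if (!doc) {
      throw new Error('Frame "' + id + '" is not readable (cross-origin or not loaded yet)');
    }
  }
  return doc;
}

// Browser usage (assumption: all frames are same-origin):
//   const adDoc = resolveFramePath(document, ['adbottomleft', 'adbleft']);
```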

Related

How do I get pixels from Canvas created by js-dosbox

I have worked through the dosbox div to find the canvas, but once I have found the node holding the canvas, how can I reference it?
Getting the context of dbGranChild[0] just results in an error.
I'm trying to build an array of the pixels that make up the dosbox window, so I thought that reading the canvas image data and looping through it as frames change would be one way. If there is a better way altogether than my attempt above, I'm happy to take that as an answer.
Code: http://plnkr.co/edit/MC1n9HfwWcqXlAk95XCO?p=preview
<!doctype html>
<html lang="en-us">
<head>
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>js-dos api</title>
<style type="text/css">
.dosbox-container { width: 640px; height: 400px; }
.dosbox-container > .dosbox-overlay { background: url(https://js-dos.com/cdn/digger.png); }
.dosbox-start { font-size: 35px !important; }
</style>
</head>
<body>
<div id="dosbox"></div>
<br/>
<button onclick="dosbox.requestFullScreen();">Make fullscreen</button>
<script type="text/javascript" src="https://js-dos.com/cdn/js-dos-api.js"></script>
<script type="text/javascript">
var dosbox = new Dosbox({
id: "dosbox",
onload: function (dosbox) {
dosbox.run("https://js-dos.com/cdn/digger.zip", "./DIGGER.COM");
},
onrun: function (dosbox, app) {
console.log("App '" + app + "' is runned");
}
});
var dosboxId = document.getElementById('dosbox');
dbChild = dosboxId.childNodes;
dbGranChild = dbChild[0].childNodes;
console.log(dbGranChild[0])
</script>
</body>
</html>

See the w3Schools tutorial.
First, you need to use a <canvas> tag instead of a <div>.
That is, replace this:
<div id="dosbox"></div>
with something like this:
<canvas id="dosbox" width="200" height="100" style="border:1px solid #000000;">
</canvas>
Second, replace this code:
var dosboxId = document.getElementById('dosbox');
dbChild = dosboxId.childNodes;
dbGranChild = dbChild[0].childNodes;
console.log(dbGranChild[0])
With something like this:
var c = document.getElementById("dosbox");
var ctx = c.getContext("2d");
ctx.moveTo(0, 0);
ctx.lineTo(200, 100);
ctx.stroke();
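For the pixel array that the question asks about, a 2D context's getImageData returns the pixels as one flat run of RGBA bytes. A minimal sketch of regrouping them into rows follows. toPixelRows is a hypothetical name, and the commented usage assumes that `canvas` already refers to the canvas element and that it exposes a 2D context (if the canvas is WebGL-backed, getContext("2d") returns null and this approach does not apply).

```javascript
// Hypothetical helper: regroup flat RGBA bytes into rows of [r, g, b, a] pixels.
function toPixelRows(data, width, height) {
  const rows = [];
  for (let y = 0; y < height; y++) {
    const row = [];
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
      row.push([data[i], data[i + 1], data[i + 2], data[i + 3]]);
    }
    rows.push(row);
  }
  return rows;
}

// Browser usage (assumption: `canvas` is the canvas element with a 2D context):
//   const ctx = canvas.getContext('2d');
//   const image = ctx.getImageData(0, 0, canvas.width, canvas.height);
//   const pixels = toPixelRows(image.data, image.width, image.height);
```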

Inserting js into iframe not working as expected

I am trying to copy jsfiddle's feature on my web page - a user can submit js on a page and it will be executed within an iframe. I ran some test code in jsfiddle and on my page. It works in jsfiddle, but not on my page. Any help is appreciated!
My page renders the css and html, but the js is not executing (the div background should be blue):
Here is the html in the iframe of the fiddle (which works):
<html><head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<meta name="robots" content="noindex, nofollow">
<meta name="googlebot" content="noindex, nofollow">
<script type="text/javascript" src="/js/lib/dummy.js"></script>
<link rel="stylesheet" type="text/css" href="/css/result-light.css">
<style type="text/css">
.test { display:inline-flex; padding:40px; background-color:#cccccc }
</style>
<title></title>
<script type="text/javascript">//<![CDATA[
window.onload=function(){
document.querySelector('.test').style.backgroundColor="rgb(21, 160, 249)";
}//]]>
</script>
</head>
<body>
<div class="test" style="background-color: rgb(21, 160, 249);">test</div>
</body></html>
Here is the output to my web page:
<html><head><style>
.test { display:inline-flex; padding:40px; background-color:#cccccc }
</style><script type="text/javascript">//<![CDATA[
window.onload=function(){
document.querySelector('.test').style.backgroundColor="rgb(21, 160, 249)";
}//]]>
</script></head><body><div class="test">test</div></body></html>
The code that inserts the html, css, and js into the iframe:
$(document).ready(function() {
var codeContainer = document.querySelector('.executedCode'); //submitted code passed as values in attributes to hidden div on the page in node/express environment
var html = codeContainer.getAttribute('html');
var css = codeContainer.getAttribute('css');
var js = codeContainer.getAttribute('js');
var sandbox = $('.sandboxed');
sandbox.ready(function() {
var htmlContainer = document.createElement('div');
var cssContainer = document.createElement('style');
var jsContainer = document.createElement('script');
jsContainer.setAttribute('type', 'text/javascript');
var head = sandbox.contents().find('head');
var body = sandbox.contents().find('body');
$(head).append(cssContainer);
$(head).append(jsContainer);
$(html).append(htmlContainer);
$(cssContainer).append('\n\t'+css+'\n');
$(jsContainer).text('//<![CDATA[\nwindow.onload=function(){\n'+js+'\n}//]]>\n');
body.prepend(html);
});
});
window.onload seems to be the breaking point:
I ran a test by simply inserting an alert. Here's what worked and what didn't.
$(jsContainer).text("alert('hi')"); //--works
$(jsContainer).text('//<![CDATA[\nalert("hi");\n//]]>\n'); //-- works
$(jsContainer).text('window.onload=function(){\nalert("hi");\n}\n'); //-- doesn't work

The script executes as soon as its text is set, on the line $(jsContainer).text(js), but the HTML has not been inserted at that point.
You just need to move the line $(jsContainer).text(js) to after body.prepend(html), and drop the window.onload wrapper.
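One likely reason the window.onload wrapper never fires is that the iframe's load event has already happened by the time the script is inserted, so a handler assigned afterwards is never called. A minimal sketch of guarding against that follows; runWhenLoaded is a hypothetical helper that runs the code immediately if the frame's document is already complete and otherwise waits for the load event.

```javascript
// Hypothetical helper: run `callback` now if the frame has finished loading,
// otherwise once its load event fires.
function runWhenLoaded(frameWindow, callback) {
  if (frameWindow.document.readyState === 'complete') {
    callback(); // load already fired; a newly assigned onload would never run
  } else {
    frameWindow.addEventListener('load', callback);
  }
}

// Browser usage (assumption: `iframe` is the sandbox iframe element):
//   runWhenLoaded(iframe.contentWindow, function () {
//     iframe.contentDocument.querySelector('.test').style.backgroundColor = 'rgb(21, 160, 249)';
//   });
```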

Add Class not working even in document ready

addClass is not working, even inside document ready. I tried both with and without document ready. The class is not added, but the alert works.
var $jbanner = jQuery.noConflict(true);
if ($jbanner(window).width() < 1200) {
alert("Less than 1200");
$jbanner("#wad").addClass("hide");
}
The other version:
var $jbanner = jQuery.noConflict(true);
$jbanner(document).ready(function () {
if ($jbanner(window).width() < 1200) {
alert("Less than 1200");
$jbanner("#wad").addClass("hide");
}
})
With the document ready function, even the alert box does not appear. Is it conflicting with other copies of jQuery on the page?
#wad is an image:
<img id="wad" src="images/banner.jpg"/>
Thanks.
OK, this should be pretty easy, but it is not working for me. I pasted what I am trying to do into a separate HTML file, and it still does not work. Any help is appreciated.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.10.1/jquery.js"></script>
<script type="text/javascript">
var $jbanner = jQuery.noConflict(true);
if ($jbanner(window).width() < 1200) {
alert("Less than 1200");
$jbanner("#wad").addClass("hide");
}
else {
alert("greater than 1200");
$jbanner("#wad").addClass("show");
}
</script>
<style type="text/css">
.hide{display:none;}
.show{display:block;}
</style>
</head>
<body><div>
<div id="wallpaperad">
<a href="http://www.yaho.com" target="_blank" rel="nofollow">
<img src="http://www.placehold.it/160x160" alt="banners" id="wad" /> </a>
</div>
</div>
</body>
</html>
I am just trying to create a floating banner ad that hides on smaller screens.

Try it without jbanner:
$("#wad").addClass("hide");
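In the standalone page from the question, the script sits in the head, so it runs before the #wad image has been parsed and the selector matches nothing at that moment. A sketch under that assumption: keep the width decision in a small function and apply it once the DOM is ready. bannerClass is a hypothetical name, and the 1200 breakpoint is the one from the question.

```javascript
// Hypothetical helper: choose the class name for a given viewport width.
function bannerClass(viewportWidth, breakpoint) {
  return viewportWidth < breakpoint ? 'hide' : 'show';
}

// Browser usage with the question's alias (assumption: jQuery is loaded):
//   var $jbanner = jQuery.noConflict(true);
//   $jbanner(document).ready(function () {
//     $jbanner('#wad').addClass(bannerClass($jbanner(window).width(), 1200));
//   });
```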

RunningPuppy Animated Gif

I am stuck on my homework project. I am supposed to create a JavaScript program that shows an animated puppy running. I must use caching to start the animation as soon as the images finish loading, and one of the following: getElementsByName(), getElementById(), or getElementsByTagName(). Here's what I have so far, but when I run it in Firefox all I see is a flashing box that says "Image of a puppy."
<!DOCTYPE HTML>
<html>
<head>
<title>Running Puppy</title>
<meta http-equiv="content-type" content="text/html;
charset=utf-8" />
<script type="text/javascript">
/* <![CDATA[ */
var puppy = new Array(6);
var curPuppy = 0
for (var imagesLoaded=0; imagesLoaded < 6;
++imagesLoaded) {
puppy[imagesLoaded] = new Image();
puppy[imagesLoaded].src
= "images/puppy" + imagesLoaded + ".gif";
}
function run(){
if (curPuppy == 5)
curPuppy = 0;
else
++curPuppy;
document.getElementById("puppyImage").src = puppy[curPuppy].src;
}
/*]]> */
</script>
</head>
<body onload="setInterval('run()', 150)">
<img src="images/puppy0.gif" id="puppyImage" width="263" height="175" alt="Image of a puppy." />
<script type="text/javascript">
/* <![CDATA[ */
document.write("<h1 id='mainHeading'></h1>");
document.getElementById("mainHeading").innerHTML = document.getElementsByTagName("title")[0].innerHTML;
/*]]> */
</script>
</body>
</html>
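
A sketch of the "start as soon as the images finish loading" requirement: count load events and begin the animation only when the count reaches the number of frames. preloadFrames and nextFrameIndex are hypothetical names, and the image factory is passed in so that the logic can be exercised outside a browser; in a page it would be function () { return new Image(); }. Seeing only the alt text may mean that the image URLs are not resolving, which this sketch does not address.

```javascript
// Hypothetical helper: create one image per URL and call onAllLoaded(frames)
// once every image has reported its load event.
function preloadFrames(urls, createImage, onAllLoaded) {
  const frames = [];
  let loaded = 0;
  urls.forEach(function (url) {
    const img = createImage();
    img.onload = function () {
      loaded += 1;
      if (loaded === urls.length) onAllLoaded(frames);
    };
    frames.push(img);
    img.src = url; // assign src after onload so the handler is already in place
  });
  return frames;
}

// Advance to the next frame, wrapping back to 0 after the last one.
function nextFrameIndex(current, total) {
  return (current + 1) % total;
}

// Browser usage (assumption: images/puppy0.gif to images/puppy5.gif exist):
//   var urls = [0, 1, 2, 3, 4, 5].map(function (n) { return 'images/puppy' + n + '.gif'; });
//   preloadFrames(urls, function () { return new Image(); }, function (frames) {
//     var cur = 0;
//     setInterval(function () {
//       cur = nextFrameIndex(cur, frames.length);
//       document.getElementById('puppyImage').src = frames[cur].src;
//     }, 150);
//   });
```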

Print PDF File in IFrame using javascript getting one page only

Here is my code to print a PDF file. When I print, I get only one page. I need a solution for that.
function printPdf(){
var ifr = document.getElementById("frame1");
//PDF is completely loaded. (.load() wasn't working properly with PDFs)
ifr.onreadystatechange = function () {
if (ifr.readyState == 'complete') {
ifr.contentWindow.focus();
ifr.contentWindow.print();
}
}
}

I suspect that's because the whole window gets printed (which has the current view of the iframe with the 1st page of the PDF rendered). Use <object> instead:
<!DOCTYPE html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge"/>
<script>
function PrintPdf() {
idPrint.disabled = 0;
idPdf.Print();
}
function idPdf_onreadystatechange() {
if (idPdf.readyState === 4)
setTimeout(PrintPdf, 1000);
}
</script>
</head>
<body>
<button id="idPrint" disabled=1 onclick="PrintPdf()">Print</button>
<br>
<object id="idPdf" onreadystatechange="idPdf_onreadystatechange()"
width="300" height="400" type="application/pdf"
data="test.pdf?#view=Fit&scrollbar=0&toolbar=0&navpanes=0">
<span>PDF plugin is not available.</span>
</object>
</body>
This code is verified with IE. Other browsers will still render the PDF, but may not print it.
[UPDATE] If you need dynamic loading and printing, the changes to the above code are minimal:
<!DOCTYPE html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge"/>
<script>
function PrintPdf() {
idPdf.Print();
}
function idPdf_onreadystatechange() {
if (idPdf.readyState === 4)
setTimeout(PrintPdf, 1000);
}
function LoadAndPrint(url)
{
idContainer.innerHTML =
'<object id="idPdf" onreadystatechange="idPdf_onreadystatechange()"'+
'width="300" height="400" type="application/pdf"' +
'data="' + url + '?#view=Fit&scrollbar=0&toolbar=0&navpanes=0">' +
'<span>PDF plugin is not available.</span>'+
'</object>';
}
</script>
</head>
<body>
<button id="idPrint" onclick="LoadAndPrint('http://localhost/example.pdf')">Load and Print</button>
<br>
<div id="idContainer"></div>
</body>

<iframe src="teste.pdf" id="meupdf" width="800" height="600"></iframe>
function printPdf() {
var PDF = document.getElementById("meupdf");
PDF.focus();
PDF.contentWindow.print();
}
