I have some local HTML files and I want to display them with infinite scrolling.
NOTE: I can't change the HTML content, so please don't advise adding JavaScript to the files. I have to do it at run time.
I figured out that I can execute JavaScript at run time via loadUrl("javascript: ....").
I overrode the WebView's onOverScrolled() method to find out when the user reaches the end of the WebView. (This part works reliably, so the problem is not here.)
The problem is that sometimes the new content is attached successfully and other times it is not attached.
In the log I can see that the end-of-page method is triggered, the new HTML body is retrieved, and the JavaScript code is executed, but it has no effect.
Here is my code; maybe something is wrong and I can't see it:
@Override
protected void onOverScrolled(int scrollX, int scrollY, boolean clampedX, boolean clampedY)
{
    super.onOverScrolled(scrollX, scrollY, clampedX, clampedY);
    if (clampedY && reloadFlag) // reloadFlag starts as false; WebViewClient.onPageFinished() sets it to true
    {
        if (!(isVerticalScrollPossible(SCROLL_DOWN)))
        {
            reloadFlag = false;
            currUri = nextResource(currUri); // find the next page
            appendNextPage();
        }
    }
}
private final int SCROLL_DOWN = 1;
private final int SCROLL_UP = -1;
private boolean isVerticalScrollPossible(int direction)
{
final int offset = computeVerticalScrollOffset();
final int range = computeVerticalScrollRange() - computeVerticalScrollExtent();
if (range == 0) return false;
if (direction < 0) {
return offset > 0;
} else {
return offset < range - 1;
}
}
public String getNextPageJS(Uri currPage)
{
String body = getNextPageBody(currPage);
//Log.d("myTAG", body);
String jsResult = "javascript:(function() { document.body.innerHTML += '<div id=\"separator\" style=\"height:10px; margin-top:10px; margin-bottom:10px; background-color:#000000;\"></div>" + body + "';})()";
return jsResult;
}
private void appendNextPage()
{
reloadFlag = false;
Thread appendThread = new Thread(null, doAppend, "backgroundAppend");
appendThread.start();
Log.i("appendNextPage", "get called");
}
public String rs = "";
private Runnable doAppend = new Runnable()
{
@Override
public void run()
{
Log.i("doAppend", "get called + currUri: " + currUri);
rs = getNextPageJS(currUri);
//loadUrl(rs);
appendHandler.sendEmptyMessage(0);
}
};
private Handler appendHandler = new Handler()
{
public void handleMessage(Message msg)
{
loadUrl(rs);
reloadFlag = true;
Log.i("appendHandler", "get called");
}
};
NOTE: Sometimes I get this in the emulator log (not on a real device):
I/chromium(1339): [INFO:CONSOLE(1)] "Uncaught SyntaxError: An invalid or illegal string was specified.", source: http://localhost:1025/OEBPS/Text/Section0042.xhtml (1)
The page number in the message differs from run to run. Maybe it is caused by bad JavaScript code; I don't know.
Hints:
1) I'm not a JavaScript coder, so maybe the JavaScript code is not good.
2) Or maybe calling the JavaScript code several times causes this problem.
3) I know that the JavaScript code must be executed after the page has loaded completely, so maybe the code is called too soon. The difficulty is that onPageFinished() is called only for the first page and is not called when new content is attached via JavaScript. I tried to solve this with a thread, and I think it worked.
UPDATE: I figured out that this code works fine when the HTML body is small, but it doesn't work when I try to attach a large body. Does loadUrl() have a character limit? Any other ideas?
OK, I found the problem, in case anyone wants to know.
The problem is that loadUrl() (at least in my case) cannot load too many HTML tags at once (in the JavaScript code I wrote).
So the solution is easy: load the tags one by one.
Here is the code I used:
public ArrayList<String> getNextPageBody(Uri currAddress)
{
    String html = getHtml(currAddress); // all the HTML of the next file
    // get the body's child elements as an ArrayList, using jsoup
    Document doc = Jsoup.parse(html);
    Elements elements = doc.select("body").first().children();
    ArrayList<String> chunks = new ArrayList<String>();
    for (org.jsoup.nodes.Element el : elements)
    {
        chunks.add(el.toString());
    }
    return chunks;
}
public void loadBodyChunk(ArrayList<String> bodyChunks)
{
//show a separator for each page
bodyChunks.add(0, "javascript:(function() { document.body.innerHTML += '<div id=\"separator\" style=\"height:10px; margin-top:10px; margin-bottom:10px; background-color:#000000;\"></div>';}())");
loadUrl(bodyChunks.get(0));
for(int i = 1; i < bodyChunks.size(); i++)
{
String jsResult = "javascript:(function() { document.body.innerHTML += '" + bodyChunks.get(i) + "';}())";
loadUrl(jsResult);
}
reloadFlag = true;
}
EDIT:
Also:
First, every ' in the string must be replaced with \':
body = body.replace("'", "\\'");
Then all newline characters must be removed:
body = body.replaceAll(System.getProperty("line.separator"), " ");
With that, all the problems were solved.
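The two replacements described in the EDIT can be folded into one helper. The following is only a minimal sketch, not the code the author used: the class JsInjection and its methods escapeForSingleQuotedJs, buildAppendScript and buildAppendScripts are hypothetical names. Beyond the two replacements from the post, it also doubles backslashes and treats carriage returns like newlines; both are assumptions about what the HTML might contain.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the original post. */
final class JsInjection {

    private JsInjection() {}

    /** Escapes text so it is safe inside a single-quoted JavaScript string literal. */
    static String escapeForSingleQuotedJs(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break; // assumption: double literal backslashes
                case '\'': out.append("\\'");  break; // from the post: ' becomes \'
                case '\n':                            // from the post: newlines become spaces
                case '\r': out.append(' ');    break; // assumption: treat CR like LF
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Wraps one HTML chunk in the same javascript: URL shape the post uses. */
    static String buildAppendScript(String htmlChunk) {
        return "javascript:(function() { document.body.innerHTML += '"
                + escapeForSingleQuotedJs(htmlChunk) + "'; })()";
    }

    /** Builds one script per chunk, so that each loadUrl call stays small. */
    static List<String> buildAppendScripts(List<String> htmlChunks) {
        List<String> scripts = new ArrayList<>(htmlChunks.size());
        for (String chunk : htmlChunks) {
            scripts.add(buildAppendScript(chunk));
        }
        return scripts;
    }
}
```

In a method like loadBodyChunk, each element string could then be passed through JsInjection.buildAppendScript(...) before the loadUrl call, instead of concatenating the raw HTML into the script.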
These sequences of actions only work with Thread.Sleep: in some places 1 second, in others 2 seconds. I don't think using Thread.Sleep/Task.Delay is a good idea, because the timing can differ from one computer to another. How do I execute these sequences without using Thread.Sleep?
Or is it OK to use Thread.Sleep/Task.Delay?
private async void ButtonFind_Click(object sender, EventArgs e)
{
//Action1
string jsScript1 = "document.getElementById('story').value=" + '\'' + textFind.Text + '\'';
await chrome.EvaluateScriptAsync(jsScript1);
//Action2
string jsScript2 = "document.querySelector('body > div.wrapper > div.header > div.header44 > div.search_panel > span > form > button').click();";
await chrome.EvaluateScriptAsync(jsScript2);
//Action3
Thread.Sleep(1000); //it is necessary to set exactly 1 seconds
string jsScript3 = "document.getElementsByTagName('a')[2].click();";
await chrome.EvaluateScriptAsync(jsScript3);
//Action4
Thread.Sleep(2000); //it is necessary to set exactly 2 seconds
string jsScript4 = "document.querySelector('#dle-content > div.section > ul > li:nth-child(3)').click();";
await chrome.EvaluateScriptAsync(jsScript4);
}
I tried waiting on the task, but it didn't help:
...
var task4 = chrome.EvaluateScriptAsync(jsScript4);
task4.Wait();
I also tried waiting for the DOM to finish loading, which didn't help either:
string jsScript4 = @"
if( document.readyState !== 'loading' ) {
myInitCode();
} else {
document.addEventListener('DOMContentLoaded', function () {
myInitCode();
});
}
function myInitCode() {
var a = document.querySelector('#dle-content > div.section > ul > li:nth-child(3)').click();
return a;
}
";
chrome.EvaluateScriptAsync(jsScript4);
My addition (21.04.2022)
In the third action, instead of Thread.Sleep I'm using a while loop.
The algorithm is correct, but for some reason the application hangs after I press the button.
bool test = false;
while(test == false)
{
string myScript = @"
(function(){
var x = document.getElementsByTagName('a')[1].outerText;
return x;
})();
";
var task = chrome.EvaluateScriptAsync(myScript);
task.ContinueWith(x =>
{
if (!x.IsFaulted)
{
var response = x.Result;
if (response.Success == true)
{
var final = response.Result;
if (final.ToString() == textFind.Text)
{
MessageBox.Show("You found the link");
test = true;
}
else
{
MessageBox.Show("You do not found the link");
}
}
}
}, TaskScheduler.FromCurrentSynchronizationContext());
}
My addition (23.04.2022)
string jsScript1 = "document.getElementById('story').value=" + '\'' + textFind.Text + '\'' + ";"
+ @"
Promise.resolve()
.then(() => document.querySelector('body > div.wrapper > div.header > div.header44 > div.search_panel > span > form > button').click())
.then(() => { var target = document.body;
const config = {
childList: true,
attributes: true,
characterData: true,
subtree: true,
attributeFilter: ['id'],
attributeOldValue: true,
characterDataOldValue: true
}
const callback = function(mutations)
{
document.addEventListener('DOMContentLoaded', function(){
if(document.getElementsByTagName('a')[1].innerText=='Troy')
{
alert('I got that link');
}
}, true);
};
const observer = new MutationObserver(callback);
observer.observe(target, config)});
";
var task1 = chrome.EvaluateScriptAsPromiseAsync(jsScript1);
task1.Wait();
I wrapped a MutationObserver in a promise and used EvaluateScriptAsPromiseAsync to evaluate it. That didn't help either.
I came to the conclusion that the injected JavaScript does not survive clicking the search button or navigating to another page. How do I keep the JavaScript code/request alive and continue it after clicking the search button or after going to another page?
As your JavaScript causes a navigation you need to wait for the new page to load.
You can use something like the following to wait for the page load.
// create a static class for the extension method
public static Task<LoadUrlAsyncResponse> WaitForLoadAsync(this IWebBrowser browser)
{
var tcs = new TaskCompletionSource<LoadUrlAsyncResponse>(TaskCreationOptions.RunContinuationsAsynchronously);
EventHandler<LoadErrorEventArgs> loadErrorHandler = null;
EventHandler<LoadingStateChangedEventArgs> loadingStateChangeHandler = null;
loadErrorHandler = (sender, args) =>
{
//Actions that trigger a download will raise an aborted error.
//Generally speaking Aborted is safe to ignore
if (args.ErrorCode == CefErrorCode.Aborted)
{
return;
}
//If LoadError was called then we'll remove both our handlers
//as we won't need to capture LoadingStateChanged, we know there
//was an error
browser.LoadError -= loadErrorHandler;
browser.LoadingStateChanged -= loadingStateChangeHandler;
tcs.TrySetResult(new LoadUrlAsyncResponse(args.ErrorCode, -1));
};
loadingStateChangeHandler = (sender, args) =>
{
//Wait for the whole page to finish loading, not just the first frame
if (!args.IsLoading)
{
browser.LoadError -= loadErrorHandler;
browser.LoadingStateChanged -= loadingStateChangeHandler;
var host = args.Browser.GetHost();
var navEntry = host?.GetVisibleNavigationEntry();
int statusCode = navEntry?.HttpStatusCode ?? -1;
//By default 0 is some sort of error, we map that to -1
//so that it's clearer that something failed.
if (statusCode == 0)
{
statusCode = -1;
}
tcs.TrySetResult(new LoadUrlAsyncResponse(statusCode == -1 ? CefErrorCode.Failed : CefErrorCode.None, statusCode));
}
};
browser.LoadingStateChanged += loadingStateChangeHandler;
browser.LoadError += loadErrorHandler;
return tcs.Task;
}
// usage example
private async void ButtonFind_Click(object sender, EventArgs e)
{
//Action1
string jsScript1 = "document.getElementById('story').value=" + '\'' + textFind.Text + '\'';
await chrome.EvaluateScriptAsync(jsScript1);
//Action2
string jsScript2 = "document.querySelector('body > div.wrapper > div.header > div.header44 > div.search_panel > span > form > button').click();";
await Task.WhenAll(chrome.WaitForLoadAsync(),
chrome.EvaluateScriptAsync(jsScript2));
//Action3
string jsScript3 = "document.getElementsByTagName('a')[2].click();";
await Task.WhenAll(chrome.WaitForLoadAsync(),
chrome.EvaluateScriptAsync(jsScript3));
//Action4
string jsScript4 = "document.querySelector('#dle-content > div.section > ul > li:nth-child(3)').click();";
await chrome.EvaluateScriptAsync(jsScript4);
}
You should never rely on sleep, because timing varies between computers and, even on the same computer, a web page may take a different amount of time to load on each visit.
I do a lot of scraping and, IMO, the best approach is to drive everything from the JavaScript side. You inject/run your JavaScript to fill in controls, click buttons...
The problem with this approach is that navigation makes you lose state. When you navigate to another page, your JavaScript starts from scratch. I solve this by sharing the data that has to persist between JavaScript and C# through a Bound Object, and by injecting JavaScript again on each page.
For example, you can run actions 1, 2 and 3 with one piece of JavaScript code. Before clicking the button, you can use your Bound Object to tell your C# code that you are about to go to the second page.
When the second page has loaded, you run the JavaScript for that page (you know which step you are at, so you can inject the JavaScript code for page 2).
In all cases, your JavaScript code must have some mechanism for waiting. For example, set a timer that waits until your controls appear. This way, you can run your JavaScript without waiting for the page to be fully loaded (sometimes those events are hard to manage).
UPDATE
My scraping library is huge. I'm going to show the pieces you need to do the job, but you will have to assemble them yourself.
We create a BoundObject class:
public class BoundObject
{
public BoundObject(IWebBrowser browser)
{
this.Browser = browser;
}
public void OnJavaScriptMessage(string message)
{
this.Browser.OnJavaScriptMessage(message);
}
}
IWebBrowser is the interface of my custom browser, a wrapper that manages everything I need. Create a browser class, for example CustomBrowser, that implements this interface.
Create a method to make sure your Bound Object is registered:
public void SetBoundObject()
{
// To get events in C# from JavaScript
try
{
var boundObject = new BoundObject(this); // BoundObject's only constructor takes the IWebBrowser wrapper
this._browserInternal.JavascriptObjectRepository.Register(
"bound", boundObject, false, BindingOptions.DefaultBinder);
this.BoundObject = boundObject;
}
catch (ArgumentException ex)
{
if (!ex.ParamName.Identical("bound"))
{
throw;
}
}
}
_browserInternal is the CefSharp browser. You must run this method on every page load, each time you navigate. After that, you have a window.bound object on the JavaScript side with an onJavaScriptMessage function. Then you can define a JavaScript function like this:
function sendMessage(msg) {
var json = JSON.stringify(msg);
window.bound.onJavaScriptMessage(json);
return this;
};
You can now send any object to your C# application and handle it in your CustomBrowser, in the OnJavaScriptMessage method. In that method I handle my custom message protocol, like a typical one in a sockets environment or the Windows message system, and raise an OnMessage that I implement in classes inheriting from CustomBrowser.
Sending information to JavaScript is trivial using ExecuteScriptAsync of the CefSharp browser.
Going further
When I work on an intensive scraping job, I create scripts with classes that manage the entire website to be scraped. I create classes, for example, to log in, navigate to different sections, fill in forms... as if I were the owner of the website. Then, when a page loads, I inject my scripts and can use my own classes on the remote website, which makes scraping a piece of cake.
My scripts are embedded resources, so they are part of my final executable. In debug builds I read them from disk, which allows an edit + reload + test cycle until the scripts work. With DevTools you can experiment in the console until you get the code you want. Then you add it to your JavaScript classes and reload.
You can add simple JavaScript with ExecuteScriptAsync, but with large files you run into problems escaping quotes...
So you need to insert an entire script file. To do that, implement ISchemeHandlerFactory to create and return an IResourceHandler. That resource handler must have a ProcessRequestAsync method in which you receive a request.Url that you can use to locate your scripts:
this.ResponseLength = stream.Length;
this.MimeType = GetMimeType(fileExtension);
this.StatusCode = (int)HttpStatusCode.OK;
this.Stream = stream;
callback.Continue();
return true;
stream may be a MemoryStream into which you write the content of your script file.
Hello everyone 😊 and thanks in advance!
I need to get only a part of a loaded website's source code (picture, point 2) by hovering over an element (picture, point 1); if that is not possible, I would also be happy with a mouse click.
I know it may sound weird, because DevTools already does this really nicely with just a click (picture, point 3).
But, if possible, I would like to read out only the inner or outer HTML (whichever I need at the moment) of the part that is active/selected.
What I have so far is:
int counter = 0;
private async void timer1_Tick(object sender, EventArgs e)
{
string returnValue = "";
string script = "(function() { return document.activeElement.outerHTML; })();";
var task = browser.GetMainFrame().EvaluateScriptAsync(script);
await task.ContinueWith(t =>
{
if (!t.IsFaulted)
{
var response = t.Result;
if (response.Success && response.Result != null)
{
returnValue = response.Result.ToString();
}
}
});
if (returnValue != "")
{
richTextBox1.Invoke(new Action(() => richTextBox1.Text = returnValue));
}
else // Just to check if there still happens something:
{
counter += 1;
richTextBox1.Invoke(new Action(() => richTextBox1.Text = counter.ToString() ));
}
}
With this code the problem seems solved 😆. But I wonder if there is a "better" way, without a timer.
The answer, or let's say the better solution, is (thanks to @amaitland) to throw the timer away and instead use the following (in Form_Load or wherever you set everything up):
browser.JavascriptMessageReceived += OnBrowserJavascriptMessageReceived;
browser.FrameLoadEnd += Browser_FrameLoadEnd;
and put my code in:
private void OnBrowserJavascriptMessageReceived(object sender, JavascriptMessageReceivedEventArgs e)
{
// the code
}
And also you need:
async void Browser_FrameLoadEnd(object sender, FrameLoadEndEventArgs e)
{ // Does wait till the Website is fully loaded.
if (e.Frame.IsMain)
{
//In the main frame we inject some javascript that's run on mouseUp
//You can hook any javascript event you like.
browser.ExecuteScriptAsync(@"
document.body.onmouseup = function()
{
//CefSharp.PostMessage can be used to communicate between the browser
//and .Net, in this case we pass a simple string,
//complex objects are supported, passing a reference to Javascript methods
//is also supported.
//See https://github.com/cefsharp/CefSharp/issues/2775#issuecomment-498454221 for details
CefSharp.PostMessage(window.getSelection().toString());
}
");
}
}
So basically what I am trying to do is set up a proxy, using Fiddler's proxy library, to intercept my call to a website and put a script tag in the header that catches JavaScript errors. The script looks like this:
<script>
window.__webdriver_javascript_errors = [];
window.onerror = function(errorMsg, url, line)
{ window.__webdriver_javascript_errors.push(errorMsg + ' (found at ' + url + ', line ' + line + ')'); };
</script>
That all works great, and it catches errors before the page loads. My issue is that when I go to the page, I can't actually get the JavaScript object back from the page.
public static IList<string> GetJavaScriptErrors(IWebDriver driver, TimeSpan timeout)
{
string errorRetrievalScript = "var errorList = window.__webdriver_javascript_errors; window.__webdriver_javascript_errors = []; return errorList;";
DateTime endTime = DateTime.Now.Add(timeout);
List<string> errorList = new List<string>();
IJavaScriptExecutor executor = driver as IJavaScriptExecutor;
List<object> returnedList = executor.ExecuteScript(errorRetrievalScript) as List<object>;
while (returnedList == null && DateTime.Now < endTime)
{
System.Threading.Thread.Sleep(250);
returnedList = executor.ExecuteScript(errorRetrievalScript) as List<object>;
}
if (returnedList == null)
{
return null;
}
else
{
foreach (object returnedError in returnedList)
{
errorList.Add(returnedError.ToString());
}
}
return errorList;
}
Now when I run this, returnedList never receives the value returned by errorRetrievalScript. I can't figure out why I always get null.
The weird thing is that, before I run the JavaScript executor, if I go to Firefox and type in
window.__webdriver_javascript_errors
all the errors show up just fine. The second I run the executor the errors vanish, which is what I want, so that part works! But the call never returns anything.
What am I doing wrong?
EDIT:
The Selenium and browser versions I am using are:
Firefox: 47.0.1
Chrome: 51.0.2704.103
IE: 11.420.10586.0
Selenium: 2.53.1
To preface this - it is a school semester project so if it is a little hacky, I apologize, but I believe it is a fun and interesting concept.
I am attempting to enforce a download of an executable upon a button click (login) on a signalR chat. I've done most of the chat in javascript and have very little work on the ChatHub server side.
I've written the JavaScript so that when a user checks the 'Secure Chat' checkbox, I enforce a download of an executable (which runs some Python forensic scripts):
$("#btnStartChat").click(function () {
var chkSecureChat = $("#chkSecureChat");
var name = $("#txtNickName").val();
var proceedLogin = false;
if (chkSecureChat.is(":checked")) {
proceedLogin = chatHub.server.secureLogin();
isSecureChat = true;
} else {
proceedLogin = true;
}
The chatHub.server.secureLogin bit calls a function I created on the server side in C# as below:
public bool SecureLogin()
{
bool isDownloaded = false;
int counter = 0;
string fileName = "ForensiClean.exe";
string userPath = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
string downloadPath = (userPath + "\\Downloads\\" + fileName);
// try three times
while(isDownloaded == false && counter < 3)
{
if (System.IO.File.Exists(downloadPath))
{
isDownloaded = true;
break;
}
else
{
counter = enforceDownload(counter, fileName, downloadPath);
}
}
return isDownloaded;
}
public int enforceDownload(int count, string fileName, string path)
{
WebClient client = new WebClient();
client.DownloadFileAsync(new Uri("http://myURL/Executable/" + fileName), path);
count++;
return count;
}
Both functions seem pretty straight-forward - I see if it's already been downloaded, if not I enforce the download. It works while in development. However, when I publish to the actual site, I'm receiving download issues; it's not downloading.
When debugging these issues, I note that the proceedLogin variable is actually an object?!?! (as shown in the image). Please help with any ideas, I'm stumped.
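It looks like proceedLogin is a promise object: chatHub.server.secureLogin() is asynchronous and returns a promise rather than the boolean itself, so the result has to be read in a callback. Any code that depends on proceedLogin must also wait for that callback.
Try this: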
if (chkSecureChat.is(":checked")) {
chatHub.server.secureLogin().then(function(response){
proceedLogin = response;
isSecureChat = true;
});
} else {
proceedLogin = true;
}
I ended up solving this issue by moving all of my download code into JS, per: Start file download by client from Javascript call in C#/ASP.NET page? It is, after all, a school project, so I have to get moving on it.
I'm still fuzzy on why my methods above work when run through Visual Studio but not when published to the live site. Thank you @Cerbrus and @SynerCoder for your responses.
I'm using Webdriver through JBehave-Web distribution (3.3.4) to test an application and I'm facing something quite strange:
I'm trying to interact with a modalPanel from Richfaces, which gave me a lot of problems because it throws ElementNotVisibleException. I solved it by using javascript:
This is the code in my page object which extends from org.jbehave.web.selenium.WebDriverPage
protected void changeModalPanelInputText(String elementId, String textToEnter){
makeNonLazy();
JavascriptExecutor je = (JavascriptExecutor) webDriver();
String script ="document.getElementById('" + elementId + "').value = '" + textToEnter + "';";
je.executeScript(script);
}
The strange behaviour is that if I run the test normally, it does nothing, but if I put a breakpoint on the last line (in Eclipse), select the line and execute it from Eclipse (Ctrl + U), I can see the change in the browser.
I checked the JavascriptExecutor and the WebDriver classes to see if there was any kind of buffering, but I couldn't find anything. Any ideas?
EDIT
I found out that putting the thread to sleep for 1 second makes it work, so it looks like some kind of race condition, but I can't find out why...
This is the way it "works", but I'm not very happy about it:
protected void changeModalPanelInputText(String elementId, String textToEnter){
String script ="document.getElementById('" + elementId + "').value = '" + textToEnter + "';";
executeJavascript(script);
}
private void executeJavascript(String script){
makeNonLazy();
JavascriptExecutor je = (JavascriptExecutor) webDriver();
try {
Thread.sleep(1500);
} catch (InterruptedException e) {
e.printStackTrace();
}
je.executeScript(script);
}
Putting the wait in any other position doesn't work either...
First idea:
Ensure that the target element exists and can be found. See if this returns null:
Object objValue = je.executeScript(
"return document.getElementById('"+elementId+"');");
Since you're using makeNonLazy(), you could probably just add the target as a WebElement member of your Page Object (assuming a Page Factory style of initialization in JBehave).
Second idea:
Explicitly wait for the element to be available before mutating:
/**
* re-usable utility class
*/
public static class ElementAvailable implements Predicate<WebDriver> {
    // getElementById returns null (never undefined) when the element is missing,
    // so test against null; a typeof check against 'undefined' would always pass
    private static String IS_PRESENT =
        "return (document.getElementById('%s') !== null);";
    private final String elementId;
    private ElementAvailable(String elementId) {
        this.elementId = elementId;
    }
    @Override
    public boolean apply(WebDriver driver) {
        Object objValue = ((JavascriptExecutor)driver).executeScript(
            String.format(IS_PRESENT, elementId));
        return (objValue instanceof Boolean && ((Boolean)objValue));
    }
}
...
protected void changeModalPanelInputText(String elementId, String textToEnter){
makeNonLazy();
// wait at most 3 seconds before throwing an unchecked Exception
long timeout = 3;
(new WebDriverWait(webDriver(), timeout))
.until(new ElementAvailable(elementId));
// element definitely available now
String script = String.format(
"document.getElementById('%s').value = '%s';",
elementId,
textToEnter);
((JavascriptExecutor) webDriver()).executeScript(script);
}
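To make the idea of an explicit wait concrete without any Selenium dependency, here is a minimal sketch of the polling loop that such a wait boils down to. The class name Polling and the method waitUntil are hypothetical and are not part of WebDriver or JBehave; inside a real test, WebDriverWait remains the natural tool.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of an explicit wait; not part of WebDriver or JBehave. */
final class Polling {

    private Polling() {}

    /**
     * Re-checks the condition every pollMillis until it holds or timeoutMillis has elapsed.
     * Returns true as soon as the condition holds, and false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;                         // condition met, stop waiting immediately
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;                        // timed out without the condition holding
            }
            try {
                Thread.sleep(pollMillis);            // short pause instead of one fixed long sleep
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

Unlike a fixed Thread.sleep(1500), this returns as soon as the condition holds and only spends the full timeout when the element never shows up. In the page object, the condition would be a lambda that runs the element-presence script through the JavascriptExecutor and checks that the result is Boolean.TRUE.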