WebBrowser source outdated - javascript

I am trying to write an automation tool for a browser game that takes some data from the web page. In this case the data appears to be added using JS after the page has loaded, which I assume is where my issue is.
I'm trying to grab the text that the JS adds and save it to a variable, but when I try to find it using the WebBrowser component's DOM controls, it cannot find the text I need. The text IS there: you can see it in the browser window, and I can see it in the source when using Chrome's dev console/inspect element tool. However, when I target it using the DOM controls, VS makes it clear that it can't find it. I am 100% certain that I am targeting it correctly and not pointing it in the wrong direction.
Is there a way for the WebBrowser to refresh/re-read the source without refreshing the page?
Otherwise, how would you go about working around this?
Cheers,
Tom.

One workaround that springs to mind when pulling text from something is to use MS Office Document Imaging. If the text always appears in a specific location on the page, it should just be a question of taking a print screen of where the text appears and then running it through OCR. The advantage of this is that it's pretty future-proof: the game's makers could change the method by which they display the text, but as long as it's displayed you should be able to print screen it. :)

Related

Modify Contents of Editable Div on Twitter Post

I'm writing a chrome extension which helps the user type things on twitter. When writing a tweet on twitter, twitter opens an editable div container. When the user types into it, twitter (which is using some web-framework presumably) generates sub-divs and spans with the text the user types and places them within the content-editable div.
The thing is, when one manually changes the span value (for instance, through inspect element) and then types something again, the value in the span just reverts to what it was before the edit. This is probably because the string that was actually typed is stored somewhere in JavaScript, and everything gets overwritten again when the user types into the div.
I've been trying to find a way around this using jQuery, but with no success. I don't really know how to start. If it were just a regular input tag, you could call something like $("input").val("new value"), easy-peasy... but I don't know how one could go about doing that for an editable div that gets updated by JavaScript running somewhere on the page.
For a while, I just thought it would be impossible...
BUT NOW I do know it is possible. If you download the Grammarly extension and use the Grammarly popup-editor (which opens a new window to edit text), then submit that, the twitter editable-content div updates appropriately and everything works like magic.
Sorry if this isn't a standard programming question, but I couldn't find anything on the web that comes close to what I'm trying to do. Maybe I'm just not experienced enough and am missing something really obvious. I tried looking at the twitter and Grammarly source code but it's all minified garbled javascript that I can't read...
Thanks for any help and insight!
EDIT: the twitter url in question is: https://twitter.com/compose/tweet The div in question is the one with contenteditable="true" attribute (you can search it in the inspector)

Get Image URL/Address from webpage selection using Applescript/Javascript

I need a quick way to get the image URL, just like I would get if I right click on an image and select "Copy Image URL". I'm thinking Applescript, though others have mentioned Javascript.
This needs to be compatible with an Automator workflow and needs to work with Google Chrome, Chromium, and Safari, at a minimum.
More specifics:
1. I already have an Automator workflow that this will be added to.
2. The workflow begins with text and images that I have selected on a webpage using the mouse.
3. The processing of the text is working fine.
4. I just need an AppleScript, JavaScript, or shell script (which I assume are the only outside code that can be added to an Automator workflow) that will grab any and all image URLs within the part of the page selected in step 2.
5. Images are NOT downloaded. Only the image URL is needed.
The basic logic is this:
Does selected input contain images?
  If yes,
    get URL of image(s)
    pass to the next step
  else continue
Any help or ideas appreciated!
OS X Services would be your best bet. Those work with text selections and are supported in most apps (e.g. see the Safari>Services submenu). You can also assign them keyboard shortcuts, which is very handy for repetitive tasks.
Basically, you want to get the selection as web content (i.e. HTML data, not plain text) then extract the URLs from that. You can create services in Automator, which includes various actions for working with web content, so I recommend starting there.
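As a rough sketch of the "extract the URLs from the HTML" step, the function below pulls the src value out of every img tag in an HTML fragment. The name extractImageUrls and the sample fragment are made up for illustration, and the regular expressions are deliberately simple: they assume quoted src attributes and will miss unusual markup, so treat this as a starting point rather than a full HTML parser.

```javascript
// Hypothetical helper: extractImageUrls is not part of Automator or any library.
// It scans an HTML fragment for <img> tags and returns their src values in order.
function extractImageUrls(html) {
  const urls = [];
  // Step 1: find each <img ...> tag. Step 2: read its quoted src attribute.
  const imgTag = /<img\b[^>]*>/gi;
  const srcAttr = /\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)')/i;
  for (const tag of html.match(imgTag) || []) {
    const m = srcAttr.exec(tag);
    if (m) urls.push(m[1] !== undefined ? m[1] : m[2]);
  }
  return urls;
}

// Sample selection (assumed input): text plus two images.
const selection =
  '<p>Some text <img src="https://example.com/a.png" alt="A"> and ' +
  "<img alt=\"B\" src='/b.jpg'></p>";
console.log(extractImageUrls(selection));
// → [ 'https://example.com/a.png', '/b.jpg' ]
```

An empty result means the selection contained no images, which corresponds to the "else continue" branch of the logic in the question. Relative URLs such as /b.jpg come back as written and would still need to be resolved against the page address.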

How to remove hidden divs from view source of a page?

I have an HTML page in which there are some hidden DIVs, and these DIVs are visible via "view source" of the page. These DIVs should not be visible to a user when they "view source" of the page.
How can this be done? Perhaps with JavaScript or some other solution?
You can't really prevent a div from being read, because if you do, it cannot be rendered.
It can be encrypted and generated via JavaScript. But once it is generated, the user will be able to see it clearly in the computed source.
There is no way of doing what you want. The source (in case of HTML) is just text containing HTML markup. The show source view in the browser shows it to you as it came from the server with added syntax highlighting, but unlike the developer tools, it doesn't reflect any DOM changes done with Javascript. Even if some browser had a feature to prevent some parts of the source from being displayed, users will still be able to open it in another browser or download the HTML as a file and examine the source in a text editor.
JavaScript will only change the "computed source" so the client will still be able to see them. In order to really remove them you'll need to remove them server side.
You cannot really hide the source code, but you can encrypt it. Whatever you transmit from the server to the client ends up in the client-side browser and can be seen somehow.
With a tool like the one I just googled, http://www.iwebtool.com/html_encrypter, it is possible to encrypt HTML.
It will encrypt your HTML code, and you can insert it via JavaScript later. Encryption will not ultimately hide it from someone keen on using debugging tools, but a "normal" user won't see it directly in the source.
Still you should be thinking about storing information you want to hide from the user server-side in a session or something.
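To make the trade-off concrete, here is a minimal sketch of the encode-then-insert idea. It uses base64 as a stand-in because it needs no library; base64 is an encoding, not real encryption, and the names obfuscate and reveal are hypothetical. It is not how the linked tool works internally, which the thread does not describe.

```javascript
// Hypothetical helpers. Base64 only obscures the markup; anyone can decode it.
function obfuscate(html) {
  return btoa(html); // would run on the server or at build time
}
function reveal(encoded) {
  return atob(encoded); // would run in the page before inserting the markup
}

const hidden = '<div id="note" style="display:none">internal note</div>';
const shipped = obfuscate(hidden);

// The raw source now carries only the encoded string, with no readable tags...
console.log(shipped.includes("<div")); // false
// ...but the original markup is fully recoverable on the client.
console.log(reveal(shipped) === hidden); // true
```

The second check is the point the answers above make: once the page decodes and inserts the markup, it shows up in the developer tools, so anything that must stay private belongs on the server.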

altering displayed web page

I am attempting to write a Firefox addon that will analyze the displayed page and change some of the displayed text into hyperlinks (according to some algorithm).
I am trying to figure out how I can parse the HTML document tree to retrieve the text in order to make it a link.
So I need not only the text but also its position in the document.
For example, if I had some kind of parser that gave me only the text nodes, I could then replace their content.
Is there such a thing at all?
You can insert javascript into every page so you have everything that javascript can do. A good place to start learning about Firefox addon development is the MDN https://developer.mozilla.org/en/Building_an_Extension
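The "give me only the text nodes" idea from the question can be sketched with a plain recursive walk over childNodes. This is an illustrative sketch, not an addon API: collectTextNodes is a hypothetical helper, and the choice to skip SCRIPT, STYLE, and existing A elements is an assumption about what a linkifier would want.

```javascript
// Node type values as defined by the DOM (Node.ELEMENT_NODE, Node.TEXT_NODE).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: walks the tree under `root` depth-first and returns
// every text node with non-whitespace content, in document order.
function collectTextNodes(root, found = []) {
  for (const child of Array.from(root.childNodes || [])) {
    if (child.nodeType === TEXT_NODE) {
      if (child.nodeValue.trim() !== "") found.push(child);
    } else if (child.nodeType === ELEMENT_NODE) {
      // Skip script/style contents and text that is already inside a link.
      const tag = (child.nodeName || "").toUpperCase();
      if (tag !== "SCRIPT" && tag !== "STYLE" && tag !== "A") {
        collectTextNodes(child, found);
      }
    }
  }
  return found;
}

// Stand-in tree built from plain objects so the sketch runs outside a browser;
// inside a page you would pass document.body instead.
const fakeBody = {
  nodeType: ELEMENT_NODE, nodeName: "BODY", childNodes: [
    { nodeType: TEXT_NODE, nodeValue: "Hello " },
    { nodeType: ELEMENT_NODE, nodeName: "B", childNodes: [
      { nodeType: TEXT_NODE, nodeValue: "world" },
    ] },
    { nodeType: ELEMENT_NODE, nodeName: "SCRIPT", childNodes: [
      { nodeType: TEXT_NODE, nodeValue: "var x = 1;" },
    ] },
  ],
};
console.log(collectTextNodes(fakeBody).map((n) => n.nodeValue));
// → [ 'Hello ', 'world' ]
```

Because the function returns the nodes themselves rather than copies of their text, each result keeps its position in the document, so it can be replaced or split in place. In a real page, document.createTreeWalker with NodeFilter.SHOW_TEXT is the built-in way to do the same traversal.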

finding text in ajax returns on Javascript console

My JavaScript framework uses Ajax to dynamically change certain parts of my page. When I use a JavaScript console like Firebug or the one that comes with Chrome and try to find some tags, it seems that the dynamically altered HTML parts are not searched. I then have to hunt them down manually, which is a daunting task at times since the framework generates tons of HTML.
The only info I can find about this concerns finding tags programmatically by traversing the DOM, but that is not what I'm looking for. I need my debugger to be able to find those tags when I am examining the code at runtime.
Is there a way around this in any browser?
I've created a simple example to demonstrate here
If you open it with Chrome, start the JavaScript console before clicking on the button, and search for the word "tag", you will find 1 match in the original HTML.
Next, click the button. You will see the change. Now search again for the same word "tag". It will not be found. However, if you search for "ta", it will be found. It looks like the search results are buffered somehow and not cleared once the page changes.
Firebug doesn't seem to update the page at all.
I found out that if you start Chrome's JavaScript console after the Ajax refresh, the text can be found. However, if the Ajax refresh happens when the JavaScript console is already open, searching within the refreshed Ajax content is limited. I still can't figure out when it does and doesn't work.
Firebug > HTML Panel > mini-menu > Expand Changes
Then your search will work.
Make sure you are appending the Ajax content to the document, at least to some hidden div. It would be useful if you provided an example.
You can use jQuery expressions in the console if you want to find something in particular. Something like this: $('#myid'). Of course, you can search by more than just id using jQuery.
