How to extract information from web page

I'm looking for a way to automatically extract information from a web page, more specifically an online game (https://www.virtualregatta.com/fr/offshore-jeu/).
In the game, I want to extract/copy the position of the boat. Using Firefox's developer tools, I opened the network monitor and saw an HTTP POST request containing what I want.
The response appears to be JSON containing a structure with latitude/longitude.
This is perfect for me, but I want a more user-friendly way to get it, and I would appreciate some advice. The problem is that I'm really a beginner in web development, haha.
Is it possible to do this using a script? (I suppose it will be complicated to log into the game first.)
Is it possible to create a basic Firefox extension that could catch the request/response and copy the position to the clipboard for me?
Anything else?
EDIT:
I tried writing a Firefox extension and managed to add a listener on the POST request. I can see the request that fetches the boat information, but I can't find a way to get the JSON response in JS.
function logURL(responseDetails) {
  console.log(responseDetails);
}
browser.webRequest.onResponseStarted.addListener(
  logURL,
  {urls: ["*://*.virtualregatta.com/getboatinfos"]}
);
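One possible way to read the response body in Firefox is browser.webRequest.filterResponseData, which streams the response bytes to the extension. The sketch below assumes the extension's manifest declares the "webRequest" and "webRequestBlocking" permissions plus a host permission for the game's domain, and that the body is UTF-8 JSON. The helper name createJsonCollector is made up for this example, and the shape of the JSON is not assumed; the parsed object is only logged.

```javascript
// Hypothetical helper: accumulates streamed chunks and parses them as one JSON value.
function createJsonCollector() {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  return {
    push(chunk) {
      // stream: true keeps multi-byte characters intact across chunk boundaries
      text += decoder.decode(chunk, { stream: true });
    },
    finish() {
      text += decoder.decode(); // flush any buffered bytes
      return JSON.parse(text);
    },
  };
}

// Attaches a stream filter to the request and logs the parsed body once complete.
function captureResponse(details) {
  const filter = browser.webRequest.filterResponseData(details.requestId);
  const collector = createJsonCollector();
  filter.ondata = (event) => {
    collector.push(event.data);
    filter.write(event.data); // pass the bytes through to the page unchanged
  };
  filter.onstop = () => {
    try {
      console.log(collector.finish());
    } catch (error) {
      console.error("Response was not valid JSON", error);
    }
    filter.close(); // the page never sees the end of the response without this
  };
}

// Only register the listener when running inside a WebExtension.
if (typeof browser !== "undefined") {
  browser.webRequest.onBeforeRequest.addListener(
    captureResponse,
    { urls: ["*://*.virtualregatta.com/getboatinfos"] },
    ["blocking"]
  );
}
```

The filter has to be created from a blocking listener such as onBeforeRequest rather than onResponseStarted, since by the time the response has started it is too late to intercept the stream.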

In Chrome I use Broomo for this purpose. It lets you add scripts to web pages, so you can console.log the POST you found, and of course you can create functions and use the web page's backend.
In Firefox I found this one: js-injector. I haven't used it, though.
Update:
There is now a new extension for both browsers:
Chrome: ABC JS-CSS Injector
Firefox: ABC JS-CSS Injector

Related

Twitter feed not displaying in Chrome

I have browsed the web trying to find a solution to this problem. Many people have suggested disabling the Avast plugin or the ad blocker in Chrome's extensions, yet none of these worked.
The URL is https://careers.telstra.com/. Halfway down, next to the Facebook feed, the Twitter feed is empty when using Chrome; when I view the page in IE or Firefox, it displays as I would expect.
I've checked the console log in Firefox and I receive no errors. In Chrome, on the other hand, I see the following:
I personally do not think these are related in any way, but I thought I would provide as much information as possible to try to get this fixed.
Update: It turns out the errors are related to google-cast-sdk; instead of silently discarding the errors, they have decided to dump them straight into the console. Read more about it here
I've checked and made sure I'm referencing the correct Twitter widget.
We build it as follows and pass it to the page:
sb.Append("<div class=\"twitterWidget\"><a class=\"twitter-timeline\" href=\"//twitter.com/telstracareers\" data-widget-id=\"345026269295038465\" data-chrome=\"nofooter noscrollbar transparent\" data-tweet-limit=\"3\">Tweets by #telstracareers</a></div>");
The website runs under HTTPS. I have tried the following:
href=\"https://twitter.com/telstracareers\"
href=\"//twitter.com/telstracareers\"
Still no luck. I'm not sure what else I could try. Any suggestions?
Thanks

Python Selenium get javascript document

I have a webpage that contains some information that I am interested in. However, that information is generated by JavaScript.
If you do something like the following:
from selenium import webdriver

browser = webdriver.Chrome()
browser.set_window_size(1000, 1000)
browser.get('https://www.xxx.com') # cannot make the web public, sorry
print(browser.page_source)
It only prints out a few JavaScript functions and some headers, which don't contain the information that I want (Description of Suppliers, etc.). So, when I try to collect that information using Selenium, browser.find_element_by_class_name does not find the element I want either.
I tried the code below, assuming it would have the same effect as typing document in the JavaScript console, but obviously not.
result = browser.execute_script("document")
print(result)
and it returns NULL...
However, if I open the page in Chrome, right-click the element and choose Inspect Element, I can see the populated source code. See the attached picture.
Also, I was inspired by this comment, which helped a lot.
I could open up the javascript console in Chrome, and if I type in
document
I can see the complete HTML sitting there, which is exactly what I want. Is there a way to store the JS-populated source code using Selenium?
I've read some posts saying that it requires some security work to store the populated document on the client side.
I hope I have made myself clear, and I appreciate any suggestion or correction.
(Note: I have zero experience with JS, so a detailed explanation would be greatly appreciated!)
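A likely reason for the NULL result: WebDriver runs the string passed to execute_script as the body of an anonymous function, so an expression without a return statement yields nothing. The sketch below imitates that behaviour in plain JavaScript. fakeDocument and runAsFunctionBody are stand-ins invented for this illustration, not part of Selenium.

```javascript
// Hypothetical stand-in for the page's document object.
const fakeDocument = {
  documentElement: { outerHTML: "<html><body>populated</body></html>" },
};

// Runs a snippet the way a WebDriver "execute script" call does: as a function body.
function runAsFunctionBody(snippet) {
  return new Function("document", snippet)(fakeDocument);
}

// A bare expression is evaluated and discarded, so the function returns undefined.
const withoutReturn = runAsFunctionBody("document");

// An explicit return hands the value back to the caller.
const withReturn = runAsFunctionBody("return document.documentElement.outerHTML");

console.log(withoutReturn); // undefined
console.log(withReturn);
```

Under the same reasoning, browser.execute_script("return document.documentElement.outerHTML") in the Python code should hand back the populated markup as a string, provided the page's scripts have finished running by then.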

Encryption of the link address <a href="http://www.mycompany.com"> so it does not appear in source view on IE toolbar

This is probably a simple question, but I can't seem to find what I am looking for on the web, so here it goes. I have a link on my company INTRAnet site, and senior management does not want the employees to see its actual web address (via the Source option on the View tab of IE).
Please let me know how I can do this in HTML, ASP.NET or JS.
Thanks!
:)

You can't. Tell senior management to quit being so secretive.

Not sure if this is what you want, but here is a similar question:
php encrypt and decrypt
Does it help at all? There is another option, but it is PHP code:
http://php.net/manual/en/function.mcrypt-encrypt.php
Also, in what language are you looking to implement the code?
Alternatively, you can use this site: http://www.iwebtool.com/html_encrypter. In the box, type your HTML, e.g.
This is your post link
Then use the "Encrypt" button. It will return the JavaScript you are looking for.
E.g.
<Script Language='Javascript'>
document.write(unescape('%3C%61%20%68%72%65%66%3D%22%68%74%74%70%3A%2F%2F%73%74%61%63%6B%6F%76%65%72%66%6C%6F%77%2E%63%6F%6D%2F%70%6F%73%74%73%2F%31%35%39%33%34%36%39%36%22%3E%54%68%69%73%20%69%73%20%79%6F%75%72%20%70%6F%73%74%3C%2F%61%3E'));
</Script>
No jsFiddle because that javascript isn't allowed.
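Keep in mind that this kind of output is percent-encoding, not encryption, so anyone can reverse it with a single built-in call. The sketch below shows the round trip. toPercentHex is a helper made up here to imitate the site's output (ASCII input assumed), and the link text is only an example.

```javascript
// Hypothetical helper: encodes every character as %XX (ASCII input assumed).
function toPercentHex(text) {
  return Array.from(text, (ch) =>
    "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const link = '<a href="http://www.mycompany.com">Go</a>';
const encoded = toPercentHex(link);

// One standard call is enough to recover the original markup.
const decoded = decodeURIComponent(encoded);

console.log(encoded);
console.log(decoded === link); // true
```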

First and foremost, it's impossible to hide the url from the browser. The browser has to request the webpage from the server, and even if the url was obscured somehow, it would have to be plaintext in the HTTP Request, which would open it up to a man-in-the-middle utility like Fiddler.
Second, this feels like security through obscurity. Resources that certain people shouldn't have access to should be locked down explicitly, not just hidden because the user doesn't know the url (yet).
However, purely as a thought exercise... I suppose... you could write a handler that knows the real URL, uses code to retrieve the content of the page, and then writes that to the response. So the users would see the handler URL, but not where the handler is pulling its data from. However, you'd then have to go to great lengths to find all links and resources on the page and convert those references to also go through your handler.
Of course, practically speaking, I think this concept is silly. There's some problem your senior management is trying to solve, and hiding the url from the user is not the answer.

If upper management is this secretive, then it's a safe bet that you also have IT people who have browsers locked down, meaning Internet Explorer. Your IT team might be able to force the address bar to be hidden for all browsers within your company. I don't think this can be done on a per-request basis, meaning that the address bar would be either on or off all the time.
According to this post, your IT team might be able to update the registry to hide the address bar like so:
Run the following RegKey:
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Internet Explorer\ToolBars]
[HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Internet Explorer\ToolBars\Restrictions]
"NoNavBar"=dword:00000001
Here's a google search that might also offer additional information.

Well, rather than making it disappear, you can make it hard for others to read, and even impossible for those who have no knowledge of base-64. Here is some code:
var a = document.querySelectorAll("*"), b = 0;
for (b = 0; b < a.length; b++) {
  if (a[b].hasAttribute("data-href")) {
    a[b].href = atob(a[b].getAttribute("data-href"));
  }
}
Now you can write something like this:
<a data-href="aHR0cDovL3d3dy5teWNvbXBhbnkuY29t">Go</a>
Using btoa(), I converted "http://www.mycompany.com" to "aHR0cDovL3d3dy5teWNvbXBhbnkuY29t" in base-64, and the script above decodes each "data-href" attribute. After it runs, the link will look and act like:
<a href="http://www.mycompany.com">Go</a>
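For reference, the base-64 conversion works in both directions with the built-in btoa() and atob() functions, which also means anyone who opens the console can recover the address. A minimal sketch, assuming an environment where both functions are available as globals (browsers, or a recent Node.js):

```javascript
const url = "http://www.mycompany.com";

// Encode the address to base-64, as done for the data-href attribute.
const encoded = btoa(url);

// Decode it again; this is what the loop over data-href attributes does.
const decoded = atob(encoded);

console.log(encoded); // aHR0cDovL3d3dy5teWNvbXBhbnkuY29t
console.log(decoded); // http://www.mycompany.com
```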

flashfirebug getting data from actionscript 3 console

I needed to capture data (text data) from Flash in a web page.
The data is always changing (weather data), and it should be exported to a text file so I can manipulate it.
My first approach was to use a web sniffer such as Fiddler or Wireshark.
I couldn't get the data from either because it is embedded in Flash.
I used Fiddler as a man-in-the-middle, with Wireshark deciphering the data (with the private key from the site certificate), but it didn't work.
After that I tried FlashFirebug Pro (the Pro version allows running AS3 commands in the console). This add-on loads the DOM tree and refreshes it. After selecting the desired element on the page with the inspector (the left panel shows the instance and its position in the DOM), I have access to the instance properties (the only one I need is "html-text" in the right panel).
My problem with this last approach is that it cannot communicate with the local file system (if I run "trace(this.text);" in the console, it shows the text value, but only in the console). The only way I could think of to get data to a file on the hard drive was to throw an error into the log file, but I couldn't do that either.
Does anyone have any idea how to do this with FlashFirebug, or some other approach?
Regards,

If you want to work with the local filesystem, use Adobe AIR.
If you can't, try to work around the browser's sandbox with JavaScript as a bridge to some browser plugin/add-on that gives you access to local processes and the filesystem. To call JavaScript from Flash, the ExternalInterface class is your friend.

How do you use the Facebook Graph API in a Google Chrome Extension?

I have been trying to access the information available through https://graph.facebook.com/id as JSON, but I have been unable to retrieve any information using the different snippets of code I've found. I'm not sure whether I'm using the JSON function correctly.
For example,
var testlink = "https://graph.facebook.com/"+id+"/&callback=?";
$.getJSON(testlink, function(json) {
  var test;
  $.each(json.data, function(i, fb) {
    test = "<ul>" + json.name + "</ul>";
  });
});
In this code, I am trying to return in the test variable the name. When I use this in a Google Chrome Extension, it just returns a blank page.
Alternatively, I've been also trying to use the Facebook Javascript SDK in my Google Chrome extension, but I am unsure what website I should be using when signing up for an API Key.

I believe that you need to either establish an OAuth session or provide your API key before you can talk to FB. It's been a while since I messed around with the FB API, but I'm pretty sure you have to register before you can use it.
Here's something that might be useful though, it's a javascript console for Facebook which would allow you to test out your code! http://developers.facebook.com/tools/console/

It's an issue with chrome, but I haven't figured out the exact problem. For example open the chrome inspector and type this into it:
$.getJSON("http://graph.facebook.com/prettyklicks/feed?limit=15&callback=?", function(json){console.log(json);});
Then do the same thing in Firefox. Chrome will return nothing, but FF will give you the JSON object. It's not the JSON call itself because if you call something else for instance:
$.getJSON("http://api.flickr.com/services/feeds/photos_public.gne?tags=cat&tagmode=any&format=json&jsoncallback=?", function(data) {console.log(data);});
It will come through normally. So there is some miscommunication between chrome and fb. The really confusing part is that if you browse directly to the graph entry by just pasting it into your address bar it will come through normally that way too.
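One possible explanation for the extension case: jQuery treats callback=? as a JSONP request, which works by loading a remote script, and the default content security policy of extension pages generally blocks remote scripts. A plain request avoids JSONP altogether. The sketch below is only an illustration under assumptions: the extension's manifest would need a host permission for graph.facebook.com, the Graph API generally expects an access token, and "YOUR_ACCESS_TOKEN" as well as the function names are placeholders invented here.

```javascript
// Builds a Graph API URL for an object id; the token value is a placeholder.
function buildGraphUrl(id, accessToken) {
  const url = new URL("https://graph.facebook.com/" + encodeURIComponent(id));
  url.searchParams.set("access_token", accessToken);
  return url.toString();
}

// Fetches the object as JSON with a plain request instead of JSONP.
async function fetchGraphObject(id, accessToken) {
  const response = await fetch(buildGraphUrl(id, accessToken));
  if (!response.ok) {
    throw new Error("Graph request failed with status " + response.status);
  }
  return response.json();
}

const exampleUrl = buildGraphUrl("prettyklicks", "YOUR_ACCESS_TOKEN");
console.log(exampleUrl);
```

With a real token, fetchGraphObject(id, token).then(function (json) { console.log(json.name); }) would be the rough equivalent of the $.getJSON call in the question.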
