I know this question sounds really suspicious, but I'm in a weird situation: I've taken over a project, while the original vendor that built the website retains ownership of the server and the server-side code.
I've successfully scripted a lot of my interactions with the website through Selenium, but one of the activities I need to script is submitting files through a Silverlight form. From what I understand, you can't interact with Silverlight directly from Python or Selenium, but looking at the page source it appears the Silverlight control is only used to pick a file location and display a loading bar, and the actual form submission is done via a POST to an ASP server-side script. However, that form has several hidden validation fields.
So I'm kind of envisioning a path where I navigate to this page with Selenium, then parse that page for the validation values, and then submit the form with those values and the data that I want to upload.
Is this approach viable? Where can I find information on something like this and the difficulties involved?
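Here's a rough sketch of the flow I'm picturing, just to make the question concrete (the URLs, the handler name, the file, and the hidden-field names are guesses until I actually inspect the page):

```
# Sketch only: the page URL, handler URL and hidden-field names are guesses.
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://example.com/Upload.aspx")  # page hosting the Silverlight control

def hidden(name):
    # Read a hidden validation field out of the rendered page
    return driver.find_element(By.NAME, name).get_attribute("value")

payload = {
    "__VIEWSTATE": hidden("__VIEWSTATE"),
    "__EVENTVALIDATION": hidden("__EVENTVALIDATION"),
}

# Reuse the Selenium session's cookies so the POST is authenticated
cookies = {c["name"]: c["value"] for c in driver.get_cookies()}

with open("report.csv", "rb") as f:
    resp = requests.post(
        "https://example.com/UploadHandler.aspx",  # guessed server-side script
        data=payload,
        files={"file": f},
        cookies=cookies,
    )

print(resp.status_code)
driver.quit()
```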
Silverlight can be automated via Windows accessibility interfaces. The scope of this is too big for a stackoverflow post, but I suggest you have a look at this: https://msdn.microsoft.com/library/ms727247(v=vs.100).aspx
You can use UI Spy to drill down to the control IDs and then use Python's COM interop to call the UI Automation APIs.
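If you'd rather not write the COM interop by hand, pywinauto's "uia" backend wraps the same UI Automation APIs. A minimal sketch, assuming a browser window hosting the Silverlight control; the window title pattern and automation IDs are placeholders you'd take from UI Spy/Inspect:

```
# Sketch only: replace the title pattern and auto_id values with what
# UI Spy / Inspect shows for the real control tree.
from pywinauto import Application

app = Application(backend="uia").connect(title_re=".*Internet Explorer.*")
dlg = app.top_window()

# Type the file path into the Silverlight text box and click the upload button
dlg.child_window(auto_id="FilePathTextBox", control_type="Edit").type_keys(
    r"C:\data\upload.csv", with_spaces=True
)
dlg.child_window(auto_id="UploadButton", control_type="Button").click_input()
```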
Related
I was going through the page source of well-known websites like Twitter, Instagram, Snapchat, etc. One thing I found common to all of them is that there are no input tags in the page source of the login/signup page. In fact, there is no form tag at all. I wonder if it's a security measure or something else. Can anyone explain this?
It is a side effect of over-reliance on client-side programming without server-side rendering as a fallback.
It has nothing to do with security whatsoever.
In the early days of the web, everything was present in the HTML itself. There was little dynamic behaviour on those pages.
Over time, pages became more and more dynamic, with more JavaScript. Some document elements were created through JavaScript in response to user events (like adding rows to a table). This pattern also extended to the initial build-up of the page: some popular libraries are built entirely around that idea. In the extreme case, the HTML contains only one script tag (besides the html and body elements), and the JavaScript does the whole job of generating the page, based on some other configuration (a database, configuration files, JSON, ...). It takes the design out of the HTML and moves that responsibility to JavaScript.
This is not a security concern. Just one of the ways some frameworks work.
The browser's page source only displays what was rendered on the server. We call this server-side rendering.
These days, with client-side rendering, most of the webpage is generated on the client side by JavaScript frameworks like React. The server just sends the basic skeleton HTML.
So, to answer your question of why there are no input tags in the page source of the login/signup page: those input tags are generated on the client side using JavaScript. This does not mitigate any security risk, as the input tags are still accessible via the DOM inspector.
It's not bad; it's just not the way those sites work. They likely use a JavaScript front-end app and POST data to a back-end API of some sort to do the login.
Input tags hark back to an older style of the web where the server prepared a page, sent it to the browser, the user filled in the form fields, the data went back to the server as an encoded form, and the server prepared a new block of HTML and sent it to the browser. You'd see the page refresh with different content, and huge amounts of HTML were flying around all the time. Browsers employed some tricks to make it look like a more seamless experience, but the underlying way of working was still that the whole page was replaced.
The modern web tends towards single-page applications: a page loads with a JavaScript application, the script manipulates the browser document to draw the UI, and the server never again ships entire pages to the browser wholesale. It's all "script sends some data, probably JSON, to the server, gets a response, and programmatically updates the local document, making the browser change what it displays".
To this end the browser has somewhat become an operating system or development environment, with programming features and access to local resources like webcams and files; it forms the front-end UI while the back-end server holds and processes the data that makes the application valuable (worth purchasing or using). It's not about security, but about making a modern, dynamic and usable UI for a web-based application. There's no reason why they couldn't use an input tag: the JavaScript app running in the browser could create an input box, the user could type into it, and the app could pull the value out, send it to the server (not in a form) and get a response. It's just that they don't need to work that way; there's a lot more freedom in how to gather user input, send it to the server and act on the response. We are no longer tied to the older "post a form to the server" way of having dynamic content, and we haven't been for a long time, but only relatively recently have really good frameworks and libraries for creating these single-page apps come along. Increasingly we see sites using them for the benefits they provide, and they may bring ways of gathering user input that don't use an input tag, avoiding some limitation (probably styling) that the tag incurs.
The modern web is mainly centred on the end-user experience: a richer and more fluid UI, less data transferred for faster response times, and so on. All this chatter between front end and back end (should) happen over HTTPS (the CPU cost of encryption being relatively low in these days of high-powered servers and clients), so it's at least as secure as it ever was.
TL;DR: input tags are used less often because they're needed less often and can bring some problems and blockers to modern ways of developing. They aren't inherently insecure.
I am developing a web application using Facelets and an Entity-Controller-EJB structure. In the application, there are contents which are reachable only if you are logged in. A bean checks the login state every time you click a button/link for the restricted content and redirects you either to the selected page or to the login page.
I think this approach is not safe, as you can type the link directly into the browser instead of reaching it through a button that checks the bean. So what should I do? Is there a render option embeddable in each page, or should I write a JavaScript function? If so, how? I have studied JS fundamentals but don't really know how to implement this check!
Thank you for reading!
You cannot rely only on the frontend to deny access to some parts of a web application.
This is because all the HTML/CSS/JavaScript is downloaded to the user's browser, so they can read your code and your authentication mechanism and figure out how to bypass it (or just disable it).
More on this: Why are browsers allowed to display client-side source code?
What you need is to implement some security mechanism in the backend.
The simplest option is to delegate this to your web server (here are the instructions for Apache) and then use something similar to this to do the login.
Another way is to have a proper backend: you send your credentials (email/password) to it, and it gives you back a token that you use to access protected resources.
Or else, dynamically create your documents on the server side only if the user is authenticated.
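To make the token-based option concrete, here is a minimal sketch of that flow in Python/Flask, purely to illustrate the idea; the routes, user store, and token scheme are made up, and in a Java EE application you would do the same checks in a servlet filter or a JAX-RS resource:

```
# Minimal illustration of login-for-token and a server-side check on every
# protected request. Placeholder routes, credential store and token handling.
import secrets
from flask import Flask, request, jsonify, abort

app = Flask(__name__)
USERS = {"alice@example.com": "s3cret"}  # placeholder credential store
TOKENS = {}                              # token -> user, in memory only

@app.route("/login", methods=["POST"])
def login():
    data = request.get_json()
    if USERS.get(data.get("email")) != data.get("password"):
        abort(401)
    token = secrets.token_urlsafe(32)
    TOKENS[token] = data["email"]
    return jsonify({"token": token})

@app.route("/restricted", methods=["GET"])
def restricted():
    auth = request.headers.get("Authorization", "")
    token = auth[len("Bearer "):] if auth.startswith("Bearer ") else ""
    if token not in TOKENS:
        abort(401)  # not logged in: the server denies access, not the frontend
    return jsonify({"content": "only for logged-in users"})
```

The point is that the check happens on the server for every request, so typing the URL directly into the browser no longer bypasses anything.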
I'm having trouble figuring out how to generate, on the server side, a PDF from a JavaScript-heavy webpage served from Tomcat (the application is Pentaho CE). The content is a dashboard that responds to user interaction. Pentaho replaces divs dynamically with various content through AJAX calls. I'd like to export to PDF whatever state the user has the dashboard in. There are no restrictions on what I can put on the server, but I need to avoid having the client install anything.
I've taken a look at this, along with a bunch of other google-fu:
JSP/HTML Page to PDF conversion
wkhtmltopdf seems to be a popular choice; before I start banging my head against it, I have a few questions:
Can wkhtmltopdf handle going to password protected jsps where authentication is handled by the application? Would the dynamically loaded divs break it?
Is there a way to perhaps return the client view to the server for processing? I read about screen capturing...
Another option that could work out would be to automate a local access to the dashboard on the server through a server-hosted web browser and generate a PDF that way...is this possible, given the constraints of Tomcat and password protection that's handled by the application? The javascript components that Pentaho generates cannot be accessed outside of the application.
Thanks!
EDIT:
Good news! wkhtmltopdf works! Kind of. I got past the password authentication by putting the login details in a query string, and I'm getting a PDF of the correct page now. The issue is that no JavaScript components are showing up... (they work for pages like yahoo.com, so maybe I'm missing something here).
If you have a lot of AJAX calls, you should wait for them to finish. Use the --javascript-delay x argument, where x is the time to wait in milliseconds.
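For example, called from the server side (the URL, the login query-string parameters, and the 5-second delay are placeholders for your setup):

```
# Sketch: invoke wkhtmltopdf with a JavaScript delay so the AJAX-loaded
# divs have time to render. URL, credentials and delay are placeholders.
import subprocess

subprocess.run(
    [
        "wkhtmltopdf",
        "--javascript-delay", "5000",  # wait 5000 ms for scripts/AJAX to finish
        "http://localhost:8080/pentaho/Dashboard?userid=admin&password=secret",
        "/tmp/dashboard.pdf",
    ],
    check=True,
)
```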
Using Python, I built a scraper for an ASP.NET site (specifically a Jenzabar course searching portlet) that would create a new session, load the first search page, then simulate a search by posting back the required fields. However, something changed, and I can't figure out what, and now I get HTTP 500 responses to everything. There are no new fields in the browser's POST data that I can see.
I would ideally like to figure out how to fix my own scraper, but that is probably difficult to ask about on StackOverflow without including a ton of specific context, so I was wondering if there was a way to treat the page as a black box and just fire click events on the postback links I want, then get the HTML of the result.
I saw some answers on here about scraping with JavaScript, but they mostly seem to focus on waiting for javascript to load and then returning a normalized representation of the page. I want to simulate the browser actually clicking on the links and following the same path to execute the request.
Without knowing any specifics, my hunch is that you are using a hardcoded session id and the web server's app domain recycled and created new encryption/decryption keys, rendering your hardcoded session id (which was encrypted by the old keys) useless.
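If that's what happened, one way to rule it out is to stop hard-coding anything: start a fresh session and re-read the ASP.NET hidden fields on every run. A rough sketch (the URL and the non-standard field names are placeholders):

```
# Sketch: pull __VIEWSTATE/__EVENTVALIDATION fresh each run instead of
# hard-coding them. URL and the non-standard field names are placeholders.
import requests
from bs4 import BeautifulSoup

session = requests.Session()
search_page = session.get("https://example.edu/courses/Search.aspx")
soup = BeautifulSoup(search_page.text, "html.parser")

def hidden(name):
    tag = soup.find("input", {"name": name})
    return tag["value"] if tag else ""

post_data = {
    "__VIEWSTATE": hidden("__VIEWSTATE"),
    "__VIEWSTATEGENERATOR": hidden("__VIEWSTATEGENERATOR"),
    "__EVENTVALIDATION": hidden("__EVENTVALIDATION"),
    "__EVENTTARGET": "ctl00$searchButton",  # made-up control name
    "txtKeyword": "biology",                # made-up search field
}

result = session.post("https://example.edu/courses/Search.aspx", data=post_data)
print(result.status_code)
```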
You could try using Firebug's Net tab to monitor all requests, browse around manually, and then diff the requests you generate against the ones your screen scraper generates.
If you are just trying to simulate loading the page, you might want to check out something like Selenium, which runs through a real browser and handles postbacks the way a browser does.
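That also covers the "treat the page as a black box" idea: let the browser fire the postback and just read back the resulting HTML. A small sketch (the URL and link text are placeholders):

```
# Sketch: click the postback link in a real browser and grab the result.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.edu/courses/Search.aspx")
driver.find_element(By.LINK_TEXT, "Next Page").click()  # fires the __doPostBack link
html = driver.page_source  # HTML of the page after the postback completes
driver.quit()
```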
I am looking to create a web widget that can be easily integrated into any website using JavaScript; it posts a form to my server, gets data back, and displays the results appropriately. This will all happen in a small area of the host website's screen, like Google AdSense. I am aware of the cross-site scripting implications and of the cross-domain issues with Ajax.
What I need help with is cementing the flow of such a widget. Has anyone done anything like this before?
The general process is:
Website embeds the JavaScript (an external JS file)
The JavaScript renders a form
User submits the form with POST data
The POST data is sent to the external server
The server responds and the widget updates to display tabular data
Is this possible? How could it be achieved? Should I use, or avoid using, a JS framework such as Prototype or jQuery for this project?
If anyone knows any good tutorial resources for building widgets that would be great.
Any help would be greatly appreciated.
Using a library like jQuery is not possible, since you don't know whether the website that uses your widget has the jQuery library referenced.
If you use an iframe and show a page from your own host inside it, you are able to use a library, if I'm not mistaken.
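To make the iframe route concrete, here's a minimal sketch of the widget host's side, assuming a Python/Flask server (the route, field name, and markup are all placeholders; the host site would embed an iframe, or a script that injects one, pointing at /widget on your domain):

```
# Sketch: serve the widget page from your own domain inside the iframe, so the
# form posts back to your server with no cross-domain AJAX involved.
from flask import Flask, request, render_template_string

app = Flask(__name__)

WIDGET_PAGE = """
<form method="post" action="/widget">
  <input name="query">
  <button type="submit">Search</button>
</form>
{% if rows %}
<table>
  {% for row in rows %}<tr><td>{{ row }}</td></tr>{% endfor %}
</table>
{% endif %}
"""

@app.route("/widget", methods=["GET", "POST"])
def widget():
    rows = []
    if request.method == "POST":
        # Placeholder lookup: replace with your real data source
        rows = ["result for %s" % request.form.get("query", "")]
    return render_template_string(WIDGET_PAGE, rows=rows)

if __name__ == "__main__":
    app.run()
```

Since everything inside the iframe comes from your own host, you're free to load jQuery there without worrying about what the embedding site has referenced.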