Is there any javascript (and client-side) wget implementation? - javascript

In order to provide a service for webmasters, I need to download the public part of their site. I'm currently doing it using wget on my server, but it introduces a lot of load, and I'd like to move that part to the client side.
Does an implementation of wget exist in JavaScript?
If it exists, I could zip the files and send them to my server for processing, which would allow me to concentrate on the core business of my app.
I know some compression libraries exist in JS (such as zip.js), but I was unable to find the wget counterpart. Do you know of something similar?

Does an implementation of wget exist in JavaScript?
I doubt it, because of the same-origin policy built into browsers, which prevents you from fetching content located on other domains. If the content is located on your own domain and you are not violating the security policy, you could use AJAX.
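For content on your own domain, the AJAX route could look roughly like the sketch below. It is an illustration only, not a wget port: `sameOriginLinks` and `crawl` are hypothetical names, links are found with a simple regular expression rather than a real HTML parser, and `fetchImpl` is assumed to default to the browser's `fetch`.

```javascript
// Collect the same-origin links found in an HTML string.
// A regex is used for brevity; a real crawler would parse the DOM instead.
function sameOriginLinks(html, pageUrl) {
  const origin = new URL(pageUrl).origin;
  const links = new Set();
  const hrefPattern = /href\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = hrefPattern.exec(html)) !== null) {
    let resolved;
    try {
      resolved = new URL(match[1], pageUrl); // resolve relative links
    } catch (err) {
      continue; // skip malformed URLs
    }
    if (resolved.origin === origin) {
      resolved.hash = ''; // fragments point at the same document
      links.add(resolved.href);
    }
  }
  return [...links];
}

// Breadth-first download of up to maxPages pages, starting at startUrl.
async function crawl(startUrl, maxPages = 20, fetchImpl = fetch) {
  const pages = new Map(); // url -> HTML text
  const seen = new Set([new URL(startUrl).href]);
  const queue = [...seen];
  while (queue.length > 0 && pages.size < maxPages) {
    const url = queue.shift();
    const response = await fetchImpl(url);
    if (!response.ok) continue; // skip pages that fail to load
    const html = await response.text();
    pages.set(url, html);
    for (const link of sameOriginLinks(html, url)) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return pages;
}
```

In a page served from the site itself, `crawl(location.origin + '/')` would resolve to a Map of URL to HTML, which could then be zipped and uploaded. Links to other domains are skipped, since the browser would block reading them anyway.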

Related

Monitoring web requests made by a specific javascript src file

I'm looking for some help with the following:
I'm building a website that links external scripts (these scripts can be changed anytime by the script owner).
I want to route any web requests sent by these scripts through my backend server as a proxy, so that I can parse each request and response to make sure they are not exfiltrating data from my website.
The idea here is to be able to leverage external scripts that cannot be trusted 100% with my data.
To enforce this, is it possible to intercept web requests made by <script>s from a different <script> that loaded early on?
If not, what is a better way?
If you cannot trust the scripts, this becomes more of a security question. You should get more information about web security in general.
An interesting example of such an implementation is Tampermonkey and its permission model (similar to that of Android apps).
Depending on the use-case, you may want to manually approve each js file and enforce it via integrity checking:
https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity
At the end of the day, you cannot programmatically check the security of an external js file. You either trust its author or you don't.
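As a rough sketch of the integrity-checking approach, the snippet below computes a Subresource Integrity value for a script you have reviewed and builds the matching tag. It assumes Node.js with its built-in `crypto` module; `sriHash`, `scriptTag`, and the CDN URL are hypothetical names used for illustration.

```javascript
const crypto = require('crypto');

// Build an SRI value of the form "<algorithm>-<base64 digest>".
function sriHash(content, algorithm = 'sha384') {
  const digest = crypto.createHash(algorithm).update(content).digest('base64');
  return `${algorithm}-${digest}`;
}

// Produce a script tag that only runs if the fetched file matches the hash.
function scriptTag(src, content) {
  return `<script src="${src}" integrity="${sriHash(content)}" crossorigin="anonymous"></script>`;
}

// Example with a hypothetical URL and the approved file contents.
const approvedSource = 'console.log("approved version");';
console.log(scriptTag('https://cdn.example.com/widget.js', approvedSource));
```

If the script owner later changes the file, the hash no longer matches and the browser refuses to run it, so each new version needs a fresh manual approval.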

Requesting HTML page with JavaScript (Angular app)

Python has a module called httplib which allows for the retrieval of an html resource from a URL. With this code:
httpServ = httplib.HTTPConnection("www.google.com")
httpServ.connect()
httpServ.request('GET',"/search?q=python")
...
httpServ.close()
I am trying to do the same thing in my angular app, but using $http get doesn't allow me to retrieve the html document due to the same origin policy.
Is there anything similar to the python method available in JavaScript?
So, the Same-Origin Policy is not really about JavaScript the language; it is a rule enforced by the browser. It basically says "don't allow scripts on a page to read responses from a different origin (scheme, host, and port) unless that origin explicitly allows it."
This is an extremely important security feature. It means that if you are logged in to some site in one tab, a script on an unrelated page cannot request your account page from that site and read what comes back (so long as the browser properly enforces the Same-Origin Policy).
You don't have this problem when working with Python because Python exclusively runs on the server (from a web-app perspective). Your server can talk to any machine it wants to, but browsers do not (and should not as seen above) give that freedom to webpages.
So, how to solve your problem? Make your GET request to a script running on your server. Have your server do a curl or wget or whatever of google.com, then have your server send the data back to the client.
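A minimal sketch of such a server-side relay, assuming Node.js 18 or later (for the global `fetch`). The `/fetch?url=` endpoint, the `ALLOWED_HOSTS` list, and the function names are made up for this example; the allow-list is there so the server does not become an open proxy.

```javascript
const http = require('http');

// Hypothetical allow-list of hosts the relay may contact.
const ALLOWED_HOSTS = new Set(['www.google.com']);

// Extract and validate the target from a request like /fetch?url=<encoded URL>.
function targetFromRequest(requestUrl) {
  const parsed = new URL(requestUrl, 'http://localhost');
  const target = parsed.searchParams.get('url');
  if (!target) return null;
  let url;
  try {
    url = new URL(target);
  } catch (err) {
    return null; // not a valid absolute URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;
  return ALLOWED_HOSTS.has(url.hostname) ? url : null;
}

// Create a server that fetches the target itself and returns the body.
function createRelay(fetchImpl = fetch) {
  return http.createServer(async (req, res) => {
    const target = targetFromRequest(req.url);
    if (!target) {
      res.writeHead(400, { 'Content-Type': 'text/plain' });
      res.end('Missing or disallowed url parameter');
      return;
    }
    try {
      const upstream = await fetchImpl(target);
      const body = await upstream.text();
      const type = upstream.headers.get('content-type') || 'text/html';
      res.writeHead(upstream.status, { 'Content-Type': type });
      res.end(body);
    } catch (err) {
      res.writeHead(502, { 'Content-Type': 'text/plain' });
      res.end('Upstream request failed');
    }
  });
}

// createRelay().listen(3000);
```

The Angular app would then call its own server, for example `$http.get('/fetch?url=' + encodeURIComponent('https://www.google.com/search?q=python'))`, which is a same-origin request.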

Check server version from javascript

In our development environment we use jetty, in our production we use tomcat.
For some functionality we use javascript but there are some hardcoded locations for the use of jetty or tomcat.
I know it's a bit weird to use two different servers, but it's just the way it is.
So now when we are building the application, sometimes people forget to change the server version in the javascript file.
Is there a way to automatically check if the server is jetty or tomcat from javascript?
I was thinking of placing a txt file in the root of tomcat and letting the script check whether or not it exists, but maybe there is a more native way to do it.
Assuming one/both of your servers are sending the Server HTTP header (and Jetty usually does, and can easily be configured to do so), then you could use an XMLHttpRequest and look at the response headers.
Read more here: Accessing the web page's HTTP Headers in JavaScript
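A sketch of the header check, for illustration. The function names are hypothetical, and the matching rules are assumptions: Jetty typically identifies itself as something like `Jetty(9.x)`, older Tomcat versions as `Apache-Coyote/1.1`, and some versions send no Server header at all unless configured to.

```javascript
// Map a Server header value to a server type (assumed header formats).
function classifyServer(serverHeader) {
  const value = (serverHeader || '').toLowerCase();
  if (value.includes('jetty')) return 'jetty';
  if (value.includes('tomcat') || value.includes('coyote')) return 'tomcat';
  return 'unknown'; // header missing or not recognised
}

// Browser-side: issue a HEAD request and inspect the response headers.
function detectServer(url, callback) {
  const xhr = new XMLHttpRequest();
  xhr.open('HEAD', url);
  xhr.onload = function () {
    callback(classifyServer(xhr.getResponseHeader('Server')));
  };
  xhr.onerror = function () {
    callback('unknown');
  };
  xhr.send();
}

// Usage in a page:
// detectServer('/', function (type) { console.log('Running on', type); });
```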
However, I would suggest that you extract the pieces of code that change between servers into one JavaScript file, e.g.:
/* server_info.js */
var locations = {
  file1: "/some/path",
  file2: "/another/path"
};
And include that file as a <script> in all your pages.
Then you can have Jetty and Tomcat each use a different version of that file. It should be easy enough to have a servlet (or filter, or action, or whatever exists in your framework) that looks at the server type and serves up the right file.
If that's too much, then you could do the same thing, but simply have:
/* server_info.js */
var server_type = "tomcat";
And vary that file by server (you could easily generate that file in a JSP, or something similar).
Obligatory warning: As I'm sure you know, having different servers in dev and prod is not a fantastic idea, for these sorts of reasons. Once you implement a solution to this problem, how are you going to know that the tomcat code works?
Jetty is more than capable of being a production server, and tomcat can do a good job in development. I suspect you (as a team) are making more work for yourselves than you really ought to.
You could make an ajax request to the server, and if each one responds in a unique way, you'll know which is which. Whether that's the existence of a different file or different content in a particular file, there are many ways the servers can differentiate themselves to the client.

Download one file, with pieces stored on more than one server (HTTP)

I am working on a file upload system which will store individual parts of large files on more than one server. So the distribution of a 1GB file will look something like this:
Server 1: 0-128MB
Server 2: 128MB-256MB
Server 2: 256MB-384MB
... etc
The intention of this is to allow for redundancy (each part will exist on more than one server), security (no one server has access to the entire file), and cost (bandwidth expenses are distributed).
I am curious if anyone has an opinion on how I might be able to "trick" web browsers into downloading the various parts all in one link.
What I had in mind was something like:
Browser is linked to Server 1, which provides a content-size of the full file
Once 128MB is served, Server 1 will intentionally close the connection
Hopefully, the browser will try to restart the download, requesting Server 1
Server 1 provides a 3XX redirect to Server 2
Browser continues downloading from Server 2
I don't know for certain that my example works, as I haven't tested it yet. I was curious if there were other solutions someone might have?
I'd like to make the whole process as easy as possible (ideally requiring no work beyond a simple download). I don't want the users to have to use another program (e.g. cat'ing the files together). I'd also like to not use a proxy server, since it would incur extra bandwidth costs.
As far as I'm aware, there is no JavaScript solution for writing a file; if there were one, that would be great.
AFAIK this is not possible by using the HTTP protocol. You can probably use a custom browser extension but it would depend on the browser. Another alternative is to create a Java applet that would download the file from different servers. The applet can accept the URLs to the different servers as parameters.
To save the generated file:
https://stackoverflow.com/a/4551467/329062
That solution stores the file in memory though, so it won't work with very large files.
You can download the partial files into a JS variable using JSONP. That will also let you get around the same-origin policy.
JavaScript's security model will only allow you to access data from the same origin the JavaScript came from, i.e. not multiple servers.
If you are going to have the file bits on multiple servers, you will need the user to load the web page, fetch the bit and then finally stick the bits together in the correct order. If you can manage to get all your users to do this (correctly), you are a better man than I.
It's possible to do in modern browsers over standard HTTP.
You can use XHR2 with CORS to download the file chunks as ArrayBuffers, merge them using the Blob constructor, and use createObjectURL to offer the merged file to the user.
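A hedged sketch of that idea follows. It swaps XHR2 for the `fetch` API, which can also return ArrayBuffers; the function names and chunk URLs are hypothetical, and each server is assumed to send CORS headers that allow the page's origin.

```javascript
// Download every chunk in parallel; the results keep the order of `urls`.
async function fetchChunks(urls, fetchImpl = fetch) {
  return Promise.all(urls.map(async (url) => {
    const response = await fetchImpl(url, { mode: 'cors' });
    if (!response.ok) {
      throw new Error('Failed to fetch ' + url + ': ' + response.status);
    }
    return response.arrayBuffer();
  }));
}

// Browser-only: merge the chunks into one Blob and trigger a download.
function saveAsFile(buffers, filename) {
  const blob = new Blob(buffers, { type: 'application/octet-stream' });
  const objectUrl = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = objectUrl;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
  // Release the object URL once the click has been dispatched.
  setTimeout(() => URL.revokeObjectURL(objectUrl), 0);
}

// Usage in a page (hypothetical chunk URLs):
// fetchChunks(['https://s1.example.com/part0', 'https://s2.example.com/part1'])
//   .then((buffers) => saveAsFile(buffers, 'bigfile.bin'));
```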
However, I suspect that browsers will store these objects in RAM, so it's probably a bad idea to use it for large files.

XMLHttpRequest cross site scripting?

I realize this issue of cross site scripting has been covered, however being new to web development I had a few further questions.
Currently I am testing an HTML file I wrote on my PC that connects to a RESTful web service on another machine. I am getting status=0. Is this considered cross-site scripting?
If a server hosts a file with javascript, and that javascript file has XMLHttpRequests to the server's own web services, will that work, or is that bad?
Apologies if any of these questions are stupid.
status=0 can mean a variety of things, and without knowing more about how you got to that point, it is very difficult to determine what, exactly, it means. You could be using an iframe, the other computer could genuinely be telling you that the status is 0... we don't know.
The general rule is that it doesn't matter where the JS is from; it executes in the context of the page where it's loaded. This is what makes Google's hosted JS libraries possible (you know, using https://ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.js on a whole assortment of sites). And honestly, that is not a security issue.
The security issue comes in when a js file tries to access another domain (or even subdomain), whether through manipulation of an iframe or through XMLHttpRequest. It's at that point that the browser will "lay the smackdown" on the script.
You will have difficulty communicating with JavaScript from your hard drive (file:///) to any internet protocol (http|https) because of this.
No, that is not cross-site scripting. When you include a JS file from another server, it runs in the context of your site, so it won't be able to use XMLHttpRequest to access the site where the JS file is originally hosted.
If that were possible, then anybody who hosts the jQuery file (and there are many such servers, including Google's) would be open to XMLHttpRequests.
SO, IT'S NOT POSSIBLE.
If you want a JSON response from another server you can use JSONP. Google it for more info.
And Cross Site Scripting is when someone injects JavaScript code on your site in order to bypass access control.
You can use CORS for that. You can use the same code you use now, but the other server you request the page from via ajax has to send the following header on that page:
Access-Control-Allow-Origin: http://yoursite.example.com
#or to allow all hosts
Access-Control-Allow-Origin: *
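On the server side, sending that header could look like the sketch below. It assumes a plain Node.js `http` server; `ALLOWED_ORIGINS` and `corsHeaders` are hypothetical names, and the JSON body is a placeholder.

```javascript
const http = require('http');

// Origins that are allowed to read responses from this service.
const ALLOWED_ORIGINS = new Set(['http://yoursite.example.com']);

// Return the CORS headers for a request, or none if the origin is not allowed.
function corsHeaders(requestOrigin, allowed = ALLOWED_ORIGINS) {
  if (requestOrigin && allowed.has(requestOrigin)) {
    // Vary tells caches that the response depends on the Origin header.
    return { 'Access-Control-Allow-Origin': requestOrigin, 'Vary': 'Origin' };
  }
  return {};
}

const server = http.createServer((req, res) => {
  res.writeHead(200, {
    'Content-Type': 'application/json',
    ...corsHeaders(req.headers.origin)
  });
  res.end(JSON.stringify({ ok: true }));
});

// server.listen(8080);
```

Echoing back only known origins is a little stricter than `*`: pages on other origins can still send the request, but their scripts cannot read the response.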
