XXE vulnerabilities in incoming XML strings in JavaScript code parsed in Java

I have an application that manages APIs.
As a part of creation of an API we allow users to enter some JavaScript that will be executed every time the API is hit.
This JavaScript is executed on the server side so the flow is -
End user hits API link generated by me
I run the JavaScript entered at API creation time
I forward the request to wherever
I return the result to the front end
The intended use case is to set some request headers and the like.
Now, we recently had a security audit and this of course opens the door to an XXE vulnerability -
var x='<?xml version="1.0"?><!DOCTYPE foo [ <!ELEMENT foo ANY ><!ENTITY lol SYSTEM "file:///etc/xxxx" >]><foo>&lol;</foo>';
var xee = new javascript.ScriptableDocument(x);
context.setVariable("request.queryparam.foo",xee.toString())
I will have this entire content body in Java, but how do I guard against XXE vulnerabilities? I imagine I'd have to scan the incoming JavaScript for any XML and apply one of Java's well-known XXE stripping methods (described here excellently).
But a persistent attacker can simply obfuscate the XML and defeat any attempt on my part to identify it in the JavaScript.
Example -
var a='<', b="?", c="x";
new javascript.ScriptableDocument(a+b+c+...);
Is this an unwinnable fight? Or is there something super obvious I can do to mitigate this?
Thanks!
Zulfi
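To make the concern concrete, here is a minimal sketch of the kind of source-text check described above and how string concatenation slips past it. The helper name looksLikeXxe and both sample scripts are assumptions made up for illustration; they are not part of the application in question.

```javascript
// Hypothetical naive filter: flags a script whose source text contains
// a DOCTYPE or ENTITY declaration as one contiguous string.
function looksLikeXxe(scriptSource) {
  return /<!DOCTYPE|<!ENTITY/i.test(scriptSource);
}

// A script that embeds the XML payload directly.
const direct =
  "new javascript.ScriptableDocument('<!DOCTYPE foo [<!ENTITY lol SYSTEM \"file:///etc/xxxx\">]><foo>&lol;</foo>');";

// The same payload, split into pieces that are only joined at run time.
const obfuscated =
  "var a = '<!DOC', b = 'TYPE foo [<!ENT', c = 'ITY lol SYSTEM \"file:///etc/xxxx\">]><foo>&lol;</foo>';" +
  " new javascript.ScriptableDocument(a + b + c);";

console.log(looksLikeXxe(direct));     // true
console.log(looksLikeXxe(obfuscated)); // false
```

The second script builds exactly the same document, yet the filter never sees the declaration as a single string, which is why inspecting the script text alone looks like a losing approach.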

Related

Embeddable js interpreter for user's code?

Imagine a website where the user can generate content via JS.
For example.
User clicks button
It requests our api (not user's api)
Api returns object with specific fields.
We show select with user's defined options generated by user's code or some calculated result based on data we sent.
The idea is to give user an ability to edit visible content (using our structures, we know beforehand which fields in returned object do what things).
First solution "developed" in 5 minutes.
User clicks button
It sends all the required data as context to our API.
We fetch the user-defined code from the database
// here is the code which we write (not user) and we know this code is safe
const APP_CONTEXT = parseInput(); // this can be parameters from command line
const ourLibrary = require('ourLibrary');
// APP_CONTEXT is variable which contains data from frontend. We control data inside APP_CONTEXT, user can not write to it
// here is user defined code
const someVar = APP_CONTEXT['fieldDescribedInOurDocumentation'];
const anotherVar = APP_CONTEXT['anotherFieldFromDocumentation'];
ourLibrary.sendToFrontend(someVar + anotherVar);
In this very simple example, once the user clicks the button, we send a request to our API, the user's code is executed, and we show the result of the execution. ourLibrary abstracts the way the handling is completed.
The main problem, I think, is security. I am thinking about using a restricted Node.js process: no network access, no file system access.
Is it possible to deny any import/require in a Node.js process? I want to let the user call only built-in JS functions (Math.min, Math.max, new Date(), +, -), declare functions and so on, so it will work like a sophisticated calculator. We should also be able to send the result back to the frontend, for example via RabbitMQ + Node.js + WebSockets. We can use a simple console.log if the former is a problem.
One possible solution (not secure, of course) uses the Node.js interpreter. We execute the interpreter every time an action is required.
const APP_CONTEXT = parseInput();
const ourLibrary = require('ourLibrary');
const usersCode = getUsersCode();
eval(usersCode);
Inside usersCode they use ourLibrary.sendToFrontend to produce the result. But this solution allows the user to use any built-in Node.js function, like const fs = require('fs'). Of course access will be restricted at the Linux system level (SELinux or similar), but can I configure Node.js to run as a simple JS interpreter? Or maybe some other JS interpreter exists that is safe to use? Safe means: only arithmetic, Date functions, Math functions and so on. No file system access, no network access.
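For comparison with the eval version, here is a minimal sketch that runs the user's code through Node's built-in vm module, so that require and process are not in scope while Math and Date still are. The runUserCode helper and the sample field values are assumptions for illustration. Note that the Node.js documentation states the vm module is not a security mechanism: code can still reach host objects through anything passed into the context, so this would only ever be one layer next to OS-level restrictions, not a replacement for them.

```javascript
const vm = require('node:vm');

// Hypothetical helper: runs user code in a fresh context that exposes only
// APP_CONTEXT and a minimal ourLibrary. Returns whatever the code "sent".
function runUserCode(usersCode, appContext) {
  const results = [];
  const sandbox = {
    APP_CONTEXT: Object.freeze({ ...appContext }), // read-only copy of our data
    ourLibrary: { sendToFrontend: (value) => results.push(value) },
  };
  vm.createContext(sandbox); // new global scope: no require, no process
  vm.runInContext(usersCode, sandbox, { timeout: 100 }); // stop runaway loops after 100 ms
  return results;
}

const usersCode =
  "const someVar = APP_CONTEXT['fieldDescribedInOurDocumentation'];" +
  "const anotherVar = APP_CONTEXT['anotherFieldFromDocumentation'];" +
  "ourLibrary.sendToFrontend(someVar + anotherVar);";

console.log(runUserCode(usersCode, {
  fieldDescribedInOurDocumentation: 2,
  anotherFieldFromDocumentation: 3,
})); // [ 5 ]
```

Inside the context, typeof require evaluates to 'undefined', so const fs = require('fs') throws a ReferenceError instead of loading the module.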

Passing URL Arguments and HTML Output Order of Execution

This is directly related to the answer in Google Apps Script - possible charts types.
I am trying to extend the top answer by deploying it as a webapp instead of an add-on, and also to pass URL arguments to the app script.
Everything is exactly the same as the linked example above, except that I stripped out the addon code and put in the most basic webapp code by adding a doGet(e) function.
/*
//if I manually specify the values in the script, it works fine
var sheetRange = "A1:D20"; // standard range to gather data
var sheetTabName = "Sheet1"; //name of the tab in the spreadsheet to look for. must be unique
var spreadsheetId = '1CKQTQYXgt3YgnUXu0YHFeMcG5sMh99sj293oKRFVp4M'; //spreadsheet ID
*/
var sheetRange;
var sheetTabName;
var spreadsheetId;
function doGet(e) {
//but if I try to load the arguments from the URL, it doesn't work
//these values never get set here
sheetRange = e.parameter.sheetRange;
sheetTabName = e.parameter.sheetTabName;
spreadsheetId = e.parameter.spreadsheetId;
Logger.log("This never gets run %s %s %s",sheetRange,sheetTabName,spreadsheetId );
//but this template gets made
var template = HtmlService.createTemplateFromFile('BubbleEx')
.evaluate()
.setSandboxMode(HtmlService.SandboxMode.IFRAME)
.setWidth(800)
.setHeight(600);
Logger.log("Why doesn't this get printed at least?");
//and returned
return template;
}
function getSpreadsheetData() {
Logger.log("This does get run!\nSpreadsheetId is: %s\nSheetRange is: %s\nSheetTabName is: %s",spreadsheetId,sheetRange,sheetTabName);
var sheet = SpreadsheetApp.openById(spreadsheetId);
var data = sheet.getSheetByName(sheetTabName).getRange(sheetRange).getValues();
return (data.length > 1) ? data : null;
}
Clearly I'm missing something fundamental about the order of execution here. Something about the way the HTML is interacting with the script is causing it to be completed before certain parts of the code.gs complete. I'm really new to using GAS as a deployed webapp, so any/all help is greatly appreciated. Thanks!
Here's the preformatted link (with the included arguments) I'm trying to use. The sheet is publicly viewable with a link:
https://script.google.com/a/macros/edmonton.ca/s/AKfycbxMbCG3p-zdoJReIS6jRHnLK3J-XsI1Zm_BFvfz_UQ/dev?spreadsheetId=1CKQTQYXgt3YgnUXu0YHFeMcG5sMh99sj293oKRFVp4M&sheetTabName=Sheet1&sheetRange=A1%3AD20

Visualization in a Web App
I expanded on the linked question in a blog post a while back, with a dashboard example as a web app. (The code for the dashboard is in the blog, I won't bother repeating it here.)
Logger usage
Comments you've left in your code imply that the conclusions you're drawing about what-has-run-when are based on whether or not logs have shown up. If only it were that easy!
Unfortunately, the Logger is an unreliable tool when used for debugging a web app or other asynchronous operations. Surprise! It's the subject of another blog post of mine.
The Logger can be extended by using the BetterLog library and a few simple utility functions, so that you can generate logs from the client side as well as from asynchronous server-side calls.
Why aren't those globals working?
Order of execution isn't the issue - rather it's about how global variables behave between execution instances.
When you've set spreadsheetId in your doGet() function, its content is available to the whole script, but only for the duration of that instance's execution. In the following diagram, I've illustrated the communication between a few of the pieces of your solution. Each asynchronous call to a Google Apps Script function creates a new, independent execution instance of your script. Each instance has its own copy of the script's global variables.
The upshot of this is that the spreadsheetId value you set in doGet() isn't available to getSpreadsheetData() when it is invoked by the google.script.run call in the client-side JavaScript. The variable exists as a symbol only - it isn't always the same piece of computer memory. (It might not even be on the same physical computer.)
If you want to "set" some "global" variables to survive between instances, you can use a persistent storage method such as the Properties Service. In your example, though, you would want to be careful with this; if two users were accessing the Web App at the same time, the last one in would over-write values previously set by the earlier user.
A more appropriate way to handle this would be to explicitly pass the "globals" via the html template. (If you create a new Google Apps Script using the demo "Web App" template, you'll see an example of this.)
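As a rough sketch of that approach, assuming the same BubbleEx file and URL parameters as in the question: doGet() copies the parameters onto the template, and getSpreadsheetData() takes them as arguments instead of reading globals. The argument list of getSpreadsheetData() is my own change for illustration, and the client-side script in BubbleEx.html would need to read the values from the template and pass them along in its google.script.run call.

```javascript
// Code.gs (server side)
function doGet(e) {
  var template = HtmlService.createTemplateFromFile('BubbleEx');
  // Properties assigned to the template are available to scriptlets in BubbleEx.html.
  template.spreadsheetId = e.parameter.spreadsheetId;
  template.sheetTabName = e.parameter.sheetTabName;
  template.sheetRange = e.parameter.sheetRange;
  return template.evaluate().setWidth(800).setHeight(600);
}

// Invoked from the client through google.script.run. Every value it needs
// arrives as an argument, so nothing depends on globals set in doGet().
function getSpreadsheetData(spreadsheetId, sheetTabName, sheetRange) {
  var sheet = SpreadsheetApp.openById(spreadsheetId);
  var data = sheet.getSheetByName(sheetTabName).getRange(sheetRange).getValues();
  return (data.length > 1) ? data : null;
}
```

Because each call carries its own values, two users opening the web app with different spreadsheets do not interfere with each other.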

Is google apps script synchronous?

I'm a Java developer learning JavaScript and Google Apps Script simultaneously. Being a newbie, I learned the syntax of JavaScript, not how it actually works, and I happily hacked away in Google Apps Script and wrote code sequentially and synchronously, just like Java. All my code resembles this (grossly simplified to show what I mean):
function doStuff() {
var url = 'https://myCompany/api/query?term<term&search';
var json = getJsonFromAPI(url);
Logger.log(json);
}
function getJsonFromAPI(url) {
var response = UrlFetchApp.fetch(url);
var json = JSON.parse(response);
return json;
}
And it works! It works just fine! If I didn't keep on studying JavaScript, I'd say it works like clockwork. But JavaScript isn't clockwork, it's gloriously asynchronous, and from what I understand this should not work at all: it would "compile", but logging the json variable should log undefined. Yet it logs the JSON with no problem.
NOTE:
The code is written and executed in the Google Sheet's script editor.
Why is this?

While Google Apps Script implements a subset of ECMAScript 5, there's nothing forcing it to be asynchronous.
While it is true that JavaScript's major power is its asynchronous nature, the Google developers appear to have given that up in favor of a simpler, more straightforward API.
UrlFetchApp methods are synchronous. They return an HttpResponse object, and they do not take a callback. That, apparently, is an API decision.

Please note that this hasn't really changed since the introduction of the V8 runtime for Google Apps Script.
While we are on the latest and greatest version of ECMAScript, when running Promise.all([func1(), func2()]) I can see that the code in the second function is not executed until the first one has completed.
Also, there is still no setTimeout() global function to use in order to branch the order of execution. Nor do any of the APIs provide callback functions or promise-like results. Seems like the going philosophy in GAS is to make everything synchronous.

I'm guessing that, from Google's point of view, processing two tasks in parallel (for example, two tasks that simply called Utilities.sleep(3000)) would require multiple threads to run on the server CPU, which may not be manageable and may be easy to abuse.
Whereas parallel processing on the client or on another company's server (e.g., Node.js) is up to that developer or user. (If they don't scale well, it's not Google's problem.)
However there are some things that use parallelism
UrlFetchApp.fetchAll
UrlFetchApp.fetchAll() will asynchronously fetch many URLs. Although this is not what you're truly looking for, fetching URLs is a major reason to seek parallel processing.
I'm guessing Google reasons that this is OK since fetchAll is using a web client and its own resources are already protected by quota.
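A minimal sketch of how that might look, assuming every URL returns JSON; the function name fetchAllJson is made up for illustration. The call itself still blocks until all responses are back, so the surrounding script stays synchronous.

```javascript
// Fetches all URLs in one call and parses each response body as JSON.
// UrlFetchApp.fetchAll accepts an array of URLs (or request objects) and
// returns the responses in the same order.
function fetchAllJson(urls) {
  var responses = UrlFetchApp.fetchAll(urls);
  return responses.map(function (response) {
    return JSON.parse(response.getContentText());
  });
}
```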
FirebaseApp getAllData
Firebase I have found is very fast compared to using a spreadsheet for data storage. You can get many things from the database at once using FirebaseApp's getAllData:
function myFunction() {
var baseUrl = "https://samplechat.firebaseio-demo.com/";
var secret = "rl42VVo4jRX8dND7G2xoI";
var database = FirebaseApp.getDatabaseByUrl(baseUrl, secret);
// paths of 3 different user profiles
var path1 = "users/jack";
var path2 = "users/bob";
var path3 = "users/jeane";
Logger.log(database.getAllData([path1, path2, path3]));
}
HtmlService - IFrame mode
HtmlService - IFrame mode allows full multi-tasking by going out to client script where promises are truly supported and making parallel calls back into the server. You can initiate this process from the server, but since all the parallel tasks' results are returned in the client, it's unclear how to get them back to the server. You could make another server call and send the results, but I'm thinking the goal would be to get them back to the script that called HtmlService in the first place, unless you go with a beginRequest and endRequest type architecture.
tanaikech/RunAll
This is a library for running the concurrent processing using only native Google Apps Script (GAS). This library claims full support via a RunAll.Do(workers) method.
I'll update my answer if I find any other tricks.

Publish data from browser app without writing my own server

I need users to be able to post data from a single page browser application (SPA) to me, but I can't put server-side code on the host.
Is there a web service that I can use for this? I looked at Amazon SQS (simple queue service) but I can't call their REST APIs from within the browser due to cross origin policy.
I favour ease of development over robustness right now, so even just receiving an email would be fine. I'm not sure that the site is even going to catch on. If it does, then I'll develop a server-side component and move hosts.

Not only are there web services, but nowadays there are robust systems that let you host some server-side logic for your applications. They are called BaaS (Backend as a Service) providers, and they usually provide a backbone for your front-end applications.
Although they have multiple uses, I'm going to list the most common in my opinion:
For mobile applications - Instead of having to learn an API for each device you code for, you can use a standard platform to store logic and data for your application.
For prototyping - If you want to create a slick application but don't want to code all the backend logic for the data (let alone deal with all the operations and system administration that entails), a BaaS provider means you only need good front-end skills to code the simplest CRUD applications you can imagine. Some BaaS providers even allow you to bind some Reduce algorithms to the calls you perform against their API.
For web applications - When PaaS (Platform as a Service) came to town to spare backend developers the hassle of system administration and operations, it was only logical that the same would happen to the backend itself. There are many clones that showcase the real power of this strategy.
All of this is amazing, but I have yet to mention any of them. I'm going to list the ones that I know best and have actually used in projects. There are probably many more, but as far as I know, these have satisfied most of my needs for any of the uses mentioned above.
Parse.com
Parse's most outstanding features target mobile devices; however, nowadays Parse contains an incredible number of APIs that allow you to use it as a full-featured backend service for JavaScript, Android and even Windows 8 applications (the Windows 8 SDK was introduced a few months ago this year).
How does Parse code look in JavaScript?
Parse works through classes and objects (ain't that beautiful?), so you first create a specific class (can be done through JavaScript, REST or even the Data Browser manager) and then you add objects to specific classes.
First, add Parse as a script tag in your HTML:
<script type="text/javascript" src="http://www.parsecdn.com/js/parse-1.1.15.min.js"></script>
Then, through a given Application ID and a Javascript Key, initialize Parse.
Parse.initialize("APPLICATION_ID", "JAVASCRIPT_KEY");
From there, it's all object manipulation
var Person = Parse.Object.extend("Person"); //Person is a class *cof* uppercase *cof*
var personObject = new Person();
personObject.save({name: "John"}, {
success: function(object) {
console.log("The object with the data "+ JSON.stringify(object) + " was saved successfully.");
},
error: function(model, error) {
console.log("There was an error! The following model and error object were provided by the Server");
console.log(model);
console.log(error);
}
});
What about authentication and security?
Parse has a User-based authentication system, which pretty much allows you to store a base of users that can manipulate the data. If you map the data to User information, you can ensure that only a given user can manipulate specific data. Plus, in the settings of your Parse application, you can specify that no clients are allowed to create classes, to ensure no unnecessary calls are performed.
Did you REALLY use it in a web application?
Yes, it was my tool of choice for a medium fidelity prototype.
Firebase.com
Firebase's main feature is the ability to provide Real Time to your application without all the hassle. You don't need a MeteorJS server in order to bring Push Notifications to your software. If you know Javascript, you are half way through to bring Real Time magic to your users.
How does Firebase look in JavaScript?
Firebase works in a REST fashion, and I think they do an amazing job structuring the Glory of REST. As a good example, look at the following Resource structure in Firebase:
https://SampleChat.firebaseIO-demo.com/users/fred/name/first
You don't need to be a rocket scientist to know that you are retrieving the first name of the user "Fred", given there's at least one (usually there should be a UUID instead of a name, but hey, it's an example, give me a break).
In order to start using Firebase, as with Parse, add their CDN JavaScript
<script type='text/javascript' src='https://cdn.firebase.com/v0/firebase.js'></script>
Now, create a reference object that will allow you to consume the Firebase API
var myRootRef = new Firebase('https://myprojectname.firebaseIO-demo.com/');
From there, you can create a bunch of neat applications.
var USERS_LOCATION = 'https://SampleChat.firebaseIO-demo.com/users';
var userId = "Fred"; // Username
var usersRef = new Firebase(USERS_LOCATION);
usersRef.child(userId).once('value', function(snapshot) {
var exists = (snapshot.val() !== null);
if (exists) {
console.log("Username "+userId+" is part of our database");
} else {
console.log("We have no register of the username "+userId);
}
});
What about authentication and security?
You are in luck! Firebase released their Security API about two weeks ago! I have yet to explore it, but I'm sure it fills most of the gaps that allowed random people to use your reference to their own purpose.
Did you REALLY use it in a web application?
Eeehm... ok, no. I used it in a Chrome Extension! It's still in process but it's going to be a Real Time chat inside a Chrome Extension. Ain't that cool? Fine. I find it cool. Anyway, you can browse more awesome examples for Firebase in their examples page.
What's the magic of these services? If you have read up on Dependency Injection and Mock Object Testing, at some point you can completely replace all of those services with your own through a REST Web Service provider.
Since these services were created to be used inside any application, they are CORS ready. As stated before, I have successfully used both of them from multiple domains without any issue (I'm even trying to use Firebase in a Chrome Extension, and I'm sure I will succeed soon).
Both Parse and Firebase have Data Browser managers, which means that you can see the data you are manipulating through a simple web browser. As a final disclaimer, I have no relationship with any of those services other than the fact that James Taplin (Firebase Co-founder) was amazing enough to lend me some Beta access to Firebase.

You actually CAN use SQS from the browser, even without CORS, as long as you only need the browser to send messages, not receive them. Warning: this is a kludge that would make my CS professors cry.
When you perform a GET request via javascript, the browser will always perform the request, however, you'll only get access to the response if it was from the same origin (protocol, host, port). This is your ticket to ride, since messages can be posted to an SQS queue with just a GET, and who really cares about the response anyways?
Assuming you're using jquery, your queue is https://sqs.us-east-1.amazonaws.com/71717171/myqueue, and allows anyone to post a message, the following will post a message with the body "HITHERE" to the queue:
$.ajax({
url: 'https://sqs.us-east-1.amazonaws.com/71717171/myqueue' +
'?Action=SendMessage' +
'&Version=2012-11-05' +
'&MessageBody=HITHERE'
})
There'll be an error in the console saying that the request failed, but the message will show up in the queue anyway.

Have you considered JSONP? That is one way of calling cross-domain scripts from javascript without running into the same origin policy. You're going to have to set up some script somewhere to send you the data, though. Javascript just isn't up to the task.

Depending on what kind of data you want to send, and what you're going to do with it, one way of solving it would be to post the data to a Google Spreadsheet using Ajax. It's a bit tricky to accomplish, though. Here is another Stack Overflow question about it.
If presentation isn't that important you can just have an embedded Google Spreadsheet Form.

What about mailto:youremail#goeshere.com ? ihihi
In the meantime, you could use a free host such as Altervista or Heroku, or something like them, so that you can connect to their server. If I remember correctly, these free services allow p2p servers, so you can create a sort of personal web service and push Ajax requests to it as well. Obviously their servers are slow for free accounts, but I think that's enough if you do not have much user traffic; otherwise you should move to a better VPS, hosting or cloud solution.

Maybe CouchDB can provide what you're after. IrisCouch provides free CouchDB instances. Lock it down so that users can't view documents and have a sensible validation function and you've got yourself an easy RESTful place to stick your data in.
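For illustration, a validation function in CouchDB is a JavaScript function stored under validate_doc_update in a design document; it rejects a write by throwing an object with a forbidden property. The sketch below is only an assumption about what "sensible" might mean here: the message field and the 1000-character limit are made up.

```javascript
// Hypothetical validate_doc_update function for a submit-only database.
function validate_doc_update(newDoc, oldDoc, userCtx) {
  // Reject any change to (or deletion of) a document that already exists.
  if (oldDoc) {
    throw({ forbidden: 'Documents cannot be changed once submitted.' });
  }
  // Accept only documents that carry a reasonably short text message.
  if (typeof newDoc.message !== 'string' || newDoc.message.length > 1000) {
    throw({ forbidden: 'A "message" string of at most 1000 characters is required.' });
  }
}
```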

Server-side Javascript in production fails to open connection to a named instance of SQL2008

I've got a production site that has been working for years with a SQL Server 2000 default instance on a server named MDWDATA. TCP port 1433 and Named Pipes are enabled there. My goal is to get this web app working with a copy of the database upgraded to SQL Server 2008. I've installed SQL2008 with SP1 on a server called DEVMOJITO and tested the new database using various VB6 desktop programs that exercise various stored procs in a client-server fashion, and parts of the website itself work fine against the upgraded database residing on this named instance of SQL2008. So, while I am happy that the database upgrade seems fine, there is a part of this website that fails with this error: Named Pipes Provider: Could not open a connection to SQL Server [1231]. I think this error is misleading. I disabled Named Pipes on the SQL2000 instance used by the production site, restarted SQL, and all the ASP code still continued to work fine (plus we have a firewall between both database servers and these web virtual directories on a public-facing webserver).
URL to my production virtual directory which demos the working page:
URL to my development v-directory which demos the failing page:
All the code is the same on both prod and dev sites except that on dev I'm trying to connect to the upgraded database.
I know there are dozens of things to check which I've been searching for but here are a few things I can offer to help you help me:
The code that is failing is server-side Javascript adapted from Brent Ashley's "Javascript Remote Scripting (JSRS)" code package years ago. It operates in an AJAX-like manner by posting requests back to different ASP pages and then handling a callback. I think the key thing to point out here is how I changed the connection to the database: (I cannot get Javascript to format right here!)
function setDBConnect(datasource)
{
var strConnect; //ADO connection string
//strConnect = "DRIVER=SQL Server;SERVER=MDWDATA;UID=uname;PASSWORD=x; DATABASE=StagingMDS;";
strConnect = "Provider=SQLNCLI10;Server=DEVMOJITO\MSSQLSERVER2008;Uid=uname;Pwd=x;DATABASE=StagingMDS;";
return strConnect;
}
function serializeSql( sql , datasource)
{
var conn = new ActiveXObject("ADODB.Connection");
var ConnectString = setDBConnect(datasource);
conn.Open( ConnectString );
var rs = conn.Execute( sql );
Please note how the connection string differs. I think that could be the problem but I don't know what to do. I am surprised the error returned says "named pipes" was involved because I really wanted to use TCP. The connection string syntax here is the same as used successfully on a different part of the site which uses VBScript which I'll paste here to show:
if DataBaseConnectionsAreNeeded(strScriptName) then
dim strWebDB
Set objConn = Server.CreateObject("ADODB.Connection")
if IsProductionWeb() Then
strWebDB = "DATABASE=MDS;SERVER=MDWDATA;DRIVER=SQL Server;UID=uname;PASSWORD=x;"
end if
if IsDevelopmentWeb() Then
strWebDB = "Provider=SQLNCLI10;Server=DEVMOJITO\MSSQLSERVER2008;Database=StagingMDS;UID=uname;PASSWORD=x;"
end if
objConn.ConnectionString = strWebDB
objConn.ConnectionTimeout = 30
objConn.Open
set oCmd = Server.CreateObject("ADODB.Command")
oCmd.ActiveConnection = objConn
This code works in both prod and dev virtual directories and other code in other parts of the web which use ASP.NET work against both databases correctly. Named pipes and TCP are both enabled on each server. I don't understand the string used by the Pipes but I am using the defaults always.
I wonder why the Javascript call above results in use of named pipes instead of TCP. Any ideas would be greatly appreciated.

Summary of what I did to get this working:
Add an extra backslash to the connection string, since this is server-side JavaScript and a single backslash inside a string literal is treated as an escape character:
Server=tcp:DEVMOJITO\\MSSQLSERVER2008,1219;
Explicitly code tcp: as a protocol prefix and port 1219. I learned that by default a named instance of SQL uses dynamic porting. I ended up turning that off and chose, somewhat arbitrarily, the port 1219, which dynamic had chosen before I turned it off. There are probably other ways to get this part working.
Finally, I discovered that SET NOCOUNT ON needed to be added to the stored procedure being called. Otherwise, the symptom is the message: "Operation is not allowed when the object is closed".
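The backslash point is easy to check in any JavaScript engine. In a string literal, a backslash followed by a character that is not a recognized escape (such as M) is simply dropped, so the instance separator never reaches the driver. The variable names below are only for illustration.

```javascript
// One backslash: "\M" is not a recognized escape, so the backslash disappears.
var single = "Server=DEVMOJITO\MSSQLSERVER2008;";
// Two backslashes: the escape sequence "\\" yields one literal backslash.
var doubled = "Server=DEVMOJITO\\MSSQLSERVER2008;";

console.log(single);  // Server=DEVMOJITOMSSQLSERVER2008;
console.log(doubled); // Server=DEVMOJITO\MSSQLSERVER2008;
```

VBScript string literals have no backslash escapes, which would explain why the same connection string worked unchanged in the VBScript part of the site.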
