Why does Google prepend while(1); to their (private) JSON responses?
For example, here's a response while turning a calendar on and off in Google Calendar:
while (1);
[
['u', [
['smsSentFlag', 'false'],
['hideInvitations', 'false'],
['remindOnRespondedEventsOnly', 'true'],
['hideInvitations_remindOnRespondedEventsOnly', 'false_true'],
['Calendar ID stripped for privacy', 'false'],
['smsVerifiedFlag', 'true']
]]
]
I would assume this is to prevent people from doing an eval() on it, but all you'd really have to do is replace the while and then you'd be set. I would assume the eval prevention is to make sure people write safe JSON parsing code.
I've seen this used in a couple of other places, too, but a lot more so with Google (Mail, Calendar, Contacts, etc.) Strangely enough, Google Docs starts with &&&START&&& instead, and Google Contacts seems to start with while(1); &&&START&&&.
What's going on here?
It prevents JSON hijacking, a major JSON security issue that has been formally fixed in all major browsers since 2011 with ECMAScript 5.
Contrived example: say Google has a URL like mail.google.com/json?action=inbox which returns the first 50 messages of your inbox in JSON format. Evil websites on other domains can't make AJAX requests to get this data due to the same-origin policy, but they can include the URL via a <script> tag. The URL is visited with your cookies, and by overriding the global array constructor or accessor methods they can have a method called whenever an object (array or hash) attribute is set, allowing them to read the JSON content.
The while(1); or &&&BLAH&&& prevents this: an AJAX request at mail.google.com will have full access to the text content, and can strip it away. But a <script> tag insertion blindly executes the JavaScript without any processing, resulting in either an infinite loop or a syntax error.
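A minimal sketch of the legitimate side, assuming the page already has the response text from a same-origin request. The function name parseGuardedJson and the list of prefixes are hypothetical, chosen to match the guards mentioned in this thread, and are not Google's actual client code:

```javascript
// Hypothetical list of guard prefixes; the real set depends on the server.
const GUARD_PREFIXES = ['while(1);', 'while (1);', 'for(;;);', '&&&START&&&'];

// Strip any leading guards from the raw response text, then parse the rest.
function parseGuardedJson(text) {
  let body = text.trimStart();
  let stripped = true;
  // Repeat until nothing matches, since some responses stack two guards.
  while (stripped) {
    stripped = false;
    for (const prefix of GUARD_PREFIXES) {
      if (body.startsWith(prefix)) {
        body = body.slice(prefix.length).trimStart();
        stripped = true;
      }
    }
  }
  // JSON.parse never executes the text as script.
  return JSON.parse(body);
}

const response = 'while(1); &&&START&&& [["smsSentFlag", "false"]]';
console.log(parseGuardedJson(response));
```

A page on another origin that pulls the same URL in through a <script> tag never gets the chance to run this stripping step, which is the whole point of the prefix.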
This does not address the issue of cross-site request forgery.
It prevents disclosure of the response through JSON hijacking.
In theory, the content of HTTP responses is protected by the Same Origin Policy: pages from one domain cannot get any pieces of information from pages on the other domain (unless explicitly allowed).
An attacker can request pages on other domains on your behalf, e.g. by using a <script src=...> or <img> tag, but it can't get any information about the result (headers, contents).
Thus, if you visit an attacker's page, it couldn't read your email from gmail.com.
Except that when using a script tag to request JSON content, the JSON is executed as JavaScript in an attacker's controlled environment. If the attacker can replace the Array or Object constructor or some other method used during object construction, anything in the JSON would pass through the attacker's code, and be disclosed.
Note that this happens when the JSON is executed as JavaScript, not when it's parsed.
There are multiple countermeasures:
Making sure the JSON never executes
By placing a while(1); statement before the JSON data, Google ensures that the JSON data is never executed as JavaScript.
Only a legitimate page could actually get the whole content, strip the while(1);, and parse the remainder as JSON.
Things like for(;;); have been seen on Facebook for instance, with the same results.
Making sure the JSON is not valid JavaScript
Similarly, adding invalid tokens before the JSON, like &&&START&&&, makes sure that it is never executed.
Always return JSON with an Object on the outside
This is the OWASP-recommended way to protect against JSON hijacking and is the least intrusive one.
Similarly to the previous counter-measures, it makes sure that the JSON is never executed as JavaScript.
A valid JSON object, when not enclosed by anything, is not valid in JavaScript, since the { } gets interpreted as a code block:
eval('{"foo":"bar"}')
// SyntaxError: Unexpected token :
This is however valid JSON:
JSON.parse('{"foo":"bar"}')
// Object {foo: "bar"}
So, make sure you always return an Object at the top level of the response and make sure that the JSON is not valid JavaScript, while still being valid JSON.
As noted by #hvd in the comments, the empty object {} is valid JavaScript, and knowing the object is empty may itself be valuable information.
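To see which response shapes a <script> tag could even begin to run, here is a small sketch. The helper parsesAsScript is hypothetical. It uses the Function constructor only to compile the text, never to call it, and a function body is parsed slightly differently from a top-level script, so treat it as an approximation:

```javascript
// Returns true if `text` compiles as JavaScript statements.
// The compiled function is never called, so nothing in `text` runs.
function parsesAsScript(text) {
  try {
    new Function(text);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log(parsesAsScript('[{"foo":"bar"}]')); // true: an array literal is a valid statement
console.log(parsesAsScript('{"foo":"bar"}'));   // false: the braces are read as a block
console.log(parsesAsScript('{}'));              // true: an empty block
console.log(parsesAsScript('&&&START&&&[1]'));  // false: invalid tokens
console.log(parsesAsScript('while(1);[1]'));    // true: valid, but it would loop forever if run
```

This matches the three countermeasures above: the object wrapper and the invalid tokens fail at parse time, while the while(1); prefix parses but never reaches the data.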
Comparison of the above methods
The OWASP way is less intrusive, as it needs no client library changes and transfers valid JSON. It is unclear, however, whether past or future browser bugs could defeat it. As noted by #oriadam, it is also unclear whether data could be leaked from a parse error through an error handler (e.g. window.onerror).
Google's way requires a client library in order for it to support automatic de-serialization and can be considered to be safer with regard to browser bugs.
Both methods require server-side changes in order to avoid developers accidentally sending vulnerable JSON.
This is to ensure some other site can't do nasty tricks to try to steal your data. For example, by replacing the array constructor, then including this JSON URL via a <script> tag, a malicious third-party site could steal the data from the JSON response. By putting a while(1); at the start, the script will hang instead.
A same-site request using XHR and a separate JSON parser, on the other hand, can easily ignore the while(1); prefix.
That would be to make it difficult for a third-party to insert the JSON response into an HTML document with the <script> tag. Remember that the <script> tag is exempt from the Same Origin Policy.
Note: as of 2019, many of the old vulnerabilities that led to the preventative measures discussed in this question are no longer an issue in modern browsers. I'll leave the answer below as a historical curiosity, but really the whole topic has changed radically since 2010 (!!) when this was asked.
It prevents it from being used as the target of a simple <script> tag. (Well, it doesn't prevent it, but it makes it unpleasant.) That way bad guys can't just put that script tag in their own site and rely on an active session to make it possible to fetch your content.
edit — note the comment (and other answers). The issue has to do with subverted built-in facilities, specifically the Object and Array constructors. Those can be altered such that otherwise innocuous JSON, when parsed, could trigger attacker code.
Because the <script> tag is exempt from the Same Origin Policy, which is otherwise a security necessity on the web, adding while(1) to the JSON response prevents the response from being misused through a <script> tag.
As this is a high-traffic post, I hope to provide an answer that goes somewhat beyond the original question and gives further background on the JSON hijacking attack and its consequences.
JSON hijacking, as the name suggests, is an attack similar to Cross-Site Request Forgery in which an attacker can access cross-domain sensitive JSON data from applications that return sensitive data as array literals to GET requests. An example of a JSON call returning an array literal is shown below:
[{"id":"1001","ccnum":"4111111111111111","balance":"2345.15"},
{"id":"1002","ccnum":"5555555555554444","balance":"10345.00"},
{"id":"1003","ccnum":"5105105105105100","balance":"6250.50"}]
This attack can be achieved in 3 major steps:
Step 1: Get an authenticated user to visit a malicious page.
Step 2: The malicious page will try and access sensitive data from the application that the user is logged into. This can be done by embedding a script tag in an HTML page since the same-origin policy does not apply to script tags.
<script src="http://<jsonsite>/json_server.php"></script>
The browser will make a GET request to json_server.php and any authentication cookies of the user will be sent along with the request.
Step 3: At this point, although the malicious site has executed the script, it does not yet have access to any sensitive data. It can get access to the data by using an object prototype setter. In the code below, a setter on Object.prototype is bound to the defined function, so the function is called whenever an attempt is made to set the "ccnum" property.
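var secrets = ""; // collects every intercepted value
Object.prototype.__defineSetter__('ccnum', function (obj) {
  // called with the value whenever a "ccnum" property is assigned
  secrets = secrets.concat(" ", obj);
});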
At this point, the malicious site has successfully hijacked the sensitive financial data (ccnum) returned by json_server.php.
It should be noted that not all browsers support this method; the proof of concept was done on Firefox 3.x. This method has now been deprecated and replaced by the use of Object.defineProperty. There is also a variation of this attack that should work on all browsers, where fully named JavaScript (e.g. pi=3.14159) is returned instead of a JSON array.
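As a rough check of why the setter trick no longer works on ES5-and-later engines, the sketch below installs the same kind of hook with Object.defineProperty and then builds objects three ways. The variable names are hypothetical, and the card numbers are the sample values from the array above:

```javascript
const leaked = [];

// Attacker-style hook: a setter for "ccnum" on Object.prototype.
Object.defineProperty(Object.prototype, 'ccnum', {
  set(value) { leaked.push(value); },
  configurable: true, // so the hook can be removed again below
});

let afterLiteral, afterParse, afterAssignment;
try {
  // 1. Evaluating the literal, as a <script> tag would do with the response.
  (0, eval)('[{"id":"1001","ccnum":"4111111111111111"}]');
  afterLiteral = leaked.length;

  // 2. Parsing the same text as JSON.
  JSON.parse('[{"id":"1001","ccnum":"4111111111111111"}]');
  afterParse = leaked.length;

  // 3. A plain assignment walks the prototype chain and does hit the setter.
  const record = {};
  record.ccnum = '5555555555554444';
  afterAssignment = leaked.length;
} finally {
  delete Object.prototype.ccnum; // always remove the hook
}

console.log(afterLiteral, afterParse, afterAssignment); // 0 0 1
```

Object literals and JSON.parse define their properties directly on the new object, so the inherited setter is bypassed; only ordinary assignment triggers it.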
There are several ways in which JSON Hijacking can be prevented:
Since SCRIPT tags can only generate HTTP GET requests, only return JSON objects to POST requests.
Prevent the web browser from interpreting the JSON object as valid JavaScript code.
Implement Cross-Site Request Forgery protection by requiring that a predefined random value be required for all JSON requests.
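The first and third options can be sketched together as a server-side gate. Everything here is an assumption for illustration: the shape of the request object, the x-csrf-token header name, and the helper names are hypothetical and not taken from any particular framework:

```javascript
// Compare two strings without stopping at the first mismatch.
function sameToken(a, b) {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// `request` is a hypothetical plain object: { method, headers }.
// `sessionToken` is the random value the server issued for this session.
function mayServeJson(request, sessionToken) {
  // A <script> tag can only issue GET, so refuse anything but POST.
  if (request.method !== 'POST') return false;
  // A cross-site page cannot know the per-session token.
  const supplied = String(request.headers['x-csrf-token'] || '');
  return sameToken(supplied, sessionToken);
}

const token = 'example-session-token'; // stand-in for a random per-session value
console.log(mayServeJson({ method: 'GET', headers: {} }, token));                          // false
console.log(mayServeJson({ method: 'POST', headers: {} }, token));                         // false
console.log(mayServeJson({ method: 'POST', headers: { 'x-csrf-token': token } }, token));  // true
```

In a real application the token would be generated randomly per session, and the check would live in whatever middleware the framework provides.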
As you can see, while(1) falls under the second option. In the simplest terms, while(1); is an infinite loop with an empty body, so it never terminates. A page that pulls the response in through a <script> tag therefore never gets past the loop to the data that follows, and a JSON hijacking attempt is consistently defeated. A legitimate client, on the other hand, reads the response as text and strips the while(1); prefix before handing the rest to a JSON parser, so the loop never runs there.
So in conclusion, the while(1) prefix can be thought of as a simple guard that Google can use to control who gets to read the data.
However, the key word in that statement is 'simple'. The use of infinite-loop prefixes has thankfully faded from common practice in the years since 2010, both because a loop that does get executed burns CPU and because the web has moved away from forcing through crude quick fixes. Today preventative measures are built in, and the technique is no longer crucial. (Part of this is the move away from JSON hijacking to more fruitful data farming techniques that I won't go into at present.)
After authentication is in place, JSON hijacking protection can take a
variety of forms. Google appends while(1) into their JSON data, so
that if any malicious script evaluates it, the malicious script enters
an infinite loop.
Reference: Web Security Testing Cookbook: Systematic Techniques to Find Problems Fast
Related
OK, so I'm developing a web app that has become more and more ajaxified. I then read a blog post that talked about JavaScript hijacking, and I'm a little confused about when it's actually a problem. I want some clarification.
Question 1:
Is this the problem/vulnerability?
If my site returns json data with a 'GET' request that has sensitive
information then that information can get into the wrong hands.
I use ASP.NET MVC and the method that returns JSON requires you to explicitly allow json get requests. I'm guessing that they are trying to save the uninitiated from this security vulnerability.
Question 2:
Does the hijacking occur by sniffing/reading the response as it's being sent through the internet? Does SSL mitigate that attack?
Question 3:
This led me to ask myself this question: if I'm storing page state in local JavaScript object(s) of the page, can someone (other than the logged-in user) hijack that data?
Question 4:
Can I safely mitigate against THIS vulnerability by only returning JSON with a 'POST' request?
The post you linked to is talking about CSRF & XSS (see my comment on the question), so in that context:
Is this the problem/vulnerability ("If my site returns json data with a 'GET' request that has sensitive information then that information can get into the wrong hands.")?
No.
Does the hijacking occur by sniffing/reading the response as it's being sent through the internet?
No.
If I'm storing page state in local JavaScript object(s) of the page, can someone (other than the logged-in user) hijack that data?
It depends. It depends on whether you're storing the data in cookies and haven't set the right domain, or path. It depends on whether there's a security vulnerability on the client browser that would allow a script to gain access to data that typically is restricted. There are numerous other vectors of attack, and new ones are discovered all the time. The long and the short of it is: don't trust the browser with any confidential or secure data.
Can I safely mitigate against THIS vulnerability by only returning JSON with a 'POST' request?
No (it's not a single vulnerability, it's a set of classes of vulnerabilities).
Well, you can check whether the request was a GET and whether it came from the correct referrer.
You are not really much safer serving it only in response to a POST, because that is just as easy to simulate.
In general, there are a lot of things you can do to prevent cross-site request forgery and manipulation.
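As a rough illustration of the referrer check, here is a sketch that compares the origin of the Referer header with the origin you expect. The function name isTrustedReferer and the example origins are assumptions; also keep in mind that the header is supplied by the client, so a non-browser client can send whatever it likes.

```javascript
// Hypothetical check: does the Referer header point at our own origin?
function isTrustedReferer(refererHeader, expectedOrigin) {
  if (!refererHeader) {
    return false; // header missing: treat the request as untrusted
  }
  try {
    // Compare parsed origins (scheme + host + port), not string prefixes.
    return new URL(refererHeader).origin === expectedOrigin;
  } catch (err) {
    return false; // not a parseable absolute URL
  }
}

console.log(isTrustedReferer("http://example.com/index.html", "http://example.com"));
console.log(isTrustedReferer("http://example.com.evil.test/", "http://example.com"));
```

Parsing the URL first avoids the classic mistake of a prefix match, which would accept a host such as example.com.evil.test.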
The actual vulnerability is being able to overwrite Array.
If one overwrites the native Array, then one gets access to the JSON data that's constructed as an Array.
This vulnerability has been patched in all major browsers.
You should only worry about this if your clients are using insecure browsers.
Example:
window.Array = function() {
console.log(arguments);
// send to secret server
}
...
$.get(url, function(data) { ... });
When the response is evaluated in a vulnerable browser, any array in the returned JSON causes the browser to call window.Array, and the data in that array then gets sent to the secret server.
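If you want to convince yourself that the hole is closed in your own environment, a sketch like the following can help. It assumes an ES5-or-later engine that provides globalThis; the helper name countArrayConstructorCalls is made up for this example. It temporarily replaces the global Array, evaluates a piece of source text, and counts how often the replacement was called.

```javascript
// Hypothetical helper: count how often an overridden global Array is called
// while `source` is evaluated. The original Array is always restored.
function countArrayConstructorCalls(source) {
  const OriginalArray = globalThis.Array;
  let calls = 0;
  globalThis.Array = function FakeArray() {
    calls += 1;
  };
  try {
    (0, eval)(source); // indirect eval: evaluates in the global scope
  } finally {
    globalThis.Array = OriginalArray;
  }
  return calls;
}

// Array literals are built by the engine's intrinsic Array, so the
// override is not consulted.
console.log(countArrayConstructorCalls('[1, [2, 3]]'));
// An explicit constructor call does look up the global name.
console.log(countArrayConstructorCalls('new Array(1, 2)'));
```

In an engine that follows ES5 the first call reports 0, because array literals no longer go through the global Array binding, while the second reports 1, which shows that the override itself was in effect.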
I have a JSON web service to return home markers to be displayed on my Google Map.
Essentially, http://example.com calls the web service to find out the location of all map markers to display like so:
http://example.com/json/?zipcode=12345
And it returns a JSON string such as:
{"address": "321 Main St, Mountain View, CA, USA", ...}
So on my index.html page, I take that JSON string and place the map markers.
However, what I don't want to have happen is people calling out to my JSON web service directly.
I only want http://example.com/index.html to be able to call my http://example.com/json/ web service ... and not some random dude calling the /json/ directly.
Question: how do I prevent direct calling/access to my http://example.com/json/ web service?
UPDATE:
To give more clarity, http://example.com/index.html calls http://example.com/json/?zipcode=12345 ... and the JSON service
- returns semi-sensitive data,
- returns a JSON array,
- responds to GET requests,
- the browser making the request has JavaScript enabled
Again, what I don't want to have happen is people simply looking at my index.html source code and then calling the JSON service directly.
There are a few good ways to authenticate clients.
By IP address. In Apache, use the Allow / Deny directives.
By HTTP auth: basic or digest. This is nice and standardized, and uses usernames/passwords to authenticate.
By cookie. You'll have to come up with the cookie.
By a custom HTTP header that you invent.
Edit:
I didn't catch at first that your web service is being called by client-side code. It is literally NOT POSSIBLE to prevent people from calling your web service directly if you let client-side JavaScript do it. Someone could just read the source code.
Some more specific answers here, but I'd like to make the following general point:
Anything done over AJAX is being loaded by the user's browser. You could make a hacker's life hard if you wanted to, but, ultimately, there is no way of stopping me from getting data that you already freely make available to me. Any service that is publicly available is publicly available, plain and simple.
If you are using Apache you can set allow/deny on locations.
http://www.apachesecurity.net/
or here is a link to the apache docs on the Deny directive
http://httpd.apache.org/docs/2.0/mod/mod_access.html#deny
EDITS (responding to the new info).
The Deny directive also works with environment variables. You can restrict access based on browser string (not really secure, but discourages casual browsing) which would still allow XHR calls.
I would suggest the best way to accomplish this is to have a token of some kind that validates the request is a 'good' request. You can do that with a cookie, a session store of some kind, or a parameter (or some combination).
What I would suggest for something like this is to generate a unique url for the service that expires after a short period of time. You could do something like this pretty easily with Memcache. This strategy could also be used to obfuscate the service url (which would not provide any actual security, but would raise the bar for someone wanting to make direct calls).
Lastly, you could also use public key crypto to do this, but that would be very heavy. You would need to generate a new pub/priv key pair for each request and return the pubkey to the js client (here is a link to an implementation in javascript) http://www.cs.pitt.edu/~kirk/cs1501/notes/rsademo/
You can add a random number as a flag to determine whether the request is coming from the page that was just sent:
1) When generating index.html, add a random number to the JSON request URL:
Old: http://example.com/json/?zipcode=12345
New: http://example.com/json/?zipcode=12345&f=234234234234234234
Add this number to the Session Context as well.
2) The client browser renders index.html and requests the JSON data via the new URL.
3) Your server gets the JSON request and checks the flag number against the Session Context. If they match, it returns the data. Otherwise, it returns an error message.
4) Clear the flag from the Session Context at the end of the response, or when a timeout is triggered.
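The four steps above can be sketched as follows. This is only an illustration under stated assumptions: it runs on Node.js, the Map stands in for whatever Session Context your framework provides, and the names issueFlag, checkFlag and sessionContext are invented for the example. The timeout from step 4 is left out.

```javascript
const nodeCrypto = require("crypto");

// In-memory stand-in for the server's Session Context (an assumption;
// a real application would use its framework's session store).
const sessionContext = new Map();

// Step 1: while rendering index.html, create the flag and remember it.
function issueFlag(sessionId) {
  const flag = nodeCrypto.randomBytes(16).toString("hex");
  sessionContext.set(sessionId, flag);
  return flag; // embed as &f=<flag> in the JSON request URL
}

// Steps 3 and 4: compare the flag, then clear it so it cannot be reused.
function checkFlag(sessionId, flagFromRequest) {
  const expected = sessionContext.get(sessionId);
  sessionContext.delete(sessionId); // cleared whether or not it matched
  return expected !== undefined && expected === flagFromRequest;
}

const flag = issueFlag("session-abc");
console.log(checkFlag("session-abc", flag)); // first use of the flag
console.log(checkFlag("session-abc", flag)); // replay of the same flag
```

Because the flag is cleared on the first check, replaying a URL copied out of the page source fails the second time. It does not stop someone from loading index.html again to obtain a fresh flag.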
Accept only POST requests to the JSON-yielding URL. That won't prevent determined people from getting to it, but it will prevent casual browsing.
I know this is old but for anyone getting here later this is the easiest way to do this. You need to protect the AJAX subpage with a password that you can set on the container page before calling the include.
The easiest way to do this is to require HTTPS on the AJAX call and pass a POST variable. HTTPS ensures that the POST body, including the password, is always encrypted in transit.
So on the AJAX/sub-page do something like
if (isset($_POST["access"]) && $_POST["access"] === "makeupapassword")
{
    ...
}
else
{
    echo "You can't access this directly";
}
When you call the AJAX page, make sure to include the POST variable and password in your payload. Since the request goes over HTTPS, the POST body will be encrypted, and since the password is random (hopefully), nobody will be able to guess it.
If you want to include or require the PHP directly on another page, just set the POST variable to the password before including it.
$_POST["access"] = "makeupapassword";
require("path/to/the/ajax/file.php");
This is a lot better than maintaining a global variable, session variable, or cookie because some of those are persistent across page loads so you have to make sure to reset the state after checking so users can't get accidental access.
Also, I think it is better than page headers because it can't be sniffed, since it is secured by HTTPS.
You'll probably have to have some kind of cookie-based authentication. In addition, Ignacio has a good point about using POST. This can help prevent JSON hijacking if you have untrusted scripts running on your domain. However, I don't think using POST is strictly necessary unless the outermost JSON type is an array. In your example it is an object.
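The difference between a top-level array and a top-level object can be checked with a small sketch. It uses new Function only to compile the text (nothing is executed), which is a rough proxy for what a <script> tag would do with the response; the helper name compilesAsScript is made up for this example.

```javascript
// Returns true if `source` compiles as JavaScript. The code is only
// compiled into a function body here, never run.
function compilesAsScript(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) {
      return false;
    }
    throw err; // anything else (e.g. a CSP restriction) is not our concern
  }
}

// A top-level array is a valid expression statement.
console.log(compilesAsScript('[{"address": "321 Main St"}]'));
// A top-level object is read as a block, and "address": is a syntax error.
console.log(compilesAsScript('{"address": "321 Main St"}'));
```

The array response compiles, so it could be pulled in through a <script> tag, whereas the object response is rejected by the parser before any of it runs. Note that the empty object {} does compile, since it reads as an empty block.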