I'm migrating the front-end of a site from an old YUI2 framework to jQuery/Backbone. The PHP/MySQL back-end hasn't changed. All is well, except UTF-8 characters sent via Backbone save (via $.ajax) are getting mangled and I can't figure out why.
Here's what I do know:
The backend handles UTF-8 fine. It hasn't changed as part of this rebuild. I know that's true, because when I change the config to load the old YUI2 front-end, UTF-8 characters work fine. They're escaped in Javascript using escape(string), passed via YAHOO.util.Connect.asyncRequest as JSON in an XMLHttpRequest, unescaped and saved in the database as UTF-8, fully readable and nice.
In the new front-end, I've added <meta charset="UTF-8"> and <meta http-equiv="content-type" content="text/html; charset=UTF-8"> to all page headers. The old front-end didn't have these settings. I only mention that because it's a difference.
In the new front-end, UTF-8 characters work fine when I save them as a <form> submit.
In the new front-end, the request Content-Type looks fine in the console: Content-Type: application/x-www-form-urlencoded; charset=UTF-8
How am I passing data in the new front-end?
Sometimes via a regular Backbone model.save(), other times passing data in options like this:
var text = $('#input-' + targetId).val();
var atts = {};
atts['target_id'] = targetId;
atts['user_id'] = userId;
atts['text'] = text;
var comment = new Comment(atts);
comment.save(
  {},
  {
    type: 'POST',
    url: '/api/comment?',
    data: atts,
    processData: true,
    success: function(comment, response) {
      // success handling
    },
    error: function(model, response) {
      // error handling
    }
  }
);
So, what do these mangled special characters look like?
As entered in the input: テクス テクサン テクス テクサン
When I pass completely unescaped, they look fine in the request in the console in the Form Data section: text: テクス テクサン テクス テクサン, but mangled in the database as ãã¯ã¹ ãã¯ãµã³ ãã¯ã¹ ãã¯ãµã³. Perhaps this is a clue, I don't know. I've always escaped user-entered text when passing via AJAX.
When I escape(text), I get text:%u30C6%u30AF%u30B9%20%u30C6%u30AF%u30B5%u30F3%20%u30C6%u30AF%u30B9%20%u30C6%u30AF%u30B5%u30F3 in the console, and テクス%20テクサン%20テクス%20テクサン in the database.
That's better, but it's different from the old front end, which uses escape(text), passes %u30C6%u30AF%u30B9%20%u30C6%u30AF%u30B5%u30F3%20%u30C6%u30AF%u30B9%20%u30C6%u30AF%u30B5%u30F3, shows in the console as text: (unable to decode value) and saves in the database unescaped as テクス テクサン テクス テクサン
Of course, it's 2016 now and we all know escape() should not be used. We should use encodeURIComponent() instead. So, when I encodeURIComponent(text), here's what I get in the console: text: %E3%83%86%E3%82%AF%E3%82%B9%20%E3%83%86%E3%82%AF%E3%82%B5%E3%83%B3%20%E3%83%86%E3%82%AF%E3%82%B9%20%E3%83%86%E3%82%AF%E3%82%B5%E3%83%B3 which is saved in the database as %E3%83%86%E3%82%AF%E3%82%B9%20%E3%83%86%E3%82%AF%E3%82%B5%E3%83%B3%20%E3%83%86%E3%82%AF%E3%82%B9%20%E3%83%86%E3%82%AF%E3%82%B5%E3%83%B3 That technically works, and I can always decodeURIComponent when displaying this text, but that's a real pain and it's just masking the issue.
I've also tried unescape(encodeURIComponent(text)) with the following result: text:ãã¯ã¹ ãã¯ãµã³ ãã¯ã¹ ãã¯ãµã³ in the console, ãÂÂã¯ã¹ ãÂÂã¯ãµã³ ãÂÂã¯ã¹ ãÂÂã¯ãµã³ in the database.
It seems that there's some sort of double-encoding going on, or perhaps the back-end was built to handle the specific format that's passed via the YUI2 Async request. I don't know.
Any ideas for what I should try next? What are the best practices?
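For reference, the three client-side encodings tried above produce quite different strings. The sketch below reproduces them for a shortened version of the sample text; it should run in Node.js or a browser console, and the variable names are only illustrative.

```javascript
// Sample text from the question (shortened to one phrase).
const sample = 'テクス テクサン';

// escape() is a legacy function: characters outside Latin-1 become %uXXXX,
// a non-standard form that ordinary URL decoders do not understand.
const legacy = escape(sample);

// encodeURIComponent() percent-encodes the UTF-8 bytes of each character.
const utf8Encoded = encodeURIComponent(sample);

// unescape(encodeURIComponent()) turns every UTF-8 byte into one character,
// which is why the result looks like mojibake in the console.
const byteString = unescape(encodeURIComponent(sample));

console.log(legacy);
console.log(utf8Encoded);
console.log(byteString.length); // 7 kana x 3 bytes + 1 space
```

Only the second form is standard percent-encoding of UTF-8, which is what a form-encoded request body is expected to carry.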
Now that I've had a night to sleep on it, I've realized a few things and I think I've found a solution.
It's clear now that the old front-end wasn't passing data correctly...that's evidenced by the text: (unable to decode value) in the console when sending the request. Somehow, the PHP back-end was able to handle the passed text even though there was no decoding in the api or db storage classes. That's a mystery for another day.
Here's what I did to fix the problem:
Pass text from the front-end as encodeURIComponent(text)
Decode the text in the PHP back-end api using $comment->set_text(urldecode(Request::get('text')));
The text is stored in the DB unescaped as readable UTF-8 characters and I don't need to do anything special on read/display. I will need to add the urldecode to all of my api endpoints on the back-end, but that feels like a solid approach, so I think it's resolved.
I'd be interested to hear thoughts on the use of encodeURIComponent on the front-end and urldecode on the back-end. Is this the best way to solve the problem?
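One possible reading of the symptoms, offered as an assumption rather than a diagnosis: with processData: true, jQuery already percent-encodes the values in the data object, so calling encodeURIComponent first encodes the text twice. The sketch below uses URLSearchParams as a stand-in for jQuery's serializer and for the server's form parsing; formEncode and formDecode are hypothetical helper names.

```javascript
const text = 'テクス';

// Stand-in for the client serializer: URLSearchParams form-encodes values,
// much as jQuery does for a plain `data` object (assumption: the two differ
// only in details such as how spaces are written).
function formEncode(fields) {
  return new URLSearchParams(fields).toString();
}

// Stand-in for the server: one round of form decoding.
function formDecode(body) {
  return Object.fromEntries(new URLSearchParams(body));
}

// Raw value: encoded once on the wire, decoded once on the server.
const rawBody = formEncode({ text: text });

// Pre-encoded value: every '%' becomes '%25', so one decode is not enough.
const doubleBody = formEncode({ text: encodeURIComponent(text) });

console.log(rawBody);
console.log(doubleBody);
console.log(formDecode(rawBody).text);
console.log(formDecode(doubleBody).text);
```

If that reading is right, the urldecode call on the back-end is undoing the second layer of encoding, since PHP normally decodes form fields once on its own.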
I worked on a mvc program where on a click of a button I send data inside a TextArea to the controller. Typically my code worked as expected but this was not the case when the data inside the TextArea consisted of a html based data.
Html Code:
<textarea id="bodyInfo"> </textarea>
<button onclick="Submit()" id="submitInfo">Create Notification</button>
ajax:
function Submit() {
  $.ajax({
    url: '@Url.Action("GetResults", "notification")',
    type: 'GET',
    data: { body: $('#bodyInfo').val() },
    cache: false,
    dataType: 'html',
    success: function (result) {
      $('#resultsTblInfo').html(result);
    }
  });
  return false;
}
Sample Example of TextArea data:
<table style="width:100%">
<tbody>
<tr>
<td class="ellipses-title" style="color:#22557f;font-size:15px;font-weight:bold;margin-bottom:10px;margin-top:7px;border-right:1px solid #BFF1FD;text-align:right;padding-right:15px;"> The New Admin is Coming</td>
<td class="ellipses-text" style="padding-left:15px;" valign="center">Hello<br> Hello2</a>.
</td>
</tr>
</tbody>
</table>
The code above works when the TextArea contains plain text, but fails when it contains HTML.
I was able to make this work by using
encodeURIComponent($('#bodyInfo').val()) instead of $('#bodyInfo').val()
and, on the controller side, doing
body = HttpUtility.UrlDecode(body);
Are there better alternatives to achieve the same thing? Am I using encodeURIComponent against its intended nature? Why does $('#bodyInfo').val() not work when passing HTML values through ajax? If this question is a duplicate, I would appreciate it if someone could provide me a link (I tried searching Google but found no satisfying answer).
That's exactly what you'd use encodeURIComponent for. The idea is to encode the (potentially problematic) value when sending it, then decode it server-side in order to use it however it is needed.
In fact, there's an example of this on MDN's encodeURIComponent documentation page:
To avoid unexpected requests to the server, you should call encodeURIComponent on any user-entered parameters that will be passed as part of a URI. For example, a user could type "Thyme &time=again" for a variable comment. Not using encodeURIComponent on this variable will give comment=Thyme%20&time=again. Note that the ampersand and the equal sign mark a new key and value pair.
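The MDN example can be checked directly. This is a small sketch that uses URLSearchParams to stand in for the server's query-string parsing (an assumption; a real framework may differ in details):

```javascript
const comment = 'Thyme &time=again';

// Without encoding, the '&' and '=' inside the value are read as delimiters.
// (The space is written as %20 here, as in the MDN example.)
const unsafeQuery = 'comment=' + comment.replace(/ /g, '%20');

// With encoding, they travel as %26 and %3D and stay inside the value.
const safeQuery = 'comment=' + encodeURIComponent(comment);

const unsafeParsed = Object.fromEntries(new URLSearchParams(unsafeQuery));
const safeParsed = Object.fromEntries(new URLSearchParams(safeQuery));

console.log(unsafeQuery, unsafeParsed);
console.log(safeQuery, safeParsed);
```

The unencoded query is parsed as two parameters (comment and time), while the encoded one comes back as a single comment value.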
I think there was a "&" character in your textarea HTML data.
If so, the problem is that an unencoded & cannot be sent as part of a value. When sending data, the browser joins the key/value pairs with the & character, as in http://example.php?a=something&b=other. So if your data contains a & character, the server cannot tell whether it is part of the data or a separator between parameters.
One solution is to transform your data to base64 text before it is sent. You can base64-encode your data using the window.btoa function in JavaScript and decode it on your server using the base64_decode function (in the case of PHP). Another solution is the one you used.
A third solution is to replace & characters with something you know will never appear in your textarea, like [##alpha].
hope it helps
Is it possible to encode, obfuscate, or otherwise hide what I'm sending via WebSockets? Example message:
var msg = {
  type: "message",
  data: {
    message: "Hello world!"
  }
}
I need to send this message in an unreadable form and decode it back to a readable version on the server side.
I want to hide it from the Chrome console Network tab (WS).
The data is generated on the user side, so you can't really hide the data from him.
But you can encode it any way you want before sending, just to hide it in the Chrome console, like this:
var msg = {
  type: "message",
  data: {
    message: btoa("Hello world!")
  }
};
At the server side you just need to:
atob(message);
Keep in mind: a motivated user can grab the data before the encoding happens, or grab it afterwards and run atob() himself. There is no 100% reliable way to block this; it's the client's data.
If you want to make his life harder, you can try to use crypto libraries with RSA, but again, he can capture it before the crypto begins.
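To make the point concrete, here is a minimal round trip. It is a sketch that assumes a Node.js environment and therefore uses Buffer in place of the browser's btoa/atob; for this ASCII message the base64 output is the same.

```javascript
// Client side (browser): btoa("Hello world!") produces the same string as
// this Node.js equivalent, used here so the sketch runs outside a browser.
const encoded = Buffer.from('Hello world!', 'latin1').toString('base64');

const msg = { type: 'message', data: { message: encoded } };
const wire = JSON.stringify(msg);

// Anyone who sees the frame can reverse the encoding in one step.
const seen = JSON.parse(wire).data.message;
const decoded = Buffer.from(seen, 'base64').toString('latin1');

console.log(wire);
console.log(decoded);
```

Base64 is an encoding, not encryption: it keeps the text from being readable at a glance in the Network tab, nothing more.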
I have a very short piece of PHP that I use to make HTTP requests from JavaScript.
<?php echo file_get_contents($_GET['url']); ?>
I have used it successfully in a few projects, but am running into a problem with making requests in my current project. Based on my searching, I believe it may be caused by the underscore in the request, though through my searching and not knowing PHP, I have not been able to confirm that.
Below is an example of what I am doing from JavaScript:
$.get("grabber.php?url=" + "http://tidesandcurrents.noaa.gov/api/datagetter?station=8573364&begin_date=20160202&end_date=20160203&product=predictions&units=english&time_zone=gmt&format=json&application=poseidonweathercom+&datum=MLLW", function(forecast) {
console.log(forecast);
});
If I copy the url and put in it in a browser, I get back the JSON that I requested. When I use the code above, I end up getting an error message from NOAA:
Wrong Product : Product cannot be null or empty Wrong Time zone: Time zone cannot be null or empty Wrong Unit:Unit cannot be null or empty Wrong Format: Format cannot be null or empty Wrong Date: The beginDate cannot be null or empty
Do I need to use a regex for the underscore in PHP? Is there some other issue that I do not understand?
Thanks.
You need to send it encoded, which replaces the ampersands, equals signs, question marks, etc. with their percent-encoded equivalents (underscores are left as they are; they are not the problem):
var url = "http://tidesandcurrents.noaa.gov/api/datagetter?station=8573364&begin_date=20160202&end_date=20160203&product=predictions&units=english&time_zone=gmt&format=json&application=poseidonweathercom+&datum=MLLW";
$.get("grabber.php?url=" + encodeURIComponent(url), function(forecast){
  console.log(forecast);
});
Using encodeURIComponent() on that URL shows:
http%3A%2F%2Ftidesandcurrents.noaa.gov%2Fapi%2Fdatagetter%3Fstation%3D8573364%26begin_date%3D20160202%26end_date%3D20160203%26product%3Dpredictions%26units%3Denglish%26time_zone%3Dgmt%26format%3Djson%26application%3Dposeidonweathercom%2B%26datum%3DMLLW
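To see why the unencoded request loses its parameters, the sketch below parses both versions of the grabber.php request. It uses a shortened NOAA URL and a hypothetical localhost host, and it assumes PHP splits the query string the same way URLSearchParams does here.

```javascript
// Shortened version of the NOAA URL from the question.
const target = 'http://tidesandcurrents.noaa.gov/api/datagetter' +
  '?station=8573364&begin_date=20160202&format=json';

// Hypothetical host for grabber.php, needed only so URL() can parse it.
const base = 'http://localhost/grabber.php?url=';

const unencoded = new URL(base + target);
const encoded = new URL(base + encodeURIComponent(target));

// Unencoded: "url" stops at the first '&'; the remaining pairs become
// parameters of grabber.php itself and never reach NOAA.
console.log(unencoded.searchParams.get('url'));
console.log(unencoded.searchParams.get('begin_date'));

// Encoded: the whole target URL arrives intact in "url".
console.log(encoded.searchParams.get('url') === target);
```

That matches the error from NOAA: only the station parameter reaches the API, so every other field is reported as null or empty.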
Alternatively, if you just want to access the JSON data and handle it within the JavaScript function, you can retrieve the data via the URL directly, without having to encode the URL:
$.get("http://tidesandcurrents.noaa.gov/api/datagetter?station=8573364&begin_date=20160202&end_date=20160203&product=predictions&units=english&time_zone=gmt&format=json&application=poseidonweathercom+&datum=MLLW", function(forecast) {
console.log(forecast);
});
Um why do you even need your php code ... the code below will work just fine and eliminate your server overhead.
$.get("http://tidesandcurrents.noaa.gov/api/datagetter?station=8573364&begin_date=20160202&end_date=20160203&product=predictions&units=english&time_zone=gmt&format=json&application=poseidonweathercom+&datum=MLLW", function(forecast) {
console.log(forecast);
});
I have a data javascript file, which is being dynamically added to website via some custom code.
This file comes from a third party vendor, who could potentially add malicious code in the file
Before this file is added to the website, I would like to parse through it, and look for malicious code, such as redirects or alerts, that inherently get executed upon a files inclusion in the project/website.
For example, my js file could look like this :
alert ('i am malicious');
var IAmGoodData =
[
  { Name :'test', Type:'Test2' },
  { Name :'test1', Type:'Test21' },
  { Name :'test2', Type:'Test22' }
]
I load this file into a object via a XMLHttpRequest call, and when this call returns, I can use the variable (which is my file text) and search it for words:
var client = new XMLHttpRequest();
client.open('GET', 'folder/fileName.js');
client.onreadystatechange = function()
{
  ScanText(client.responseText);
}
client.send();
function ScanText(text)
{
  alert(text);
  var index = text.search('alert'); //Here i can search for keywords
}
The last line would return index of 0, as the word alert is found at index 0 in the file.
Questions:
Is there a more efficient way to search for keywords in the file?
What specific keywords should i be searching for to prevent malicious code being run? ie redirects, popups, sounds etc.....
Instead of having them include var IAmGoodData =, make them simply provide JSON (which is basically what the rest of the file is, or seems to be). Then you parse it as JSON, using JSON.parse(). If it fails, they either didn't follow the JSON format well, or have external code, and in either case you would ignore the response.
For example, you'd expect data from the external file like:
[
{ Name :'test', Type:'Test2' },
{ Name :'test1', Type:'Test21' },
{ Name :'test2', Type:'Test22' }
]
which needs to be properly serialized as JSON (double quotes instead of single quotes, and double quotes around the keys). In your code, you'd use:
var json;
try {
  json = JSON.parse(client.responseText);
} catch (ex) {
  // Invalid JSON
}
if (json) {
  // Do something with the response
}
Then you could loop over json and access the Name and Type properties of each.
Random Note:
In your client.onreadystatechange callback, make sure you check client.readyState === 4 && client.status === 200, to know that the request was successful and is done.
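Going one step further, you could also check the shape of the parsed data before using it. The following is only a sketch: parseVendorData is a hypothetical helper, and the expected shape (an array of objects with string Name and Type) is taken from the sample data above.

```javascript
// Parse vendor text as JSON and check its shape; returns null on any problem.
function parseVendorData(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch (ex) {
    return null; // not valid JSON, e.g. the file contains code
  }
  if (!Array.isArray(data)) {
    return null;
  }
  const ok = data.every(function (item) {
    return item !== null && typeof item === 'object' &&
      typeof item.Name === 'string' && typeof item.Type === 'string';
  });
  return ok ? data : null;
}

const good = '[{"Name":"test","Type":"Test2"},{"Name":"test1","Type":"Test21"}]';
const bad = "alert('i am malicious');\n" + good;

console.log(parseVendorData(good)); // array of two entries
console.log(parseVendorData(bad));  // null, because the alert() line is not JSON
```

Nothing in the file is ever executed; text that is not pure JSON is simply rejected.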
This is extremely difficult to do. There are no intrinsically malicious keywords or functions in JavaScript, there are malicious applications. You could be getting false positives for "malicious" activity and prevent a legitimate code with a real purpose from being executed. And at the same time, anyone with a little bit of imagination could bypass any "preventive" method you may implement.
I'd suggest you look for a different approach. This is one of those problems (like CAPTCHA) that is trivial for a human to solve but practically impossible for a machine. You could try having a moderator or some other human evaluator review the code and accept it.
You should have them provide valid JSON rather than arbitrary Javascript.
You can then call JSON.parse() to read their data without any risk of code execution.
In short, data is not code, and should not be able to contain code.
You shouldn't. The user should be allowed to type whatever they want, and it's your job to display it.
It all depends on where it is being put, of course:
Database: mysql_real_escape_string or equivalent for whatever engine you're using.
HTML: htmlspecialchars in PHP, createTextNode or .replace(/</g,"&lt;") in JavaScript
JavaScript: json_encode in PHP, JSON.stringify in JavaScript.
At the end of the day, just don't be Yahoo
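As a rough illustration of the HTML and JavaScript cases on the JavaScript side, here is a sketch. escapeHtml is a hypothetical helper that covers a few more characters than the single replace mentioned above.

```javascript
// Escape text for insertion into HTML markup.
// '&' must be replaced first so the entities added afterwards are not re-escaped.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const userInput = '<b>"Tom" & Jerry</b>';

// HTML context: markup characters become entities.
console.log(escapeHtml(userInput));

// JavaScript context: JSON.stringify yields a quoted, escaped string literal.
console.log(JSON.stringify(userInput));
```

The stored value stays exactly what the user typed; the escaping is applied only at the point of output, according to where the text is going.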
I am working with JMeter to write some performance tests. One of the things that I need to do is to construct a huge JSON request dynamically and send it as a POST request payload. Using a BSF preprocessor, I am able to modify the payload dynamically; however, my JavaScript string is being encoded, while I want to send it without being encoded.
I am not sure how BSF preprocessor can stop it from being encoded. The command I currently use to change my POST request payload is as follows:
var jsonData = '[{"item":"value","something":"everything"}]';
sampler.addArgument("",jsonData);
I would really appreciate if you can point me to some examples which clearly explain how bsf preprocessors are expected to be used.
Any pointers to skip the encoding will also be appreciated.
Since JMeter 2.6 you can use the RAW request pane using Post Body tab.
So your solution is to do the following:
In BSF Sampler, put your JSON in a variable:
var jsonData = '[{"item":"value","something":"everything"}]';
vars.putObject("jsonData",jsonData);
In Post Body, put:
${jsonData}
Another option using your method is to put in BSFPreProcessor using Beanshell language (not javascript):
import org.apache.jmeter.protocol.http.util.HTTPArgument;
String jsonData = "[{\"item\":\"value\",\"something\":\"everything\"}]";
HTTPArgument arg =new HTTPArgument("", jsonData, null, true);
arg.setAlwaysEncoded(false);
sampler.getArguments().addArgument(arg);
Regards
Philippe M.
Set the property "HTTPArgument.always_encode" on your sampler to false; this should disable argument encoding.