Why is my search() unable to find special characters? - javascript

I think this is a character encoding problem. My javascript search fails to identify a string whenever the string contains certain special characters such as ()* parenthesis, asterisks, and numerals.
The javascript is fairly simple. I search a string (str) for the value (val):
n = str.search(val);
The string I search was written to a txt file using PHP...
$scomment = $_POST['order'];
$data = stripslashes("$scomment\n");
$fh = fopen("comments.txt", "a");
fwrite($fh, $data);
fclose($fh);
...then retrieved with PHP...
$search = $_GET["name"];
$comments = file_get_contents('comments.txt');
$array = explode("~",$comments);
foreach($array as $item){if(strstr($item, $search)){
echo $item; } }
...and put into my HTML using AJAX...
xmlhttp.open("GET","focus.php?name="+str,true);
xmlhttp.send();
My HTML is encoded as UTF-8.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
The problem is that the when I search() for strings, special characters are not recognized.
I didn't really see an option to encode the PHP created text, so I assumed it was UTF-8 too. Is that assumption incorrect?
If this is an encoding problem, how do I change the encoding of the txt file written with PHP? If it is caused by the above code, where is the problem?

The search() function takes a regex. If a string is given, it is converted into one. Because of this, any special characters have special meanings, and it will not work as expected. You need to escape special characters.
Syntax
str.search(regexp)
Parameters
regexp
A regular expression object. If a non-RegExp object obj is passed, it is implicitly converted to a RegExp by using new RegExp(obj).
If you want to search for text, use str.indexOf() instead, unless you NEED a regex. If so, see here on how to escape the string: Is there a RegExp.escape function in Javascript?

Include this function in your code:
function isValid(str){
return !/[~`!#$%\^&*+=\-\[\]\\';,/{}|\\":<>\?]/g.test(str);
}
Hope it helps

Related

JavaScript functions breaking with comma in string from PHP [duplicate]

This question already has answers here:
How do I pass variables and data from PHP to JavaScript?
(19 answers)
Closed 8 years ago.
The community reviewed whether to reopen this question 1 year ago and left it closed:
Original close reason(s) were not resolved
What is the easiest way to encode a PHP string for output to a JavaScript variable?
I have a PHP string which includes quotes and newlines. I need the contents of this string to be put into a JavaScript variable.
Normally, I would just construct my JavaScript in a PHP file, à la:
<script>
var myvar = "<?php echo $myVarValue;?>";
</script>
However, this doesn't work when $myVarValue contains quotes or newlines.
Expanding on someone else's answer:
<script>
var myvar = <?php echo json_encode($myVarValue); ?>;
</script>
Using json_encode() requires:
PHP 5.2.0 or greater
$myVarValue encoded as UTF-8 (or US-ASCII, of course)
Since UTF-8 supports full Unicode, it should be safe to convert on the fly.
Note that because json_encode escapes forward slashes, even a string that contains </script> will be escaped safely for printing with a script block.
encode it with JSON
function escapeJavaScriptText($string)
{
return str_replace("\n", '\n', str_replace('"', '\"', addcslashes(str_replace("\r", '', (string)$string), "\0..\37'\\")));
}
I have had a similar issue and understand that the following is the best solution:
<script>
var myvar = decodeURIComponent("<?php echo rawurlencode($myVarValue); ?>");
</script>
However, the link that micahwittman posted suggests that there are some minor encoding differences. PHP's rawurlencode() function is supposed to comply with RFC 1738, while there appear to have been no such effort with Javascript's decodeURIComponent().
The paranoid version: Escaping every single character.
function javascript_escape($str) {
$new_str = '';
$str_len = strlen($str);
for($i = 0; $i < $str_len; $i++) {
$new_str .= '\\x' . sprintf('%02x', ord(substr($str, $i, 1)));
}
return $new_str;
}
EDIT: The reason why json_encode() may not be appropriate is that sometimes, you need to prevent " to be generated, e.g.
<div onclick="alert(???)" />
<script>
var myVar = <?php echo json_encode($myVarValue); ?>;
</script>
or
<script>
var myVar = <?= json_encode($myVarValue) ?>;
</script>
Micah's solution below worked for me as the site I had to customise was not in UTF-8, so I could not use json; I'd vote it up but my rep isn't high enough.
function escapeJavaScriptText($string)
{
return str_replace("\n", '\n', str_replace('"', '\"', addcslashes(str_replace("\r", '', (string)$string), "\0..\37'\\")));
}
Don't run it though addslashes(); if you're in the context of the HTML page, the HTML parser can still see the </script> tag, even mid-string, and assume it's the end of the JavaScript:
<?php
$value = 'XXX</script><script>alert(document.cookie);</script>';
?>
<script type="text/javascript">
var foo = <?= json_encode($value) ?>; // Use this
var foo = '<?= addslashes($value) ?>'; // Avoid, allows XSS!
</script>
You can insert it into a hidden DIV, then assign the innerHTML of the DIV to your JavaScript variable. You don't have to worry about escaping anything. Just be sure not to put broken HTML in there.
You could try
<script type="text/javascript">
myvar = unescape('<?=rawurlencode($myvar)?>');
</script>
Don’t. Use Ajax, put it in data-* attributes in your HTML, or something else meaningful. Using inline scripts makes your pages bigger, and could be insecure or still allow users to ruin layout, unless…
… you make a safer function:
function inline_json_encode($obj) {
return str_replace('<!--', '<\!--', json_encode($obj));
}
htmlspecialchars
Description
string htmlspecialchars ( string $string [, int $quote_style [, string $charset [, bool $double_encode ]]] )
Certain characters have special significance in HTML, and should be represented by HTML entities if they are to preserve their meanings. This function returns a string with some of these conversions made; the translations made are those most useful for everyday web programming. If you require all HTML character entities to be translated, use htmlentities() instead.
This function is useful in preventing user-supplied text from containing HTML markup, such as in a message board or guest book application.
The translations performed are:
* '&' (ampersand) becomes '&'
* '"' (double quote) becomes '"' when ENT_NOQUOTES is not set.
* ''' (single quote) becomes ''' only when ENT_QUOTES is set.
* '<' (less than) becomes '<'
* '>' (greater than) becomes '>'
http://ca.php.net/htmlspecialchars
I'm not sure if this is bad practice or no, but my team and I have been using a mixed html, JS, and php solution. We start with the PHP string we want to pull into a JS variable, lets call it:
$someString
Next we use in-page hidden form elements, and have their value set as the string:
<form id="pagePhpVars" method="post">
<input type="hidden" name="phpString1" id="phpString1" value="'.$someString.'" />
</form>
Then its a simple matter of defining a JS var through document.getElementById:
<script type="text/javascript" charset="UTF-8">
var moonUnitAlpha = document.getElementById('phpString1').value;
</script>
Now you can use the JS variable "moonUnitAlpha" anywhere you want to grab that PHP string value.
This seems to work really well for us. We'll see if it holds up to heavy use.
If you use a templating engine to construct your HTML then you can fill it with what ever you want!
Check out XTemplates.
It's a nice, open source, lightweight, template engine.
Your HTML/JS there would look like this:
<script>
var myvar = {$MyVarValue};
</script>

Convert Unicode Javascript string to PHP utf8 string

I make form with input text.
<input type="text" id="input" value=""/>
i received utf-8 string from web like this (with javascript, jquery)
var str = '\u306e\u7c21\u5358\u306a\u8aac\u660e';
str is 'の簡単な説明'.
set input field value to 'str'
$('#input').val(str);
this input replace all of escape string '\' and set string like this.
<input type"text" id="input" value="u306eu7c21u5358u306au8aacu660e"/>
no problem in this point. display character is also good.
But.
When I save this string into my database with PHP
PHP put this value non-escaped utf8 string 'u306eu7c21u5358u306au8aacu660e' to database
and next time I've call
<input type="text" id="input" value="<?=$str?>">
and browser displays raw value
just 'u306eu7c21u5358u306au8aacu660e'
not 'の簡単な説明'
I don't know what is wrong.
I've tried
$str = json_decode("\"".$str."\"");
html_entity_decode(...);
mb_convert_encoding(...);
but not working correctly...
How can I covert this non-escaped utf-8 string to general utf-8 string?
You've MUST have MultiByte String support. With some extra work here is what you need:
<?php
$str = 'u306eu7c21u5358u306au8aacu660e';
function converter($sequence) {
return mb_convert_encoding(pack('H*', $sequence), 'UTF-8', 'UCS-2BE');
}
# array_filter is not important here at all it just "remove" empty strings
$converted = array_map('converter', array_filter(explode('u', $str)));
$converted = join('', $converted);
print $converted;
Just as a side note you OUGHT TO FIND a better strategy in order to
split the unicode sequences. By "exploding" string by u char is
somewhat ingenuo.
Also, I strongly advise you read the excelent blog post by Armin Ronacher, UCS vs UTF-8 as Internal String Encoding.

Regex works with JavaScript but not PHP

I used this regex in the JavaScript for my webpage, and it worked perfectly:
var nameRegex = /^([ \u00c0-\u01ffa-zA-Z'\-])+$/;
return nameRegex.test(name);
I then wanted to include it in my PHP script as a second barrier in case the user disables JavaScript etc. But whenever I use it it will fail every string that I pass through it, even correct names.
I tried using single quotes to stop escape characters, but then I had to escape the single quote contained within the regex, and came up with this:
$nameRegex = '/^([ \u00c0-\u01ffa-zA-Z\'\-])+$/';
if ($firstName == ""){
$valSuccess = FALSE;
$errorMsgTxt .= "Please enter your first name<br>\n";
} elseif (!preg_match($nameRegex, $firstName)){
$valSuccess = FALSE;
$errorMsgTxt .= "Please enter a valid first name<br>\n";
}
But, once again, it fails valid names.
So my question is, how can I make my regex "safe" for use in PHP?
The problem with your regular expression is that this works in javascript, but your syntax is not valid in pcre.
You need to consider \X which matches a single Unicode grapheme, whether encoded as a single code point or multiple code points using combining marks. The correct syntax would be..
/^[ \X{00c0-01ff}a-zA-Z'-]+$/

How do I replace a double-quote with an escape-char double-quote in a string using JavaScript?

Say I have a string variable (var str) as follows-
Dude, he totally said that "You Rock!"
Now If I'm to make it look like as follows-
Dude, he totally said that "You Rock!"
How do I accomplish this using the JavaScript replace() function?
str.replace("\"","\\""); is not working so well. It gives unterminated string literal error.
Now, if the above sentence were to be stored in a SQL database, say in MySQL as a LONGTEXT (or any other VARCHAR-ish) datatype, what else string optimizations I need to perform?
Quotes and commas are not very friendly with query strings. I'd appreciate a few suggestions on that matter as well.
You need to use a global regular expression for this. Try it this way:
str.replace(/"/g, '\\"');
Check out regex syntax and options for the replace function in Using Regular Expressions with JavaScript.
Try this:
str.replace("\"", "\\\""); // (Escape backslashes and embedded double-quotes)
Or, use single-quotes to quote your search and replace strings:
str.replace('"', '\\"'); // (Still need to escape the backslash)
As pointed out by helmus, if the first parameter passed to .replace() is a string it will only replace the first occurrence. To replace globally, you have to pass a regex with the g (global) flag:
str.replace(/"/g, "\\\"");
// or
str.replace(/"/g, '\\"');
But why are you even doing this in JavaScript? It's OK to use these escape characters if you have a string literal like:
var str = "Dude, he totally said that \"You Rock!\"";
But this is necessary only in a string literal. That is, if your JavaScript variable is set to a value that a user typed in a form field you don't need to this escaping.
Regarding your question about storing such a string in an SQL database, again you only need to escape the characters if you're embedding a string literal in your SQL statement - and remember that the escape characters that apply in SQL aren't (usually) the same as for JavaScript. You'd do any SQL-related escaping server-side.
The other answers will work for most strings, but you can end up unescaping an already escaped double quote, which is probably not what you want.
To work correctly, you are going to need to escape all backslashes and then escape all double quotes, like this:
var test_str = '"first \\" middle \\" last "';
var result = test_str.replace(/\\/g, '\\\\').replace(/\"/g, '\\"');
depending on how you need to use the string, and the other escaped charaters involved, this may still have some issues, but I think it will probably work in most cases.
var str = 'Dude, he totally said that "You Rock!"';
var var1 = str.replace(/\"/g,"\\\"");
alert(var1);

jQuery.parseJSON throws “Invalid JSON” error due to escaped single quote in JSON

I’m making requests to my server using jQuery.post() and my server is returning JSON objects (like { "var": "value", ... }). However, if any of the values contains a single quote (properly escaped like \'), jQuery fails to parse an otherwise valid JSON string. Here’s an example of what I mean (done in Chrome’s console):
data = "{ \"status\": \"success\", \"newHtml\": \"Hello \\\'x\" }";
eval("x = " + data); // { newHtml: "Hello 'x", status: "success" }
$.parseJSON(data); // Invalid JSON: { "status": "success", "newHtml": "Hello \'x" }
Is this normal? Is there no way to properly pass a single quote via JSON?
According to the state machine diagram on the JSON website, only escaped double-quote characters are allowed, not single-quotes. Single quote characters do not need to be escaped:
Update - More information for those that are interested:
Douglas Crockford does not specifically say why the JSON specification does not allow escaped single quotes within strings. However, during his discussion of JSON in Appendix E of JavaScript: The Good Parts, he writes:
JSON's design goals were to be minimal, portable, textual, and a subset of JavaScript. The less we need to agree on in order to interoperate, the more easily we can interoperate.
So perhaps he decided to only allow strings to be defined using double-quotes since this is one less rule that all JSON implementations must agree on. As a result, it is impossible for a single quote character within a string to accidentally terminate the string, because by definition a string can only be terminated by a double-quote character. Hence there is no need to allow escaping of a single quote character in the formal specification.
Digging a little bit deeper, Crockford's org.json implementation of JSON for Java is more permissible and does allow single quote characters:
The texts produced by the toString methods strictly conform to the JSON syntax rules. The constructors are more forgiving in the texts they will accept:
...
Strings may be quoted with ' (single quote).
This is confirmed by the JSONTokener source code. The nextString method accepts escaped single quote characters and treats them just like double-quote characters:
public String nextString(char quote) throws JSONException {
char c;
StringBuffer sb = new StringBuffer();
for (;;) {
c = next();
switch (c) {
...
case '\\':
c = this.next();
switch (c) {
...
case '"':
case '\'':
case '\\':
case '/':
sb.append(c);
break;
...
At the top of the method is an informative comment:
The formal JSON format does not allow strings in single quotes, but an implementation is allowed to accept them.
So some implementations will accept single quotes - but you should not rely on this. Many popular implementations are quite restrictive in this regard and will reject JSON that contains single quoted strings and/or escaped single quotes.
Finally to tie this back to the original question, jQuery.parseJSON first attempts to use the browser's native JSON parser or a loaded library such as json2.js where applicable (which on a side note is the library the jQuery logic is based on if JSON is not defined). Thus jQuery can only be as permissive as that underlying implementation:
parseJSON: function( data ) {
...
// Attempt to parse using the native JSON parser first
if ( window.JSON && window.JSON.parse ) {
return window.JSON.parse( data );
}
...
jQuery.error( "Invalid JSON: " + data );
},
As far as I know these implementations only adhere to the official JSON specification and do not accept single quotes, hence neither does jQuery.
If you need a single quote inside of a string, since \' is undefined by the spec, use \u0027 see http://www.utf8-chartable.de/ for all of them
edit: please excuse my misuse of the word backticks in the comments. I meant backslash. My point here is that in the event you have nested strings inside other strings, I think it can be more useful and readable to use unicode instead of lots of backslashes to escape a single quote. If you are not nested however it truly is easier to just put a plain old quote in there.
I understand where the problem lies and when I look at the specs its clear that unescaped single quotes should be parsed correctly.
I am using jquery`s jQuery.parseJSON function to parse the JSON string but still getting the parse error when there is a single quote in the data that is prepared with json_encode.
Could it be a mistake in my implementation that looks like this (PHP - server side):
$data = array();
$elem = array();
$elem['name'] = 'Erik';
$elem['position'] = 'PHP Programmer';
$data[] = json_encode($elem);
$elem = array();
$elem['name'] = 'Carl';
$elem['position'] = 'C Programmer';
$data[] = json_encode($elem);
$jsonString = "[" . implode(", ", $data) . "]";
The final step is that I store the JSON encoded string into an JS variable:
<script type="text/javascript">
employees = jQuery.parseJSON('<?=$marker; ?>');
</script>
If I use "" instead of '' it still throws an error.
SOLUTION:
The only thing that worked for me was to use bitmask JSON_HEX_APOS to convert the single quotes like this:
json_encode($tmp, JSON_HEX_APOS);
Is there another way of tackle this issue? Is my code wrong or poorly written?
Thanks
When You are sending a single quote in a query
empid = " T'via"
empid =escape(empid)
When You get the value including a single quote
var xxx = request.QueryString("empid")
xxx= unscape(xxx)
If you want to search/ insert the value which includes a single quote in a query
xxx=Replace(empid,"'","''")
Striking a similar issue using CakePHP to output a JavaScript script-block using PHP's native json_encode. $contractorCompanies contains values that have single quotation marks and as explained above and expected json_encode($contractorCompanies) doesn't escape them because its valid JSON.
<?php $this->Html->scriptBlock("var contractorCompanies = jQuery.parseJSON( '".(json_encode($contractorCompanies)."' );"); ?>
By adding addslashes() around the JSON encoded string you then escape the quotation marks allowing Cake / PHP to echo the correct javascript to the browser. JS errors disappear.
<?php $this->Html->scriptBlock("var contractorCompanies = jQuery.parseJSON( '".addslashes(json_encode($contractorCompanies))."' );"); ?>
I was trying to save a JSON object from a XHR request into a HTML5 data-* attribute. I tried many of above solutions with no success.
What I finally end up doing was replacing the single quote ' with it code ' using a regex after the stringify() method call the following way:
var productToString = JSON.stringify(productObject);
var quoteReplaced = productToString.replace(/'/g, "'");
var anchor = '<a data-product=\'' + quoteReplaced + '\' href=\'#\'>' + productObject.name + '</a>';
// Here you can use the "anchor" variable to update your DOM element.
Interesting. How are you generating your JSON on the server end? Are you using a library function (such as json_encode in PHP), or are you building the JSON string by hand?
The only thing that grabs my attention is the escape apostrophe (\'). Seeing as you're using double quotes, as you indeed should, there is no need to escape single quotes. I can't check if that is indeed the cause for your jQuery error, as I haven't updated to version 1.4.1 myself yet.

Categories

Resources