So, I have HTML stored in a Perl string. This HTML represents a template, and I need to fill in fields at runtime.
For example:
$templateHTML .= '<span> %{name} </span>';
I want to replace the %{name} with the required value.
The regex I have tried is:
$htmlTemplate.=~ s/%{name}/akhil;
This didn't work. Also, is there a way I can use JavaScript's replace function, i.e., can I convert the Perl string to a JS string and process it?
On request, the template is invoked and the values to be added are passed as parameters.
This solved it:
my $find = "%{name}";
my $replace = "had";
$find = quotemeta $find; # escape regex metachars if present
$str =~ s/$find/$replace/g;
Source : http://www.perlmonks.org/?node_id=98357
Giving the values directly didn't work. I am not sure why; I will look it up and get back.
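Since the question also asks about JavaScript's replace, the same escape-then-substitute idea can be sketched there. This is only an illustration and assumes the template has already reached JavaScript as a string; escapeRegExp and replaceLiteral are hypothetical helper names, not built-ins.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter,
// playing the role that quotemeta plays in the Perl snippet.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: replace every literal occurrence of `find`.
// A function is used as the replacement so that sequences such as "$&"
// in the value are inserted verbatim instead of being expanded.
function replaceLiteral(input, find, replacement) {
  const pattern = new RegExp(escapeRegExp(find), 'g');
  return input.replace(pattern, () => replacement);
}

console.log(replaceLiteral('<span> %{name} </span>', '%{name}', 'had'));
```

Without the escaping step, the braces in %{name} would be handed to the regex engine as syntax rather than as literal characters.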
You're inventing your own templating system. And it seems unlikely that you'll invent something as flexible or powerful as the ones we already have. So I'd recommend you use something like the Template Toolkit instead.
But if you want to continue with your plan, you should read the relevant section from the FAQ.
How can I expand variables in text strings?
(contributed by brian d foy)
If you can avoid it, don't, or if you can use a templating system, such as Text::Template or Template Toolkit, do that instead. You might even be able to get the job done with sprintf or printf:
my $string = sprintf 'Say hello to %s and %s', $foo, $bar;
However, for the one-off simple case where I don't want to pull out a full templating system, I'll use a string that has two Perl scalar variables in it. In this example, I want to expand $foo and $bar to their variable's values:
my $foo = 'Fred';
my $bar = 'Barney';
$string = 'Say hello to $foo and $bar';
One way I can do this involves the substitution operator and a double /e flag. The first /e evaluates $1 on the replacement side and turns it into $foo. The second /e starts with $foo and replaces it with its value. $foo, then, turns into 'Fred', and that's finally what's left in the string:
$string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'
The /e will also silently ignore violations of strict, replacing undefined variable names with the empty string. Since I'm using the /e flag (twice even!), I have all of the same security problems I have with eval in its string form. If there's something odd in $foo, perhaps something like @{[ system "rm -rf /" ]}, then I could get myself in trouble.
To get around the security problem, I could also pull the values from a hash instead of evaluating variable names. Using a single /e, I can check the hash to ensure the value exists, and if it doesn't, I can replace the missing value with a marker, in this case ??? to signal that I missed something:
my $string = 'This has $foo and $bar';
my %Replacements = (
    foo => 'Fred',
);
# $string =~ s/\$(\w+)/$Replacements{$1}/g;
$string =~ s/\$(\w+)/
    exists $Replacements{$1} ? $Replacements{$1} : '???'
/eg;
print $string;
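For comparison, the hash-lookup approach carries over to the %{name} placeholder syntax from the question. The following is a minimal sketch in JavaScript under that assumption; fillTemplate is a hypothetical name, and the ??? marker is borrowed from the FAQ example above.

```javascript
// Hypothetical helper: fill %{key} placeholders from a plain object.
// Only the object's own keys are looked up, so nothing is evaluated as code.
function fillTemplate(template, values) {
  return template.replace(/%\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? String(values[key])
      : '???' // same missing-value marker as the FAQ example
  );
}

console.log(fillTemplate('<span> %{name} </span> %{title}', { name: 'akhil' }));
```

Because the replacement is a lookup and not an eval, a hostile value can at worst end up in the output text; it is never executed.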
Main Question: Should the escaped backslashes that JavaScript needs also be stored in the database, and how well would that play with PHP's regex engine?
Details
I have a number of regex patterns which can be used to classify strings into various categories. An example is as below:
(^A)|(\(A)
This can recognize, for example, an "A" at the start of the string or immediately after an opening bracket (, but not anywhere else in the string.
DBC(ABC)AA
ABC(DBC)AA
My project uses these regex patterns in two languages: PHP and JavaScript.
I want to store these patterns in a MySQL database and since there is no datatype for regex, I thought I could store it as VARCHAR or TEXT.
The issue arises if I use the strings directly in JavaScript: the \( is read as just ( because the backslash is an escape character. If the resulting string is then used to create a new RegExp, it gives an error:
Uncaught SyntaxError: unterminated parenthetical
For example:
let regexstring = "(^A)|(\(A)";
console.log(regexstring); // outputs => "(^A)|((A)"
let regex = new RegExp(regexstring); // Gives Uncaught SyntaxError: unterminated parenthetical
Based on this answer in StackOverflow, the solution is to escape the backslashes like:
let regexstring = "(^A)|(\\(A)";
console.log(regexstring); // Outputs => "(^A)|(\(A)"
let regex = new RegExp(regexstring);
The question is therefore: should escaped backslashes also be stored in the database, and how well would that play with PHP's regex engine?
I would store the raw regular expression.
The additional escape character is not actually part of the regex. It's there so that JS parses the string literal correctly, because \ has a special meaning inside string literals. You only need it when writing the pattern as hardcoded text. In fact, it would also be needed on the PHP side: if you used the same assignment technique in PHP, you would write it with the escape backslash:
$regexstring = "(^A)|(\\(A)";
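To make that point concrete, here is a small sketch comparing three ways of writing the same pattern in JavaScript. The variable names are only for illustration.

```javascript
const escaped = "(^A)|(\\(A)";      // backslash doubled for the string literal
const raw = String.raw`(^A)|(\(A)`; // String.raw keeps the backslash as typed
const literal = /(^A)|(\(A)/;       // regex literal, no string parsing involved

// All three describe the same 10-character pattern with a single backslash.
console.log(escaped === raw);
console.log(new RegExp(escaped).source === literal.source);
console.log(escaped.length);
```

The doubled backslash exists only in the source code; the string held in memory, which is also what a database column would hold, contains one.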
You could also get rid of it if you changed the way you initialize regexstring in your JS:
<?php
...
$regexstring = $results[0]["regexstring"];
?>
let regexstring = decodeURIComponent("<?=rawurlencode($regexstring);?>");
console.log(regexstring);
Another option is to just add the escaping backslashes in the PHP side:
<?php
...
$regexstring = $results[0]["regexstring"];
// In single-quoted PHP strings, '\\' is one backslash and '\\\\' is two.
$escapedRegexstring = str_replace('\\', '\\\\', $regexstring);
?>
let regexstring = "<?=$escapedRegexstring;?>";
However, regardless of escaping, you should note that there are other differences in syntax between PHP's regex engine and the one used by JS, so you may end up having to maintain two copies anyway.
Lastly, if these regular expressions are meant to be provided by users, then keep in mind that outputting them as-is into JS code is very dangerous, as it can easily cause an XSS vulnerability. The first method, passing the pattern through rawurlencode (on the PHP side) and decodeURIComponent (on the JS side), should eliminate this risk.
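The JS half of that first method can be tried in isolation. In the sketch below, encodeURIComponent stands in for PHP's rawurlencode, which is an assumption made only so the example runs without PHP: it leaves a few characters such as parentheses unencoded where rawurlencode would encode them, but decodeURIComponent recovers the same text either way.

```javascript
// The pattern exactly as it would sit in the database column.
const stored = String.raw`(^A)|(\(A)`;

// Stand-in for what PHP would print into the page (see the note above).
const embedded = encodeURIComponent(stored);

// What the browser-side code does with the embedded text.
const regex = new RegExp(decodeURIComponent(embedded));

console.log(embedded);
console.log(regex.test('DBC(ABC)AA'), regex.test('ABC(DBC)AA'), regex.test('DBC(DBC)AA'));
```

The encoded text contains no backslashes or quotes, so nothing in it can terminate the surrounding JS string literal.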
I found on this site a very basic JavaScript function to encode text. Looking at the source code, this is the string replacement code:
txtEscape = txtEscape.replace(/%/g,'#');
So the string stackoverflow becomes #73#74#61#63#6B#6F#76#65#72#66#6C#6F#77
I need a function that does the same elementary encryption in php, but I really don't understand what the /%/g does. I think in php the same function would be something like:
str_replace(/%/g,"#","stackoverflow");
But of course the /%/g doesn't work.
Replace a character
Indeed, the PHP function is str_replace (there are many functions for replacements). But it takes plain strings, not a regular expression :)
See official documentation: http://php.net/manual/en/function.str-replace.php
In your case, you want to replace the character % with #.
g is a regex flag, and the slashes / / are the delimiters of a regex literal in JavaScript :)
The "g" flag indicates that the regular expression should be tested against all possible matches in a string. Without the g flag, only the first match is replaced.
<?php
echo str_replace('%', '#', '%73%74%61%63%6B%6F%76%65%72%66%6C%6F%77');
?>
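On the JavaScript side, the effect of the g flag can be seen with a shortened input. This is a small illustrative sketch; the variable names are made up.

```javascript
const encoded = '%73%74%61';

// A plain string pattern replaces the first occurrence only.
const firstOnly = encoded.replace('%', '#');

// A regex with the g flag replaces every occurrence.
const allMatches = encoded.replace(/%/g, '#');

console.log(firstOnly);
console.log(allMatches);
```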
In PHP, you can use regexes with preg_replace and the related preg_* functions. There is no g flag there; preg_replace replaces all matches by default.
Escape
See this post: PHP equivalent for javascript escape/unescape
There are two functions stringToHex and hexToString to do what you want :)
Indeed, the site you provided uses the unescape function to decode the message:
document.write(unescape(str.replace(/#/g,'%')));
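For reference, here is a guess at the whole round trip in JavaScript, reconstructed only from the output shown in the question (every character written as # plus two uppercase hex digits). encodeText and decodeText are hypothetical names, and the sketch assumes the input contains only characters with codes below 256.

```javascript
// Hypothetical encoder: each character becomes '#' + two hex digits.
function encodeText(text) {
  return Array.from(text, (ch) =>
    '#' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0')
  ).join('');
}

// Hypothetical decoder: turn each '#XX' back into its character.
function decodeText(encoded) {
  return encoded.replace(/#([0-9A-Fa-f]{2})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

console.log(encodeText('stackoverflow'));
console.log(decodeText(encodeText('stackoverflow')));
```

This is an obfuscation, not encryption: anyone who sees the text can reverse it.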
I have a URL that JavaScript reads from user input.
Here is part of the JavaScript code:
document.getElementById("Snd_Cont_AddrLnk_BG").value=encodeURI(document.getElementById("Con_AddresWeb_BG").value.toString());
Then I post the value of the string through CGI to a Perl script (here is part of the Perl code):
#!/usr/bin/perl -w
##
##
use strict;
use CGI;
use CGI::Carp qw ( fatalsToBrowser );
use URI::Escape;
my $C_AddrLnk_BG=$query->param("Snd_Cont_AddrLnk_BG");
my $lst_upload_dir="../data";
my $lst_file_bg=$lst_upload_dir."/contacts_bg.js";
open(JSBG,">$lst_file_bg") || die "Failed to open $lst_file_bg\n";
printf JSBG "var GMapLink_bg=\"".uri_unescape($C_AddrLnk_BG)."\";\n";
close JSBG;
system("chmod 777 $lst_file_bg");
Somewhere in uri_unescape a problem occurs:
The original string from input is:
https://www.google.bg/maps/place/42%C2%B044'15.0%22N+23%C2%B019'04.2%22E/#42.7368454,23.317962,16z/data=!4m2!3m1!1s0x0:0x0
The string after javascript encodeURI() is:
https://www.google.bg/maps/place/42%25C2%25B044'15.0%2522N+23%25C2%25B019'04.2%2522E/#42.7368454,23.317962,16z/data=!4m2!3m1!1s0x0:0x0
And the script after perl uri_unescape() that is printed in file is:
https://www.google.bg/maps/place/42%C2%B044'15.0%22N+23%C2%B019'04.2          0.000000E+00/#42.7368454,23.317962,16z/data=!4m2!3m1!1s0x0:0x0
I can not ascertain whether the problem is in unescaping or printing, but the part
%2522E
is interpreted as
0.000000E+00
(with 10 leading spaces).
Can anyone help me with an idea of what I was doing wrong?
There are numerous problems with your code.
document.getElementById("Snd_Cont_AddrLnk_BG").value =
encodeURI(document.getElementById("Con_AddresWeb_BG").value.toString());
I can't figure out why you think you need encodeURI here. All you should have is the following:
document.getElementById("Snd_Cont_AddrLnk_BG").value =
document.getElementById("Con_AddresWeb_BG").value;
printf JSBG "var GMapLink_bg=\"".uri_unescape($C_AddrLnk_BG)."\";\n";
Now that the erroneous encodeURI is removed, uri_unescape needs to be removed too.
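To see what the extra encodeURI was doing, here is a short sketch using a fragment of the URL from the question. The fragment is already percent-encoded, so encoding it again escapes the % signs themselves.

```javascript
// A piece of the original URL, already percent-encoded.
const alreadyEncoded = "04.2%22E";

// encodeURI escapes the '%' itself, turning %22 into %2522.
const doubled = encodeURI(alreadyEncoded);

console.log(doubled);
console.log(decodeURI(doubled) === alreadyEncoded);
```

Undoing the extra layer only gets back to the text the user typed, which is why neither step is needed.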
Furthermore, adding quotes around text doesn't always make it a valid JavaScript literal. The easiest way to do that is as follows:
use JSON qw( );
my $json = JSON->new()->allow_nonref();
$json->encode($C_AddrLnk_BG)
The printf line quoted above also misuses printf. printf takes a format string as its first argument, so you want
printf FH "%s", ...
or simply
print FH ...
So what you end up with is:
use JSON qw( );
my $json = JSON->new()->allow_nonref();
print JSBG "var GMapLink_bg=" . $json->encode($C_AddrLnk_BG) . ";\n";
You are using printf instead of print to output the result of uri_unescape. It is interpreting %22E as a scientific-notation floating point field (%E) with a width of 22. Presumably you have nothing else in the printf parameter list, so undef is being evaluated as zero, resulting in 0.000000E+00 padded to 22 characters.
If you had use warnings in place as you should, you would see messages like Missing argument in printf.
I have the following string (a string, not a number):
"0870490055012000000000"
which is always 22 characters long. I need to transform it into:
"087.049.0055.0120.0000.0000"
using PHP, or even on the JS/client side.
I found something like this but was not able to solve the problem.
I guess there are many ways to solve this. I'm just asking for something like:
$x= format("00000", "xxx.xxx..xxx.x.x.x")
or
$x = preg_replace("/a;;w.;w;e;ew")
With PHP, you can use this:
$str = preg_replace('~\A\d{3}\K\d{3}|\d{4}~', '.$0', $str);
where \A is an anchor for the start of the string.
\K discards everything matched so far from the match result, so the first three digits are kept untouched and only the following three are replaced.
If you need something more general to apply a mask to a string, the link you shared in your question shows how to do it.
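Since the question also mentions the client side, and JavaScript regexes have no \K, here is a sketch of a different technique there: slicing the string by a list of group sizes. formatGroups is a hypothetical helper, and the sizes 3, 3, 4, 4, 4, 4 are read off the desired output in the question.

```javascript
// Hypothetical helper: split `digits` into groups of the given sizes
// and join them with a separator.
function formatGroups(digits, sizes, separator = '.') {
  const total = sizes.reduce((sum, n) => sum + n, 0);
  if (digits.length !== total) {
    throw new RangeError(`expected ${total} characters, got ${digits.length}`);
  }
  const parts = [];
  let offset = 0;
  for (const size of sizes) {
    parts.push(digits.slice(offset, offset + size));
    offset += size;
  }
  return parts.join(separator);
}

console.log(formatGroups('0870490055012000000000', [3, 3, 4, 4, 4, 4]));
```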
I want to replace some specific markers (taken from user input) with HTML tags like <b>, <u>, <i>, etc. I am using some regexps in JavaScript, but cannot work out which is best. I am using
/\[u\](.*?)\[u\]/g // replace with <u>$1</u>
/*
* if i type [u]underline[][u] //this allows '[]' braces
*/
or should I use
/\[u\]\([^\[u\]]+)\[u\]/g // this doesn't allow third braces to be underlined
I am also using the same regexps in PHP. I am not sure which kind of regexp would be safe from XSS attacks.
No regexes should be used. Find a decent BBCode parser (for instance, PHP's BBCode) and use it. Trying to parse HTML or any established markup language with regexes yourself is asking for pain, trouble, and insecurity.
bobince wrote an epic answer about parsing HTML with regexes, which is relevant here as well and always worth a read.
You asked whether to use /\[u\](.*?)\[u\]/g or /\[u\]\([^\[u\]]+)\[u\]/g. Neither pattern accounts for a closing tag, which is important: [u]underlined text[/u] is BBCode.
A solution using extended regex could be the use of recursive patterns. I think there is no support in JavaScript yet, but it works fine e.g. with PHP, which uses PCRE.
The problem: tags can be nested, and this makes it difficult to match the outermost ones.
Consider what the following patterns do in this PHP example:
$str =
'The [u][u][u]young[/u] quick[/u] brown[/u] fox jumps over the [u]lazy dog[/u]';
1.) Matching any character in [u]...[/u] using the dot non-greedy
$pattern = '~\[u\](.*?)\[/u\]~';
$str = preg_replace($pattern, '<u>\1</u>', $str);
echo htmlspecialchars($str);
outputs:
The <u>[u][u]young</u> quick[/u] brown[/u] fox jumps over the <u>lazy dog</u>
It looks for the first occurrence of [u] and eats up as few characters as possible until it meets the closing [/u], which results in tag mismatches. So this is a bad choice.
2.) Using negation of square brackets [^[\]] for what is inside [u]...[/u]
$pattern = '~\[u\]([^[\]]*)\[/u\]~';
$str = preg_replace($pattern, '<u>\1</u>', $str);
echo htmlspecialchars($str);
outputs:
The [u][u]<u>young</u> quick[/u] brown[/u] fox jumps over the <u>lazy dog</u>
It looks for the first occurrence of [u] followed by any number of characters that are not [ or ], up to the closing [/u]. It is "safer", as it only matches the innermost elements, but it would still require additional effort to resolve the nesting from the inside out.
3.) Using recursion + negation of square brackets [^[\]] for what is inside [u]...[/u]
$pattern = '~\[u\]((?:[^[\]]+|(?R))*)\[/u\]~';
$str = preg_replace($pattern, '<u>\1</u>', $str);
echo htmlspecialchars($str);
outputs:
The <u>[u][u]young[/u] quick[/u] brown</u> fox jumps over the <u>lazy dog</u>
Similar to the second pattern: look for the first occurrence of [u], but then EITHER match one or more characters that are not [ or ] OR paste the whole pattern at (?R). Do the whole thing zero or more times until the closing [/u] is matched.
To get rid of the remaining bb-tags inside, that were not resolved, we now can easily remove them:
$str = preg_replace('~\[/?u\]~',"",$str);
And we get the desired output:
The <u>young quick brown</u> fox jumps over the <u>lazy dog</u>
There are certainly other ways of achieving this, like preg_replace_callback in PHP or, in JavaScript, the replace() method, which accepts a callback as the replacement.
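As a rough illustration of the JavaScript route: since there is no (?R) there, the sketch below swaps in a different technique and resolves the innermost tags repeatedly, from the inside out, using the second pattern and a replace() callback. It also HTML-escapes the input first, which addresses the XSS concern from the question. escapeHtml and convertUnderline are hypothetical names, and the result keeps nested <u> elements instead of flattening them as the PHP example does.

```javascript
// Hypothetical helper: neutralise HTML in user input before adding tags.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: convert [u]...[/u] pairs, innermost first,
// repeating until a pass changes nothing. Unmatched tags stay as text.
function convertUnderline(input) {
  const innermost = /\[u\]([^\[\]]*)\[\/u\]/g;
  let current = escapeHtml(input);
  let previous;
  do {
    previous = current;
    current = current.replace(innermost, (match, inner) => `<u>${inner}</u>`);
  } while (current !== previous);
  return current;
}

console.log(convertUnderline('The [u][u][u]young[/u] quick[/u] brown[/u] fox'));
```

This is still no substitute for a real BBCode parser; it handles a single tag and nothing more.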