Escape control and non-ASCII characters (high-bit) in JavaScript string? - javascript

Is there an easy way in javascript to only escape certain ranges of control characters?
Specifically, I want to escape just the ranges "\x00-\x1f" and "\x7f-\xff" (control and high-bit characters). I need to escape only those characters before calculating a hash that is then sent to an API. The standard functions like encodeURI() and escape() escape too much.
Basically, I need to match the functionality of perl's uri_escape($text,"\x00-\x1f\x7f-\xff") .

function uri_escape( text, re ) {
function pad( num ) {
return num.length < 2 ? "0" + num : num;
}
return text.replace( re, function(v){
return "%"+pad(v.charCodeAt(0).toString(16)).toUpperCase();
});
}
uri_escape( "\u0015\u0012", /[\x00-\x1f\x7f-\xff]/g );
//"%15%12"
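A variation on the function above, as a minimal sketch: it hard-codes the two ranges from the question and uses padStart for the zero padding. The name escapeControlAndHighBit is hypothetical, and padStart assumes an ES2017-capable engine. Characters above \xff are left untouched because the character class does not cover them.

```javascript
// Hypothetical helper: percent-escape only \x00-\x1f and \x7f-\xff.
function escapeControlAndHighBit(text) {
  return text.replace(/[\x00-\x1f\x7f-\xff]/g, function (ch) {
    // Two upper-case hex digits, zero-padded (e.g. tab -> %09).
    return "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0");
  });
}

console.log(escapeControlAndHighBit("a\tb\u00e9 (ok)")); // "a%09b%E9 (ok)"
```

Printable ASCII, including the space and the parentheses, passes through unchanged, which is the difference from encodeURI() and escape() that the question asks for.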

Related

JavaScript Regex: How to let only 2 decimal floating number through oninput pattern match?

I have an input type="text" with an input event handler attached to it, which i would like to use to overwrite the input.value with a "filtered" result of the users' key presses.
Basically i want to write a regular expression that only allows floating numbers (positive or negative) with (optionally) 2 decimal positions.
Here's a scripted example of what i'm looking for.
If you hit any key while focusing the input in the example above, the input value will be filtered using a combination of regex and JavaScript.
My test input currently looks like this:
<input type="text" id="test" value="-3c.fvbnj23.79.8fbak-cfb436.sdafosd8r32798s.hdf-5kasjfds-gf..">
The input event handler looks like this:
document.getElementById('test').oninput = function(){
var foundDot = false,
str = '';
this.value.replace(/^-|\.(?=\d)|\d*/g, function(match){
if (match === '.' && foundDot === true) {
return;
}
if (match === '.' && foundDot === false) foundDot = true;
str += match;
});
this.value = parseFloat(str).toFixed(2);
}
Is it possible to obtain the same result with a regular expressions only?
Even better, it can be done without regex at all.
Well, okay, one regex. I misunderstood that you wanted to preserve digits within non-numeric strings.
All right, fiiiiine, two regexes. ¯\_(ツ)_/¯
//oninput is too aggressive, it runs on every keystroke making it difficult to type in longer values.
document.getElementById('test').onchange = function(){
// strip non-numeric characters.
// First char can be -, 0-9, or .
// Others can only be 0-9 or .
var val = this.value.replace(/(?!^[\-\d\.])[^\d\.]/g,'');
// strip out extra dots. Here I admit regex defeat and resort to slice-and-dice:
var firstDot = val.indexOf('.'); // search the stripped string, since its indexes differ from this.value
if (firstDot > -1) {
val = val.substr(0,firstDot+1) + val.substring(firstDot+1).replace(/\./g,'')
}
console.log(val);
// Finally the easy part: set the precision
this.value = parseFloat(val).toFixed(2);
}
<input id="test">
I don't know why you can't just use a find/replace on each keystroke.
Find ^([-+]?(?:\d+(?:\.\d{0,2})?|\.\d{0,2})?)
Replace $1
Expanded
^
( # (1 start)
[-+]?
(?:
\d+
(?:
\. \d{0,2}
)?
|
\. \d{0,2}
)?
) # (1 end)
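The find/replace above could be applied in JavaScript roughly as follows. This is a sketch only: keepNumberPrefix and NUMBER_PREFIX are hypothetical names, and the behaviour is to cut the value at the first character that breaks the pattern, not to salvage digits that appear later in the string.

```javascript
// Hypothetical helper: keep only the leading "[sign]digits[.dd]" part of the input.
var NUMBER_PREFIX = /^[-+]?(?:\d+(?:\.\d{0,2})?|\.\d{0,2})?/;

function keepNumberPrefix(value) {
  // Every part of the pattern is optional, so match() never returns null here;
  // at worst it matches the empty string.
  return value.match(NUMBER_PREFIX)[0];
}

console.log(keepNumberPrefix("12.345abc")); // "12.34"
console.log(keepNumberPrefix("-.5x"));      // "-.5"
console.log(keepNumberPrefix("abc"));       // ""
```

Because a partial value such as "3." is still accepted, this can run on every keystroke without blocking the user from typing the decimal part.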

Javascript , encodeURI failed to encode round bracket "("

I have a cookie value that contains round brackets, e.g. "demo (1)".
When I try to encode it with encodeURI, the round bracket ( is not encoded to %28. What is the alternative for encoding special characters like round brackets?
encodeURI() encodes special characters except: , / ? : @ & = + $ # (it also leaves ; - _ . ! ~ * ' ( ) untouched).
encodeURIComponent() additionally encodes , / ? : @ & = + $ #, but it still leaves parentheses unencoded.
You can write a custom method to encode ( as %28.
Example :
var uri = "my test.asp?(name";
var res = encodeURI(uri);
res = res.replace("(", "%28"); // replace() returns a new string, so keep the result
As pointed out in the comment below, string#replace with a string pattern replaces only the first occurrence. To replace all occurrences, use string#replaceAll, i.e. res.replaceAll("(", "%28"), or string#replace with the global flag, i.e. res.replace(/\(/g, "%28").
const uri = "my test.asp?(n(a(m(e",
res = encodeURI(uri);
console.log(res.replaceAll("(", "%28"));
NOTE :
encodeURI() will not encode: ~!@#$&*()=:/,;?+'
encodeURIComponent() will not encode: ~!*()'
To encode uri components to be RFC 3986 -compliant - which encodes the characters !'()* - you can use:
function fixedEncodeURIComponent(str) {
return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
return '%' + c.charCodeAt(0).toString(16);
});
}
Taken from just before Examples-section at: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
For reference, see: https://www.rfc-editor.org/rfc/rfc3986
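Applied to the cookie value from the question, the difference looks like this. A minimal sketch; encodeWithParens is a hypothetical name, and it only adds the two parentheses on top of what encodeURIComponent already encodes.

```javascript
// Hypothetical helper for values such as the cookie in the question.
function encodeWithParens(value) {
  return encodeURIComponent(value).replace(/[()]/g, function (c) {
    // "(" is 0x28 and ")" is 0x29.
    return "%" + c.charCodeAt(0).toString(16).toUpperCase();
  });
}

console.log(encodeURI("demo (1)"));          // "demo%20(1)"
console.log(encodeURIComponent("demo (1)")); // "demo%20(1)"
console.log(encodeWithParens("demo (1)"));   // "demo%20%281%29"
console.log(decodeURIComponent(encodeWithParens("demo (1)"))); // "demo (1)"
```

The standard decodeURIComponent reverses the extra escapes, so no matching custom decoder is needed.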
encodeURI leaves reserved characters and unreserved marks such as parentheses unencoded, so it should not be expected to encode parentheses.
You could write your own function to encode all the characters in the string, or just create a custom list of characters you want to encode.
Fiddle
function superEncodeURI(url) {
var encodedStr = '', encodeChars = ["(", ")"];
url = encodeURI(url);
for(var i = 0, len = url.length; i < len; i++) {
if (encodeChars.indexOf(url[i]) >= 0) {
var hex = url.charCodeAt(i).toString(16);
encodedStr += '%' + hex;
}
else {
encodedStr += url[i];
}
}
return encodedStr;
}
Based on encodeURIComponent docs by Mozilla
encodeURIComponent escapes all characters except:
A-Z a-z 0-9 - _ . ! ~ * ' ( )
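So, the only characters we don't want to escape are: A-Z a-z 0-9.
This function does that by letting encodeURIComponent handle everything else (including non-ASCII characters, as UTF-8) and then escaping the marks it leaves alone:
function encodeUriAll(value) {
return encodeURIComponent(value).replace(/[-_.!~*'()]/g, c =>
`%${c.charCodeAt(0).toString(16).toUpperCase()}`
);
}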

match word not capitalized a certain way

I want a regular expression that matches all instances of "capitalizedExactlyThisWay" that are not capitalizedExactlyThisWay.
I created a function that finds the indexes of all case insensitive matches and then pushes the values back in like this (JSBIN)
But I would rather just say something like text.replace(regexp,"<highlight>$1</highlight>");
replace has a callback function too.
s = s.replace(reg1, function(m){
if(m===word) return m;
return '<highlight>'+m+'</highlight>';
});
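Filled out into something runnable, the callback approach might look like the sketch below. The wrapper name highlightMiscapitalized is hypothetical, and building the pattern by concatenation assumes the word contains only letters and digits (no regex metacharacters).

```javascript
// Hypothetical wrapper; assumes `word` contains only letters and digits.
function highlightMiscapitalized(text, word) {
  var anyCase = new RegExp("\\b" + word + "\\b", "gi");
  return text.replace(anyCase, function (m) {
    // Leave the correctly capitalized form alone.
    if (m === word) return m;
    return "<highlight>" + m + "</highlight>";
  });
}

console.log(highlightMiscapitalized("fooBar foobar FOOBAR", "fooBar"));
// "fooBar <highlight>foobar</highlight> <highlight>FOOBAR</highlight>"
```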
Unfortunately JavaScript regular expressions do not support making only a part of the expression case-insensitive.
You could write a little helper function that does the dirty work:
function capitalizationSensitiveRegex(word) {
var chars = word.split(""), i;
for (i = 0; i < chars.length; i++) {
chars[i] = "[" + chars[i].toLowerCase() + chars[i].toUpperCase() + "]";
}
return new RegExp("(?=\\b" + chars.join("") + "\\b)(?!" + word + ").{" + word.length + "}", "g");
}
Result:
capitalizationSensitiveRegex("capitalizedExactlyThisWay");
=> /(?=\b[cC][aA][pP][iI][tT][aA][lL][iI][zZ][eE][dD][eE][xX][aA][cC][tT][lL][yY][tT][hH][iI][sS][wW][aA][yY]\b)(?!capitalizedExactlyThisWay).{25}/g
Note that this assumes ASCII letters due to limitations of how \b works in JavaScript. It also assumes you're not using any regex meta characters in word (brackets, backslashes, parentheses, stars, dots, etc). An extra step of regex-quoting each char is necessary to make the above stable.
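The regex-quoting step mentioned in the note could be added along these lines. This is a sketch under the same ASCII assumption; quoteChar and capitalizationSensitiveRegexQuoted are hypothetical names, and characters without distinct upper and lower case forms are matched literally instead of being wrapped in a character class.

```javascript
// Escape a character if it is a regex metacharacter.
function quoteChar(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\\/-]/g, "\\$&");
}

// Hypothetical variant of the helper above with quoting applied.
function capitalizationSensitiveRegexQuoted(word) {
  var chars = word.split("");
  var anyCase = chars.map(function (ch) {
    var lower = ch.toLowerCase(), upper = ch.toUpperCase();
    // Only letters need a [xX] class; everything else is matched literally.
    return lower === upper ? quoteChar(ch) : "[" + lower + upper + "]";
  }).join("");
  var exact = chars.map(quoteChar).join("");
  return new RegExp("(?=\\b" + anyCase + "\\b)(?!" + exact + ").{" + word.length + "}", "g");
}

console.log("foobar FooBar fooBar FOOBAR".match(capitalizationSensitiveRegexQuoted("fooBar")));
// [ "foobar", "FooBar", "FOOBAR" ]
console.log("A.B a.b axb".match(capitalizationSensitiveRegexQuoted("a.b")));
// [ "A.B" ]
```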
You can use the match and filter methods with a callback:
var tok = (input.match(/\bcapitalizedexactlythisway\b/ig) || []).filter( function (m) {
return m !== "capitalizedExactlyThisWay"; });
console.log( tok );
["capitalizedEXACTLYTHISWAY", "capitalizedexactlYthisWay", "capitalizedexactlythisway"]
You could try this regex, which matches capitalized followed by any capitalization of exactlythisway except the exact form ExactlyThisWay:
\bcapitalized(?!ExactlyThisWay)(?:[Ee][Xx][Aa][Cc][Tt][Ll][Yy][Tt][Hh][Ii][Ss][Ww][Aa][Yy])\b
Demo
If you could somehow get JavaScript to work with partial case-insensitive matching, i.e. (?i), you could use the following expression:
capitalized(?!ExactlyThisWay)(?i)exactlythisway
If not, you're probably stuck with something like this:
capitalized(?!ExactlyThisWay)[a-zA-Z]+
The downside is that it will also match other variations such as capitalizedfoobar etc.
Demo

Is there a python strip function equivalent in javascript?

Python's strip function is used to remove given characters from the beginning and end of the string. How to create a similar function in javascript?
Example:
str = "'^$ *# smart kitty & ''^$* '^";
newStr = str.strip(" '^$*#&");
console.log(newStr);
Output:
smart kitty
There's lodash's trim()
Removes leading and trailing whitespace or specified characters from string.
_.trim(' abc '); // → 'abc'
_.trim('-_-abc-_-', '_-'); // → 'abc'
A simple but not very efficient way would be to look for the characters and remove them one at a time:
function strip(str, remove) {
while (str.length > 0 && remove.indexOf(str.charAt(0)) != -1) {
str = str.substr(1);
}
while (str.length > 0 && remove.indexOf(str.charAt(str.length - 1)) != -1) {
str = str.substr(0, str.length - 1);
}
return str;
}
A more efficient, though less readable, approach is a regular expression:
str = str.replace(/(^[ '\^\$\*#&]+)|([ '\^\$\*#&]+$)/g, '')
Note: I escaped all characters that have any special meaning in a regular expression. You need to do that for some characters, but perhaps not all the ones that I escaped here as they are used inside a set. That's mostly to point out that some characters do need escaping.
Modifying a code snippet from Mozilla Developer Network String.prototype.trim(), you could define such a function as follows.
if (!String.prototype.strip) {
String.prototype.strip = function (string) {
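// Escape regex metacharacters, including "-" (added here), which would otherwise form a range inside [...]
var escaped = string.replace(/([.*+?^=!:${}()|\[\]\/\\-])/g, "\\$1");
// No "m" flag: like Python's strip, trim only the ends of the whole string
return this.replace(RegExp("^[" + escaped + "]+|[" + escaped + "]+$", "g"), '');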
};
}
It's not necessary and probably not advisable to put this function in the object String.prototype, but it does give you a clear indication of how such a function compares with the existing String.prototype.trim().
The value of escaped is as in the function escapeRegExp in the guide to Regular Expressions. The Java programming language has a standard library function for that purpose, but JavaScript does not.
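The same idea as a standalone function, without touching String.prototype, could be sketched like this. stripChars is a hypothetical name; only the characters that are special inside a character class (backslash, ], ^ and -) are escaped.

```javascript
// Hypothetical standalone equivalent of Python's str.strip(chars).
function stripChars(str, chars) {
  // Escape the characters that are special inside [...].
  var escaped = chars.replace(/[\\\]^-]/g, "\\$&");
  var re = new RegExp("^[" + escaped + "]+|[" + escaped + "]+$", "g");
  return str.replace(re, "");
}

console.log(stripChars("'^$ *# smart kitty & ''^$* '^", " '^$*#&")); // "smart kitty"
console.log(stripChars("-_-abc-_-", "_-"));                          // "abc"
```

Characters from the set that appear in the middle of the string are kept, matching Python's behaviour.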
Not exactly. I would use a regex for complicated string manipulation, or the slice() method to remove characters at known positions.
Slice() explained

startswith in javascript error

I'm using startswith reg exp in Javascript
if ((words).match("^" + string))
but if I enter the characters like , ] [ \ /, Javascript throws an exception.
Any idea?
If you're matching using a regular expression you must make sure you pass a valid Regular Expression to match(). Check the list of special characters to make sure you don't pass an invalid regular expression. The following characters should always be escaped (place a \ before it): [\^$.|?*+()
A better solution would be to use substr() like this:
if( str === words.substr( 0, str.length ) ) {
// match
}
or use indexOf, which looks a bit cleaner:
if( 0 === words.indexOf( str ) ) {
// match
}
Next, you can add a startsWith() method to the String prototype using either of the two solutions above to make usage more readable (engines that implement ES2015 already provide String.prototype.startsWith natively):
String.prototype.startsWith = function(str) {
return ( str === this.substr( 0, str.length ) );
}
When added to the prototype you can use it like this:
words.startsWith( "word" );
One could also use indexOf to determine if the string begins with a fixed value:
str.indexOf(prefix) === 0
If you want to check if a string starts with a fixed value, you could also use substr:
words.substr(0, string.length) === string
If you really want to use a regex, you have to escape the special characters in your string. PHP has a function for it, but I don't know of one for JavaScript. Try the following function, which I found on Snipplr:
function escapeRegEx(str)
{
var specials = new RegExp("[.*+?^$|()\\[\\]{}\\\\]", "g"); // .*+?^$|()[]{}\
return str.replace(specials, "\\$&");
}
and use as
var mystring="Some text";
mystring=escapeRegEx(mystring);
If you only need to find strings starting with another string, try the following:
String.prototype.startsWith=function(string) {
return this.indexOf(string) === 0;
}
and use as
var mystring="Some text";
alert(mystring.startsWith("Some"));
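Putting the escaping and the prefix test together for the characters from the question, as a sketch: escapeForRegExp and startsWithRegex are hypothetical names, and the last line relies on the native String.prototype.startsWith available in ES2015 engines, which needs no escaping at all.

```javascript
// Escape every regex metacharacter so the prefix is matched literally.
function escapeForRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: regex-based prefix test that tolerates ] [ \ / and similar input.
function startsWithRegex(words, prefix) {
  return new RegExp("^" + escapeForRegExp(prefix)).test(words);
}

console.log(startsWithRegex("[x] \\ / ok", "[x] \\ /")); // true
console.log(startsWithRegex("abc", "."));                // false
console.log("[x] rest".startsWith("[x]"));               // true (native, no regex involved)
```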
