Replace leading spaces with &nbsp; in Javascript

I have text like the following, with embedded spaces that show indentation of some xml data:
<Style id="KMLStyler"><br>
<IconStyle><br>
<colorMode>normal</colorMode><br>
I need to use Javascript to replace each LEADING space with &nbsp;
so that it looks like this:
<Style id="KMLStyler"><br>
<IconStyle><br>
<colorMode>normal</colorMode><br>
I have tried a basic replace, but it is matching all spaces, not just the leading ones. I want to leave all the spaces alone except the leading ones. Any ideas?

JavaScript does not have the convenient \G (not even look-behinds), so there's no pure regex-solution for this AFAIK. How about something like this:
function foo() {
  // arguments[0] is the matched run of leading whitespace
  var leadingSpaces = arguments[0].length;
  var str = '';
  while(leadingSpaces > 0) {
    str += '&nbsp;'; // one entity per matched character
    leadingSpaces--;
  }
  return str;
}
var s = " A B C";
print(s.replace(/^[ \t]+/mg, foo));
which produces:
&nbsp;A B C
Tested here: http://ideone.com/XzLCR
EDIT
Or do it with an anonymous function, as suggested by glebm in the comments:
var s = " A B C";
print(s.replace(/^[ \t]+/gm, function(x){ return new Array(x.length + 1).join('&nbsp;') }));
See that in action here: http://ideone.com/3JU52

Use ^ to anchor your pattern at the beginning of the string, or if you're dealing with a multiline string (i.e. embedded newlines) add \n to your pattern. You will need to match the whole set of leading spaces at once, and then in the replacement check the length of what was matched to figure out how many nbsps to insert.
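A minimal sketch of that approach, assuming the input is a plain string with \n line breaks. The helper name nbspIndent is hypothetical, and the m flag is used here in place of adding \n to the pattern, so that ^ matches at the start of every line:

```javascript
// Hypothetical helper, not part of the original answers.
function nbspIndent(text) {
  // ^ with the m flag matches at the start of each line; " +" takes the
  // whole run of leading spaces so the callback can count them.
  return text.replace(/^ +/gm, function (run) {
    return '&nbsp;'.repeat(run.length); // String.prototype.repeat needs ES2015
  });
}

var xml = '<Style id="KMLStyler"><br>\n  <IconStyle><br>\n    <colorMode>normal</colorMode><br>';
console.log(nbspIndent(xml));
```

Spaces after the first non-space character on a line are left alone, because the match is anchored to the line start.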

Related

Regex split comma except escaped [duplicate]

I have this string:
a\,bcde,fgh,ijk\,lmno,pqrst\,uv
I need a JavaScript function that will split the string by every , but only those that don't have a \ before them
How can this be done?
Here's the shortest thing I could come up with:
'a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv'.replace(/([^\\]),/g, '$1\u000B').split('\u000B')
The idea is to find every comma that isn't preceded by a backslash, replace it with a string that is unlikely to appear in your input, and then split on that string.
Note that the backslashes before the commas have to be escaped with another backslash in the string literal. Otherwise, JavaScript treats \, as an escaped comma and produces just a comma. In other words, if you don't escape the backslash, JavaScript sees a\,bcde,fgh,ijk\,lmno,pqrst\,uv as a,bcde,fgh,ijk,lmno,pqrst,uv.
Since regular expressions in JavaScript do not support lookbehinds, I'm not going to cook up a giant hack to mimic this behavior. Instead, you can just split() on all commas (,) and then glue back the pieces that shouldn't have been split in the first place.
Quick 'n' dirty demo:
var str = 'a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv'.split(','), // Split on all commas
    out = []; // Output
for (var i = 0; i < str.length; i++) { // Iterate over every piece, including the last
  var curr = str[i]; // This piece
  // While it ends with \ and there is a next piece ...
  while (curr.charAt(curr.length - 1) == '\\' && i < str.length - 1) {
    curr += ',' + str[++i]; // ... glue with next and skip next (increment i)
  }
  out.push(curr); // Add to output
}
Another ugly hack around the lack of look-behinds:
function rev(s) {
return s.split('').reverse().join('');
}
var s = 'a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv';
// Enter bizarro world...
var r = rev(s);
// Split with a look-ahead
var rparts = r.split(/,(?!\\)/);
// And put it back together with double reversing.
var sparts = [ ];
while(rparts.length)
sparts.push(rev(rparts.pop()));
for(var i = 0; i < sparts.length; ++i)
$('#out').append('<pre>' + sparts[i] + '</pre>');
Demo: http://jsfiddle.net/ambiguous/QbBfw/1/
I don't think I'd do this in real life but it works even if it does make me feel dirty. Consider this a curiosity rather than something you should really use.
In case you also need to remove the backslashes:
var test='a\\.b.c';
var result = test.replace(/\\?\./g, function (t) { return t == '.' ? '\u000B' : '.'; }).split('\u000B');
//result: ["a.b", "c"]
In 2022 most browsers support lookbehinds:
https://caniuse.com/js-regexp-lookbehind
Safari should be your only concern.
With a lookbehind you can split your string this way:
"a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv".split(/(?<!\\),/)
// => ['a\\,bcde', 'fgh', 'ijk\\,lmno', 'pqrst\\,uv']
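If the escaping backslashes should also be dropped from the result, the lookbehind split can be followed by a replace on each piece. This is only a sketch, assuming an engine with lookbehind support; splitUnescaped is a made-up name, not an existing API:

```javascript
// Hypothetical helper: split on unescaped commas, then unescape "\," to ",".
function splitUnescaped(input) {
  return input.split(/(?<!\\),/).map(function (part) {
    return part.replace(/\\,/g, ',');
  });
}

console.log(splitUnescaped('a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv'));
```

Like the answers above, this does not treat a doubled backslash before a comma as an escaped backslash.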
You can use a regex to do the split.
Here is a link to the regex reference for JavaScript: http://www.w3schools.com/jsref/jsref_obj_regexp.asp
Here is a link to another post where the author used a regex for split: Javascript won't split using regex
From the first link, note that you can create a regular expression using
?!n Matches any string that is not followed by a specific string n
[,]!\\

Deobfuscating Javascript - How to replace random variable names?

So I'm tackling the task of de-obfuscating some javascript code and using www.jsbeautifier.org I have got my code. Is it possible to make a search and replace query of some sort or another method to replace the random variable names with this actual content e.g:
O7 = "string";
o9 = "test";
function o9(O7) {
.... etc
}
to
function test(string) {
...... etc.
}
Thanks
You can't do this in pure regex, or in a language-agnostic way, because regex alone has no conditional replacements (or substitutions). That means you can't do something like:
b(a)?, and say: if a is empty, then replace the whole match with "c"; otherwise, with "d".
Why is it useful? Keep reading to see what we'll be using to match the right text.
Currently, some regex flavors allow you to use different 'variables' within the substitution text.
(e.g.: $n, $', $&, $`...) - Take a look at Substitutions in Regular Expressions.
However, assuming you're deobfuscating Javascript code with Javascript, the regex you're searching for is:
/"[^"]*"|'[^']*'|\/\*[\s\S]*?\*\/|\/\/.*$|\b(<text>)\b/mg
Explanation
If you use it in Regex101, you'll see it matches the same as \b<text>\b, plus any comment
(/* foo */, // bar) and any quoted text ("baz", 'qux'), which is actually expected. The first two parts of the regex are responsible for matching any string:
"[^"]*" - matches: "..."
'[^']*' - matches: '...'
And that's okay because we want to exclude the possibility of replacing a 'variable' if it's actually inside a string.
Then the third part, which is responsible for multiline comments, and the fourth part (single-line comments) work like this:
\/\*[\s\S]*?\*\/ - matches: /*...*/
\/\/.*$ - matches: //... until the line breaks
Now, the text we're searching for is not simply matched by the regex, but also captured. Take a look at the last part:
\b(<text>)\b - captures the <text> (only the occurrences not already consumed by one of the earlier alternatives).
Now, in our script, we can simply match all the occurrences of the desired input, and replace them with the output when our code detects that group one ($1) is not empty.
Result (TL;DR)
function deobfuscate(code, from, to){
var re = RegExp('"[^"]*"|\'[^\']*\'|\\/\\*[\\s\\S]*?\\*\\/|\\/\\/.*$|\\b('+ from +')\\b', 'gm');
return code.replace(re, function(match, g1) { return (g1) ? to:match; });
}
With that function, you can do what you want, for example, parsing:
O7 = "string";
o9 = "test";
function o9(O7) { ...
And retrieving the <toFind> = <toReplace> pairs from the start of the code, then applying each one (inside a loop or something) like this:
code = deobfuscate(code, toFind[i], toReplace[i]);
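One way that loop could look, as a sketch only. It assumes the mappings appear as lines of the form name = "value"; and that each value is itself a valid identifier. renameFromAssignments is a hypothetical helper, and deobfuscate is repeated from the answer so the block runs on its own:

```javascript
// Same function as in the answer, repeated so the sketch is self-contained.
function deobfuscate(code, from, to) {
  var re = RegExp('"[^"]*"|\'[^\']*\'|\\/\\*[\\s\\S]*?\\*\\/|\\/\\/.*$|\\b(' + from + ')\\b', 'gm');
  return code.replace(re, function (match, g1) { return g1 ? to : match; });
}

// Hypothetical helper: collects `name = "value";` lines, removes them,
// then renames every collected name in what is left.
function renameFromAssignments(code) {
  var pairRe = /^[ \t]*([A-Za-z_]\w*)[ \t]*=[ \t]*"([A-Za-z_]\w*)";[ \t]*$/gm;
  var pairs = [];
  var m;
  while ((m = pairRe.exec(code)) !== null) {
    pairs.push({ from: m[1], to: m[2] });
  }
  var body = code.replace(pairRe, ''); // drop the mapping lines themselves
  for (var i = 0; i < pairs.length; i++) {
    body = deobfuscate(body, pairs[i].from, pairs[i].to);
  }
  return body.trim();
}

var sample = 'O7 = "string";\no9 = "test";\nfunction o9(O7) {\n  return O7;\n}';
console.log(renameFromAssignments(sample));
```

Names are limited to letters, digits and underscores here because from is inserted into the regex unescaped and relies on \b word boundaries.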
Working Example
/* Textarea & Inputs' DOMs */
var code = document.getElementById("code");
var from = document.getElementById("from");
var to = document.getElementById("to");
code.placeholder = "Code goes here...";
from.placeholder = "From";
to.placeholder = "To";
/* Example Values */
code.value = "Example: //Switch(?):\n"+
"function o9(o9) { //true, true\n"+
" o9 = 'o9'; //true, false\n"+
" /*\n"+
" o9 //false\n"+
" */\n"+
" var test = o9+\"o9\"+o9; //true, false, true\n"+
" return o9; //o9 //true, false\n"+
"}\n";
from.value = "o9";
to.value = "ok";
/* Called onclick action */
function doStuff(){
code.value = deobfuscate(code.value, from.value, to.value);
}
function deobfuscate(code, from, to){
var re = RegExp('"[^"]*"|\'[^\']*\'|\\/\\*[\\s\\S]*?\\*\\/|\\/\\/.*$|\\b('+ from +')\\b', 'gm');
return code.replace(re, function(match, g1) { return (g1) ? to:match; });
}
<html>
<body>
<textarea id="code" rows="10" cols="55"></textarea> <br>
<input id="from"/> → <input id="to"/> <br><br>
<button onclick="doStuff()">Deobfuscate</button>
</body>
</html>
Note
As your question is not clear enough, I can't tell exactly what you are searching for; a lot is possible. For example, should it search for random variables in the code and then replace them all? That would require a dictionary of words, so it wouldn't really deobfuscate the code, as you must specify the input and output. If there's something you think I'm missing, please add it to the comment section.

Regex match cookie value and remove hyphens

I'm trying to extract out a group of words from a larger string/cookie that are separated by hyphens. I would like to replace the hyphens with a space and set to a variable. Javascript or jQuery.
As an example, the larger string has a name and value like this within it:
facility=34222%7CConner-Department-Store;
(notice the leading "C")
So first, I need to match()/find facility=34222%7CConner-Department-Store; with regex. Then break it down to "Conner Department Store"
var cookie = document.cookie;
var facilityValue = cookie.match( REGEX ); ??
var test = "store=874635%7Csomethingelse;facility=34222%7CConner-Department-Store;store=874635%7Csomethingelse;";
var test2 = test.replace(/^(.*)facility=([^;]+)(.*)$/, function(matchedString, match1, match2, match3){
return decodeURIComponent(match2);
});
console.log( test2 );
console.log( test2.split('|')[1].replace(/[-]/g, ' ') );
If I understood it correctly, you want to make a phrase by getting all the words between hyphens and disallowing two successive Uppercase letters in a word, so I'd prefer using Regex in that case.
This is a Regex solution, that works dynamically with any cookies in the same format and extract the wanted sentence from it:
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Demo:
var str = "facility=34222%7CConner-Department-Store;";
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Explanation:
Use this regex, /([A-Z][a-z]+)-?/g, to match the words between -.
Remove any - occurrence from the matched words.
Then join the array of matches with a space.
OK, first you should decode the string as follows:
var str = "facility=34222%7CConner-Department-Store;"
var decoded = decodeURIComponent(str);
// decoded = "facility=34222|Conner-Department-Store;"
Then you have multiple possibilities to split up this string.
The easiest way is to use substring()
var solution1 = decoded.substring(decoded.indexOf('|') + 1, decoded.length)
// solution1 = "Conner-Department-Store;"
solution1 = solution1.replace(/-/g, ' '); // /g replaces every hyphen, not just the first
// solution1 = "Conner Department Store;"
As you can see, substring(arg1, arg2) returns the string, starting at index arg1 and ending at index arg2. See Full Documentation here
If you want to cut the last ; just set decoded.length - 1 as arg2 in the snippet above.
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1)
//returns "Conner-Department-Store"
or all of the above in just one line:
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1).replace(/-/g, ' ')
If you want still to use a regular Expression to retrieve (perhaps more) data out of the string, you could use something similar to this snippet:
var solution2 = "";
var regEx= /([A-Za-z]*)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/;
if (regEx.test(decoded)) {
solution2 = decoded.match(regEx);
/* returns
[0:"facility=34222|Conner-Department-Store",
1:"facility",
2:"34222",
3:"Conner-Department-Store",
index:0,
input:"facility=34222|Conner-Department-Store;"
length:4] */
solution2 = solution2[3].replace(/-/g, ' ');
// "Conner Department Store"
}
I have applied some rules for the regex to work; feel free to modify them according to your needs.
facility can be any Word built with alphabetical characters lower and uppercase (no other chars) at any length
= needs to be the char =
34222 can be any number but no other characters
| needs to be the char |
Conner-Department-Store can be any characters except one of the following (reserved delimiters): :/?#[]#;,'
Hope this helps :)
edit: to find only the part
facility=34222%7CConner-Department-Store; just modify the regex to
match facility= instead of ([A-Za-z]*)=:
/(facility)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/
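Putting the steps together (find the facility cookie, decode it, keep the part after |, replace the hyphens) could look like the sketch below. It assumes the cookie string uses ; between pairs, as document.cookie does; facilityName is a hypothetical helper name:

```javascript
// Hypothetical helper: extracts the readable facility name from a cookie string.
function facilityName(cookieString) {
  // Match "facility=" at the start or after a ";", capture up to the next ";".
  var match = cookieString.match(/(?:^|;\s*)facility=([^;]*)/);
  if (!match) return null; // no facility cookie present
  var decoded = decodeURIComponent(match[1]); // "%7C" becomes "|"
  var name = decoded.slice(decoded.indexOf('|') + 1); // text after the first "|"
  return name.replace(/-/g, ' '); // every hyphen becomes a space
}

var test = 'store=874635%7Csomethingelse; facility=34222%7CConner-Department-Store; other=1';
console.log(facilityName(test));
```

In a browser, the call would be facilityName(document.cookie).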
You can use cookies.js, a mini framework from MDN (Mozilla Developer Network).
Simply include the cookies.js file in your application, and write:
docCookies.getItem("facility"); // pass the cookie name, not its value

match word not capitalized a certain way

I want a regular expression that matches all case-insensitive instances of "capitalizedExactlyThisWay" that are not capitalized exactly that way.
I created a function that finds the indexes of all case insensitive matches and then pushes the values back in like this (JSBIN)
But I would rather just say something like text.replace(regexp,"<highlight>$1</highlight>");
replace has a callback function too.
s = s.replace(reg1, function(m){
if(m===word) return m;
return '<highlight>'+m+'</highlight>';
});
Unfortunately JavaScript regular expressions do not support making only a part of the expression case-insensitive.
You could write a little helper function that does the dirty work:
function capitalizationSensitiveRegex(word) {
var chars = word.split(""), i;
for (i = 0; i < chars.length; i++) {
chars[i] = "[" + chars[i].toLowerCase() + chars[i].toUpperCase() + "]";
}
return new RegExp("(?=\\b" + chars.join("") + "\\b)(?!" + word + ").{" + word.length + "}", "g");
}
Result:
capitalizationSensitiveRegex("capitalizedExactlyThisWay");
=> /(?=\b[cC][aA][pP][iI][tT][aA][lL][iI][zZ][eE][dD][eE][xX][aA][cC][tT][lL][yY][tT][hH][iI][sS][wW][aA][yY]\b)(?!capitalizedExactlyThisWay).{25}/g
Note that this assumes ASCII letters due to limitations of how \b works in JavaScript. It also assumes you're not using any regex meta characters in word (brackets, backslashes, parentheses, stars, dots, etc). An extra step of regex-quoting each char is necessary to make the above stable.
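The regex-quoting step can be combined with the callback form of replace, which avoids the per-character classes altogether. A sketch under the same ASCII and \b assumptions; both function names are made up for illustration:

```javascript
// Escapes characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wraps every case-insensitive match of `word`
// that is not capitalized exactly like `word`.
function highlightMiscapitalized(text, word) {
  var re = new RegExp('\\b' + escapeRegex(word) + '\\b', 'gi');
  return text.replace(re, function (m) {
    return m === word ? m : '<highlight>' + m + '</highlight>';
  });
}

console.log(highlightMiscapitalized(
  'x capitalizedExactlyThisWay y CAPITALIZEDexactlythisway z',
  'capitalizedExactlyThisWay'
));
```

The exact-case check happens in plain JavaScript inside the callback, so the pattern itself only needs the i flag.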
You can use match and map method with a callback:
tok=[], input.match(/\bcapitalizedexactlythisway\b/ig).map( function (m) {
if (m!="capitalizedExactlyThisWay") tok.push(m); });
console.log( tok );
["capitalizedEXACTLYTHISWAY", "capitalizedexactlYthisWay", "capitalizedexactlythisway"]
You could try this regex to match every case-insensitive variant of exactlythisway except ExactlyThisWay:
\bcapitalized(?!ExactlyThisWay)(?:[Ee][Xx][Aa][Cc][Tt][Ll][Yy][Tt][Hh][Ii][Ss][Ww][Aa][Yy])\b
If you could somehow get JavaScript to work with partial case-insensitive matching, i.e. (?i), you could use the following expression:
capitalized(?!ExactlyThisWay)(?i)exactlythisway
If not, you're probably stuck with something like this:
capitalized(?!ExactlyThisWay)[a-zA-Z]+
The downside is that it will also match other variations such as capitalizedfoobar etc.

Javascript replace hyphens with space

I am getting this value from DatePicker
var datepickr = 'Jun-29-2011';
I want to replace the hyphens (-) with spaces.
I tried it this way, but it isn't working:
var b = datepickr.replace("-",' ');
Just for reference:
var datepickr = 'Jun-29-2011';
datepickr.replace("-", " "); // returns "Jun 29-2011"
datepickr.replace(/-/, " "); // returns "Jun 29-2011"
datepickr.replace(/-/g, " "); // returns "Jun 29 2011" (yay!)
The difference is the global modifier /g, which causes replace to search for all instances. Note also that - must be escaped as \- when it could also be used to denote a range. For example, /[a-z]/g would match all lower-case letters, whereas /[a\-z]/g would match all a's, z's and dashes. In this case it's unambiguous, but it's worth noting.
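A short sketch of both points, the g flag and the escaped hyphen inside a character class. The variable names are illustrative only:

```javascript
var datepickr = 'Jun-29-2011';

// Without g only the first hyphen is replaced; with g all of them are.
var firstOnly = datepickr.replace(/-/, ' ');
var allHyphens = datepickr.replace(/-/g, ' ');

// Inside a character class, an unescaped hyphen between two characters is a range.
var sample = 'a-z, b-y';
var asRange = sample.replace(/[a-z]/g, '#');    // every lowercase letter
var asLiteral = sample.replace(/[a\-z]/g, '#'); // only a, z and the hyphen

console.log(firstOnly, '|', allHyphens, '|', asRange, '|', asLiteral);
```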
EDIT
Just so you know, you can do it in one line without regex, it's just impressively unreadable:
while (str !== (str = str.replace("-", " "))) { }
.replace only replaces the first match when given a string; use a regular expression with the g flag:
var b = datepickr.replace(/-/g,' ');
I'll leave it as an exercise to the reader to research regular expressions to the full.
(The important bit here, though, is the flag /g — global search)
Try this:
var datepickr = 'Jun-29-2011';
var b = datepickr.replace( /-/g, ' ' );
The /g causes it to replace every -, not just the first one.
var b = 'Jun-29-2011'.replace(/-/g, ' ');
Or:
var b = 'Jun-29-2011'.split('-').join(' ');
replace works with regular expressions, like so:
> "Hello-World-Hi".replace(/-/g, " ")
Hello World Hi
try:
var b = datepickr.toString().replace("-",' ');
I suspect that you are trying to replace chars inside a Date object.
