Dynamic replace in regular expression scope - javascript

I need to rewrite some require paths in JavaScript source files:
Example (foo => ../../../foo/baz):
var a = require('foo/a'); => var b = require('../../../foo/baz/a');
var a = require('foo/../b'); => var b = require('../../../foo/baz/../b');
Note: The replacement will run over complete JS source files, so require(' and ') must be used as delimiters!
So far we have come up with a setup like this:
var source = '';
source += "var a = require('foo/a');\n";
source += "var b = require('foo/../b');\n";
source += "console.log(a + b);";
var options = {
'foo': '../../../foo/baz'
};
for (var key in options) {
var regex = new RegExp('require[(](\"|\')' + key, 'g');
source = source.replace(regex, "require('" + options[key]);
}
console.log(source);
The code above works, but I am not sure it is safe, since it simply skips the closing delimiter.

I think this does it:
str = str.replace(/require\((['"])([^'"]*)foo\/([^'"]*)(['"])/g, "require($1$2../../../foo/baz/$3$4");
Here's that regex live: http://regex101.com/r/bE5jI4
Explanation:
require matches the characters require literally (case sensitive)
\( matches the character ( literally
1st Capturing group (['"])
['"] match either ' or " literally
2nd Capturing group ([^'"]*)
[^'"]* match a single character not present in the list below
Quantifier: Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
'" a single character in the list '" literally
foo matches the characters foo literally (case sensitive)
\/ matches the character / literally
3rd Capturing group ([^'"]*)
[^'"]* match a single character not present in the list below
Quantifier: Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
'" a single character in the list '" literally
4th Capturing group (['"])
['"] match ' or " literally
You may have to tweak it if there's optional whitespace before the opening quotes, or if your paths may contain ' or " characters. (In that latter case, you'll need two replacements, one when the wrapper quotes are ' and the other when they're ".)

This should work:
var source = '';
source += "var a = require('foo/a');\n";
source += "var b = require('foo/../b');\n";
source += "console.log(a + b);";
var options = {
'foo': '../../../foo/baz'
};
for (var key in options) {
var regex = new RegExp('(require)\\((["\'])(' + key + ')([^"\']*)\\2\\)', 'g');
source = source.replace(regex, "$1('" + options[key] + "$4')");
}
console.log(source);
OUTPUT:
var a = require('../../../foo/baz/a');
var b = require('../../../foo/baz/../b');
console.log(a + b);
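Both answers paste the key straight into the pattern and always emit one quote style. A minimal sketch that escapes the key, keeps the original quote character, and only matches the key as a whole path segment could look like this. The helper names escapeRegExp and rewriteRequires are hypothetical and do not come from either answer.

```javascript
// Hypothetical helper: escape characters that are special inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: rewrite require() path prefixes according to a mapping.
function rewriteRequires(source, mapping) {
  Object.keys(mapping).forEach(function (key) {
    // Group 1: opening quote. Group 2: optional "/rest/of/path".
    // \1 forces the closing quote to be the same as the opening one.
    var regex = new RegExp(
      'require\\(([\'"])' + escapeRegExp(key) + '(\\/[^\'"]*)?\\1\\)', 'g');
    // A replacer function avoids "$" in the new path being read as a pattern.
    source = source.replace(regex, function (match, quote, rest) {
      return 'require(' + quote + mapping[key] + (rest || '') + quote + ')';
    });
  });
  return source;
}

var input = "var a = require('foo/a');\n" +
  "var b = require(\"foo/../b\");\n" +
  "var c = require('foobar/x');";
var output = rewriteRequires(input, { 'foo': '../../../foo/baz' });
console.log(output);
// var a = require('../../../foo/baz/a');
// var b = require("../../../foo/baz/../b");
// var c = require('foobar/x');
```

The third line is left alone because foobar is not the segment foo. Whether that is the desired behaviour depends on how the keys are meant to be read.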

Related

String that doesn't contain character group

I wrote regex for finding urls in text:
/(http[^\s]+)/g
But now I need the same thing, except that the match must not contain a certain substring. For instance, I want all URLs that do not contain the word google.
How can I do that?
Here is a way to achieve that:
http:\/\/(?!\S*google)\S+
JS:
var re = /http:\/\/(?!\S*google)\S+/g;
var str = 'http://ya.ru http://yahoo.com http://google.com';
var m;
while ((m = re.exec(str)) !== null) {
document.getElementById("r").innerHTML += m[0] + "<br/>";
}
<div id="r"/>
Regex breakdown:
http:\/\/ - a literal sequence of http://
(?!\S*google) - a negative lookahead that checks forward from the current position (i.e. right after http://); if it finds zero or more non-whitespace characters followed by google, the match is cancelled.
\S+ - 1 or more non-whitespace symbols (this is necessary since the lookahead above does not really consume the characters it matches).
Note that if the URL may be followed by punctuation (such as a comma), you can add \b at the end of the pattern so that the trailing punctuation is not included in the match:
var re1 = /http:\/\/(?!\S*google)\S+/g;
var re2 = /http:\/\/(?!\S*google)\S+\b/g;
document.write(
JSON.stringify(
'http://ya.ru, http://yahoo.com, http://google.com'.match(re1)
) + "<br/>"
);
document.write(
JSON.stringify(
'http://ya.ru, http://yahoo.com, http://google.com'.match(re2)
)
);
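If the lookahead feels hard to read, the same result can be reached in two steps: match every URL first, then filter out the ones containing the unwanted word. This is a different technique from the lookahead above, shown only as a sketch. The helper name urlsWithout is hypothetical, and accepting https as well as http is an assumption.

```javascript
// Hypothetical helper: return the URLs found in `text` that do not contain `banned`.
function urlsWithout(text, banned) {
  // match() returns null when nothing is found, so fall back to an empty array.
  var all = text.match(/https?:\/\/\S+/g) || [];
  return all.filter(function (url) {
    return url.indexOf(banned) === -1;
  });
}

var found = urlsWithout('http://ya.ru http://yahoo.com https://google.com/maps', 'google');
console.log(found); // [ 'http://ya.ru', 'http://yahoo.com' ]
```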

Extract string when preceding number or combo of preceding characters is unknown

Here's an example string:
++++#foo+bar+baz++#yikes
I need to extract foo and only foo from there or a similar scenario.
The + and the # are the only characters I need to worry about.
However, regardless of what precedes foo, it needs to be stripped or ignored. Everything else after it needs to as well.
Try this:
/\++#(\w+)/
and read capturing group 1.
You can simply use the match() method.
var str = "++++#foo+bar+baz++#yikes";
var res = str.match(/\w+/g);
console.log(res[0]); // foo
console.log(res); // foo,bar,baz,yikes
Or use exec
var str = "++++#foo+bar+baz++#yikes";
var match = /(\w+)/.exec(str);
alert(match[1]); // foo
Using exec with a g modifier (global) is meant to be used in a loop getting all sub matches.
var str = "++++#foo+bar+baz++#yikes";
var re = /\w+/g;
var match;
while (match = re.exec(str)) {
// In array form, match is now your next match..
}
How exactly do + and # play a role in identifying foo? If you just want any string that follows # and is terminated by + that's as simple as:
var foostring = '++++#foo+bar+baz++#yikes';
var matches = /#([^+]+)\+/.exec(foostring);
if (matches) { // exec returns null when nothing matches
// matches[0] is the whole match; matches[1] is the captured token
alert('found ' + matches[1] + '!'); // alerts 'found foo!'
}
To help you more specifically, please provide information about the possible variations of your data and how you would go about identifying the token you want to extract even in cases of differing lengths and characters.
If you are just looking for the first segment of text preceded and followed by any combination of + and #, then use:
var foostring = '++++#foo+bar+baz++#yikes';
var result = foostring.match(/[^+#]+/);
// will be the single-element array, ['foo'], or null.
Depending on your data, using \w may be too restrictive as it is equivalent to [a-zA-Z0-9_]. Does your data have anything else such as punctuation, dashes, parentheses, or any other characters that you do want to include in the match? Using the negated character class I suggest will catch every token that does not contain a + or a #.
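A small comparison makes the difference visible. The sample string with a dash inside the token is made up for illustration.

```javascript
// Made-up sample: the first token contains a dash.
var sample = '++++#foo-bar+baz++#yikes';

var wordMatch = sample.match(/\w+/);     // \w stops at the dash
var tokenMatch = sample.match(/[^+#]+/); // stops only at + or #

console.log(wordMatch[0]);  // foo
console.log(tokenMatch[0]); // foo-bar
```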

How to replace css('background-image')

I want to replace css('background-image') path.
The problem:
for the same variable oldBgImg = this.element.css('background-image')
Firefox returns:
"url("http://mySite/images/file1.png")"
but Chrome returns it without the quotes:
"url(http://mySite/images/file1.png)"
Here is the solution I use. Can you please help me make it simpler?
var oldBgImg = this.element.css('background-image');
// => FF: "url("http://mySite/images/file1.png")"
// Chrome: "url(http://mySite/images/file1.png)"
// According to http://www.w3.org/TR/CSS2/syndata.html#value-def-uri :
// quotes are optional, so Chrome does not use them, but FF does . . .
var n1 = oldBgImg.lastIndexOf("("); n1 += 1; // now points to the char after the "("
var n2 = oldBgImg.lastIndexOf(")"); n2 -= 1; // now points to the char before the ")"
var c1 = oldBgImg.substring(n1, n1 + 1); // test the first Char after the "("
var c2 = oldBgImg.substring(n2, n2 + 1); // test the last char before the ")"
if ( (c1 == "\"") || (c1 == "\'") ) { n1 += 1; }
if ( (c2 == "\"") || (c2 == "\'") ) { n2 -= 1; }
var oldBgImgPath = oldBgImg.substring(n1, n2 + 1); // [ (" ] .. [ ") ]
var n = oldBgImgPath.lastIndexOf("/");
var newBgImgPath = oldBgImgPath.substring(0, n + 1) + "file2.gif";
// if needed, should also add :
// var path = encodeURI(newBgImgPath);
this.element.css('background-image', 'url(' + newBgImgPath + ')');
Notes:
According to http://www.w3.org/TR/CSS2/syndata.html#value-def-uri
one can use single quote or double-quote or no quote sign
I am looking for a general solution, also for relative path (without "http" or with "file") , I just want to replace the fileName within the URL.
Here's an example of how to do it with regular expressions.
The expression:
("?)(http:.*?)\1\)
The match
url = 'url("http://mySite/images/file1.png")'.match(/("?)(http:.*?)\1\)/)[2];
You can then reconstruct your property.
$(this).css( 'background-image', 'url("' + url + '")' );
This should work on all browsers.
I did it with regular expressions. I use this code:
var re = /url\(['"]?(.+?)[^\/]+['"]?\)/;
var regs = re.exec(oldBgImg);
var newBgImgPath = regs[1] + "file2.png";
I'll explain the RE.
It starts with a /, this will indicate it's a RE.
Then there's url\(. It matches the text url(. ( is escaped because it is a reserved character.
Then there is ['"]?. ['"] matches ' or " and the ? makes it optional.
A ( starts a RE group, that can be referred to.
In .+?, the . matches any character except a newline. The + means there must be one or more of them. Finally, the ? makes the + non-greedy, so it matches as few characters as possible while still letting the whole RE match.
A ) ends the group.
[^\/] matches any non-/ character. Then there's a + again. It has no ? after it, because we want to match as many non-/ characters (the file name) from the end as we can.
Finally, another optional quote, an escaped ) for the closing bracket in url(...) and a / to end the RE.
Now re.exec(oldBgImg) returns an array with the first element being the whole matched string and the next elements being the matched RE groups (created by () brackets). Then I can just take regs[1], which is the first matched group and contains the pathname.
You could replace the quotes in oldBgImg with nothing like this.
oldBgImg = oldBgImg.replace(/\"/g, "");
That way the URL is always the same no matter what browser retrieved it.
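Putting the ideas from these answers together, a single helper can accept the quoted and unquoted forms, as well as relative paths, and swap only the file name. This is a sketch under assumptions: the function name replaceUrlFileName is hypothetical, the value is assumed to be a single url(...) with no spaces inside the parentheses, and the result is always written with double quotes.

```javascript
// Hypothetical helper: swap the file name inside a CSS url(...) value.
function replaceUrlFileName(cssValue, newFileName) {
  // Group 1: optional opening quote.
  // Group 2: optional directory part, up to and including the last "/".
  // Then the old file name, then the same quote as group 1, then ")".
  var re = /^url\((['"]?)(.*\/)?[^\/'")]+\1\)$/;
  var m = re.exec(cssValue);
  if (!m) {
    return cssValue; // not a url(...) value (for example "none"): leave it alone
  }
  return 'url("' + (m[2] || '') + newFileName + '")';
}

console.log(replaceUrlFileName('url("http://mySite/images/file1.png")', 'file2.gif'));
// url("http://mySite/images/file2.gif")
console.log(replaceUrlFileName('url(http://mySite/images/file1.png)', 'file2.gif'));
// url("http://mySite/images/file2.gif")
console.log(replaceUrlFileName("url('images/file1.png')", 'file2.gif'));
// url("images/file2.gif")
```

With jQuery this would be used as this.element.css('background-image', replaceUrlFileName(oldBgImg, 'file2.gif')).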

Javascript regular expression - matching multiple occurrences

I'm a little stuck on a problem here.
I'm trying to match multiple occurrences of a regular expression in a string, but I don't get all occurrences:
Sample:
s = new RegExp(';' + y[p][0] + '_' + y[p][1] + '_' + y[p][2] + '_([0-9]*);', 'g');
e = null;
e = s.exec(grArr);
while (e != null) {
alert(e[0]+'-'+e[1]+'-'+e[2]); //debugging output
r = r + e[0]; //adding results to output var
e = s.exec(grArr);
}
Sample variables:
//to be searched:
var grArr=';0_0_709711498101583267971121121179999105110111_11994876;0_0_709711498101583267971121121179999105110111_11994877;0_0_709711498101583267971121121179999105110111_11994878;0_0_709711498101583267971121121179999105110111_11994879;0_0_709711498101583268117110107101108103114252110_11994872;0_0_709711498101583268117110107101108103114252110_11994873;0_0_709711498101583268117110107101108103114252110_11994874;0_0_709711498101583268117110107101108103114252110_11994875;0_0_7097114981015832839910411997114122_11994868;0_0_7097114981015832839910411997114122_11994869;0_0_7097114981015832839910411997114122_11994870;0_0_7097114981015832839910411997114122_11994871;0_1_71114246115115101583276_11994870;0_1_71114246115115101583276_11994874;0_1_71114246115115101583276_11994878;0_1_71114246115115101583277_11994869;0_1_71114246115115101583277_11994873;0_1_71114246115115101583277_11994877;0_1_71114246115115101583283_11994868;0_1_71114246115115101583283_11994872;0_1_71114246115115101583283_11994876;0_1_7111424611511510158328876_11994871;0_1_7111424611511510158328876_11994875;0_1_7111424611511510158328876_11994879;'
//search Pattern:
y[0][0]='0';
y[0][1]='1';
y[0][2]='71114246115115101583283';
This results in 2 occurrences - not 3 as it should be.
The problem is that you're using the semicolon twice: Once at the start of the regex, once at the end.
Since in your example the three "matches" directly follow each other, the second occurrence is not found because its preceding semicolon has already been used in the previous match.
Solution: Use word boundaries ('\\b') instead of ';' in your regex.
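The effect is easy to reproduce on a shortened, made-up string. As an alternative to word boundaries (an assumption, not part of the answer above), a lookahead can check for the closing semicolon without consuming it, so the next match can still start on it. The helper name collect is hypothetical.

```javascript
// Made-up, shortened data in the same ";a_b_c_id;" layout as the question.
var data = ';0_1_7_100;0_1_7_101;0_1_7_102;0_1_8_103;';

var consuming = /;0_1_7_([0-9]*);/g;     // consumes the closing ";"
var lookahead = /;0_1_7_([0-9]*)(?=;)/g; // only checks that ";" follows

// Hypothetical helper: collect capture group 1 of every match.
function collect(re, text) {
  var out = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    out.push(m[1]);
  }
  return out;
}

console.log(collect(consuming, data)); // [ '100', '102' ]
console.log(collect(lookahead, data)); // [ '100', '101', '102' ]
```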

Regular expression to parse jQuery-selector-like string

text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
regex = /(.*?)\.filter\((.*?)\)/;
matches = text.match(regex);
log(matches);
// matches[1] is '#container a'
// matches[2] is '.top'
I expect to capture
matches[1] is '#container a'
matches[2] is '.top'
matches[3] is '.bottom'
matches[4] is '.middle'
One solution would be to split the string into #container a and rest. Then take rest and execute recursive exec to get item inside ().
Update: I am posting a solution that does work. However I am looking for a better solution. Don't really like the idea of splitting the string and then processing
Here is a solution that works.
matches = [];
var text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
var regex = /(.*?)\.filter\((.*?)\)/;
var match = regex.exec(text);
firstPart = match[1]; // text before the first .filter(
rest = text.substring(match.index + firstPart.length); // everything from the first .filter( onwards
matches.push(firstPart);
regex = /\.filter\((.*?)\)/g;
while ((match = regex.exec(rest)) != null) {
matches.push(match[1]);
}
log(matches);
Looking for a better solution.
This will match the single example you posted:
<html>
<body>
<script type="text/javascript">
text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
matches = text.match(/^[^.]*|\.[^.)]*(?=\))/g);
document.write(matches);
</script>
</body>
</html>
which produces:
#container a,.top,.bottom,.middle
EDIT
Here's a short explanation:
^ # match the beginning of the input
[^.]* # match any character other than '.' and repeat it zero or more times
#
| # OR
#
\. # match the character '.'
[^.)]* # match any character other than '.' and ')' and repeat it zero or more times
(?= # start positive look ahead
\) # match the character ')'
) # end positive look ahead
EDIT part II
The regex looks for two types of character sequences:
zero or more characters starting from the start of the string up to the first ., the regex: ^[^.]*
or it matches a character sequence starting with a . followed by zero or more characters other than . and ), \.[^.)]*, but must have a ) ahead of it: (?=\)). This last requirement causes .filter not to match.
You have to iterate, I think.
var head, filters = [];
text.replace(/^([^.]*)(\..*)$/, function(_, h, rem) {
head = h;
rem.replace(/\.filter\(([^)]*)\)/g, function(_, f) {
filters.push(f);
});
});
console.log("head: " + head + " filters: " + filters);
The ability to use functions as the second argument to String.replace is one of my favorite things about Javascript :-)
You need to do several matches repeatedly, starting where the last match ends (see while example at https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/RegExp/exec):
If your regular expression uses the "g" flag, you can use the exec method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property. For example, assume you have this script:
var myRe = /ab*/g;
var str = "abbcdefabh";
var myArray;
while ((myArray = myRe.exec(str)) != null)
{
var msg = "Found " + myArray[0] + ". ";
msg += "Next match starts at " + myRe.lastIndex;
print(msg);
}
This script displays the following text:
Found abb. Next match starts at 3
Found ab. Next match starts at 9
However, this case would be better solved using a custom-built parser. Regular expressions are not an effective solution to this problem, if you ask me.
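A hand-written scanner for this particular format can be very small. The sketch below uses only indexOf and substring. The name parseFilters is hypothetical, and the code assumes filter arguments never contain a closing parenthesis.

```javascript
// Hypothetical parser: turn "head.filter(a).filter(b)" into [head, a, b].
function parseFilters(text) {
  var marker = '.filter(';
  var start = text.indexOf(marker);
  if (start === -1) {
    return [text]; // no filters at all: the whole string is the head
  }
  var parts = [text.substring(0, start)];
  while (start !== -1) {
    var open = start + marker.length;    // first char of the argument
    var close = text.indexOf(')', open); // assumes no nested parentheses
    if (close === -1) {
      break; // unbalanced input: stop and return what was found so far
    }
    parts.push(text.substring(open, close));
    start = text.indexOf(marker, close + 1);
  }
  return parts;
}

var parsed = parseFilters('#container a.filter(.top).filter(.bottom).filter(.middle)');
console.log(parsed); // [ '#container a', '.top', '.bottom', '.middle' ]
```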
var text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
var result = text.split('.filter');
console.log(result[0]);
console.log(result[1]);
console.log(result[2]);
console.log(result[3]);
text.split() with regex does the trick.
var text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
var parts = text.split(/(\.[^.()]+)/);
var matches = [parts[0]];
for (var i = 3; i < parts.length; i += 4) {
matches.push(parts[i]);
}
console.log(matches);
