Regexp for finding all regexps in project - javascript

I need to optimize all regexps in a JavaScript project. I found all the ones created with new RegExp with a simple search. The problem is the ones created as literals: /asd/.
I am using PhpStorm, so the regexp engine is Java's. That means we have lookbehind. So I came up with this:
(?<=[\s=(,\[\?:;|)])\/[^*\n/][^\n/]*[^*]\/
This translates to: give me everything that looks like /.../ and is preceded by one of the following: \s = ( , [ ? : ; | ).
Can a regexp be preceded by anything else?
Do you have a better idea?
Searching for the methods used by the String and RegExp classes (exec, replace...) is not acceptable, because finding the declaration in some projects is very hard and takes a lot of time. Plus, the same regexp can be used in multiple places.

My regexp was a bit off. I used this eventually:
(?<=[\s=(,\[\?:;|)])\/[^\n/].*?\/
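The final pattern can also be tried outside the IDE. The following is a minimal sketch, assuming a JavaScript engine with lookbehind support (ES2018 or later); the function name findRegexLiterals is made up for illustration, and the slash inside the character class is escaped only to keep the literal unambiguous.

```javascript
// The heuristic from the question, written as a JavaScript regex literal.
// Requires lookbehind support (ES2018+).
const REGEX_LITERAL_HEURISTIC = /(?<=[\s=(,\[\?:;|)])\/[^\n\/].*?\//g;

// Hypothetical helper: returns every substring that looks like /.../
// and is preceded by one of the characters in the lookbehind class.
function findRegexLiterals(source) {
  return source.match(REGEX_LITERAL_HEURISTIC) || [];
}

console.log(findRegexLiterals("var re = /asd/;"));        // [ '/asd/' ]
console.log(findRegexLiterals('s.replace(/a+b/g, "")'));  // [ '/a+b/' ]

// Division is a known false positive for this kind of heuristic:
console.log(findRegexLiterals("ratio = a / b / c;"));     // [ '/ b /' ]
```

Note that this is a text heuristic, not a parser: flags are not part of the match, and a literal containing an escaped slash such as /a\/b/ is cut short at the first slash.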

Related

RegExp & PCRE convert to tree with own syntax

I am looking for a pre-processor for creating my own regular expression syntax, based on RegExp and PCRE syntax, so that it can be parsed to PCRE syntax. Examples are at the end.
I guess I need a regular expression processor that outputs a tree structure representing the regular expression, so I can traverse the tree and hot-swap some parts, then compile it back to a regular expression string.
But this processor must allow adding custom syntax parsing/processing.
Is there a processor like this, already made by someone? I made one myself some time ago, but I am looking for a more professional solution.
Of course, we are talking about node.js/JavaScript.
Yes, node.js does not support PCRE natively, but there is an npm module for using PCRE with node.js, and it works great!
Why would someone need it?
For example, you can build a big regular expression from smaller ones:
(John (like|love)s every (animal|creature) on earth: (#animals))
(#...) is a hash tag group: it is replaced by another regular expression containing alternatives for all animals.
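A minimal sketch of that substitution, using plain string replacement rather than a real parse tree. The helper name expandHashGroups, the definitions object and the animal list are all assumptions made up for this illustration, as is the choice to emit a non-capturing group.

```javascript
// Hypothetical helper: replaces each (#name) with the named sub-expression,
// wrapped in a non-capturing group so the alternation stays contained.
function expandHashGroups(pattern, definitions) {
  return pattern.replace(/\(#(\w+)\)/g, (whole, name) => {
    if (!Object.prototype.hasOwnProperty.call(definitions, name)) {
      return whole; // unknown names are left untouched
    }
    return "(?:" + definitions[name] + ")";
  });
}

// Illustrative data only.
const definitions = { animals: "cat|dog|horse" };

const expanded = expandHashGroups(
  "(John (like|love)s every (animal|creature) on earth: (#animals))",
  definitions
);

console.log(expanded);
// (John (like|love)s every (animal|creature) on earth: (?:cat|dog|horse))
```

A tree-based processor would do the same replacement on a node instead of on text, which avoids accidentally matching (#name) inside a character class or an escaped sequence.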
Another example: you can create more sophisticated kinds of groups:
(#(a|x)(b)(c))
A permutation group matches all of its bracketed subgroups (three here, but there can be fewer or more) in any order:
(a|x)(b)(c)
(a|x)(c)(b)
(b)(a|x)(c)
(b)(c)(a|x)
(c)(a|x)(b)
(c)(b)(a|x)
I have more examples, but I think I've made my point.
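The permutation group above could be expanded roughly as follows. This is only a sketch: it assumes the subgroups have already been split into an array of strings, and the names permutations and expandPermutationGroup are invented for the example.

```javascript
// Returns all orderings of the given array (n! results for n items).
function permutations(items) {
  if (items.length <= 1) return [items.slice()];
  const result = [];
  items.forEach((item, i) => {
    const rest = items.slice(0, i).concat(items.slice(i + 1));
    for (const tail of permutations(rest)) {
      result.push([item, ...tail]);
    }
  });
  return result;
}

// Hypothetical helper: turns the subgroups of a permutation group into one
// alternation that accepts them in any order.
function expandPermutationGroup(groups) {
  const alternatives = permutations(groups).map((order) => order.join(""));
  return "(?:" + alternatives.join("|") + ")";
}

const permuted = expandPermutationGroup(["(a|x)", "(b)", "(c)"]);
console.log(permuted);
// (?:(a|x)(b)(c)|(a|x)(c)(b)|(b)(a|x)(c)|(b)(c)(a|x)|(c)(a|x)(b)|(c)(b)(a|x))

console.log(new RegExp("^" + permuted + "$").test("cxb")); // true
```

The output grows factorially with the number of subgroups, so this approach is only practical for small groups.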

What are the differences between javascript and PCRE regular expressions? [duplicate]

I'm just a noob when it comes to regexp. I know Perl is amazing with regexp, and I don't know much Perl. I recently started learning JavaScript and came across regex for validating user inputs... I haven't used them much.
How does JavaScript regexp compare with Perl regexp? Similarities and differences?
Can all regexp(s) written in JS be used in Perl and vice-versa?
Similar syntax?
From ECMAScript 2018 onwards, many of JavaScript's regex deficiencies have been fixed.
It now supports lookbehind assertions, even unbounded ones.
Unicode property escapes have been added.
There finally is a DOTALL (/s) flag.
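The three ES2018 additions listed above can be checked quickly in any engine that implements them (recent Node.js and browsers do). A small sketch, with sample strings chosen only for illustration:

```javascript
// Lookbehind: match digits only when they are preceded by a dollar sign.
const amount = "cost: $42".match(/(?<=\$)\d+/)[0];

// Unicode property escape (needs the u flag): letters from any script.
const allLetters = /^\p{L}+$/u.test("日本語");

// dotAll flag: with s, the dot also matches line terminators.
const withFlag = /a.b/s.test("a\nb");
const withoutFlag = /a.b/.test("a\nb");

console.log(amount, allLetters, withFlag, withoutFlag);
// 42 true true false
```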
What is still missing:
JavaScript doesn't have a way to prevent backtracking by making matches final (using possessive quantifiers ++/*+/?+ or atomic groups (?>...)).
Recursive/balanced subgroup matching is not supported.
One other (cosmetic) thing is that JavaScript doesn't support verbose (free-spacing) regexes, which can make complex patterns harder to read.
Other than that, the basic regex syntax is very similar in both flavors.
This comparison will answer all your queries.
Another difference: before ES2018, JavaScript had no s modifier, so the dot "." never matched a newline character. As a replacement for ".", the character class [\s\S] can be used in JavaScript, which works like /./s in Perl.
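A short illustration of the [\s\S] workaround, which behaves the same in old and new engines. The sample text is made up:

```javascript
const text = "first line\nsecond line";

// The plain dot stops at the newline, so this cannot bridge the two lines.
const dotOnly = /first.*second/.test(text);

// [\s\S] matches any character, including the newline.
const anyChar = /first[\s\S]*second/.test(text);

console.log(dotOnly, anyChar); // false true
```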
I just ran into an instance where \d (decimal digit) is not recognized in some versions of JavaScript; you have to use [0-9] instead.

JavaScript equivalent of C#'s Char.IsSymbol

I'm trying to strip all 'Unicode Symbols' from a string. That is, keeping all multilingual characters but removing dingbats, arrows, and all of that stuff.
C# has a very handy function called Char.IsSymbol that can be run on all characters of a string, stripping the character when the function returns true.
I've been searching for a way to do something similar in JavaScript. If it's a regex, then how can I compile a list of all the Unicode ranges of the symbol characters? I looked at XRegExp but couldn't find something that only filters symbols.
XRegExp does have support for what you're looking for - http://xregexp.com/plugins/#unicode
You'd probably match either for \pL or \pS. You can find a nice list of the typical Unicode categories at http://www.regular-expressions.info/unicode.html#category
Overall, Unicode is quite tricky. It gives plenty of opportunities for trouble, especially with software that isn't fully Unicode compatible (sadly, this includes JavaScript - see https://mathiasbynens.be/notes/javascript-unicode for a nice set of examples). This is further exacerbated by the fact that JS often runs with double-encoding (HTML+JS, and there are worse cases as well). Somebody will probably find a way to bypass your checks, but I'm afraid there's no easy way to prevent that. Just be on the lookout :)

Java Regex replace function not working as intended

I need some help with a JS Regex.
Here's the string I'm passing, I want to delete everything before 'Hanyuu-sama' with JS Replace.
Hanyuu","dj":{"id":18,"djname":"Hanyuu-sama
The first and second "Hanyuu" can change, the id number can change. This has already been cropped quite a bit with regular expressions.
Now I've tried a few and surprisingly it's failing when I do simple and complex regexes:
I've tried:
.*\"
And it does nothing, I've tried disgusting stuff in my desperation:
.*\","dj\":{\"id":.*,\"djname\":\"
And nada.
Here's a JS Fiddle, and here's the regex on a JS regex testing platform: http://regex101.com/r/tE2uY0/1
Does anyone know why this isn't working?
I know this is likely bad practice, I'm just trying to learn Regexes.
Bonus points if anyone can refer me to a good source to learn Regular expressions. I'd love a solution but I'd like to learn how to do this myself in the future and why this one failed even more.
Your method call should look like this:
source = source.replace(/.*"/, "");
Regular expressions in JavaScript are written between /.../ and not "/.../" like they are in many other languages.
If your string is always structured like that and does not contain any more characters, your regex should do the trick. That's because the * quantifier is greedy by default, so .*" always matches up to the last " in the string.
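To make the difference concrete, here is a small sketch using the string from the question. The variable names are made up for the example:

```javascript
const source = 'Hanyuu","dj":{"id":18,"djname":"Hanyuu-sama';

// Regex literal, greedy: .* runs to the last double quote.
const greedy = source.replace(/.*"/, "");

// Lazy variant for comparison: stops at the first double quote.
const lazy = source.replace(/.*?"/, "");

// A quoted pattern is treated as plain text, so nothing is replaced.
const asString = source.replace('/.*"/', "");

console.log(greedy);              // Hanyuu-sama
console.log(lazy);                // ,"dj":{"id":18,"djname":"Hanyuu-sama
console.log(asString === source); // true
```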

