Match a word unless it is preceded by an equals sign? - javascript

I have the following string
class=use><em>use</em>
that when searched using us I want to transform into
class=use><em><b>us</b>e</em>
I've tried looking at relating answers but I can't quite get it working the way I want it to. I'm especially interested in this answer's callback approach.
Help appreciated

This is a good exercise for writing regular expressions, and here's a possible solution.
"useclass=use><em>use</em>".replace(/([^=]|^)(us)/g, "$1<b>$2</b>");
// returns "<b>us</b>eclass=use><em><b>us</b>e</em>"
([^=]|^) ensures that the prefix of any matched us is either not an equal sign, or it's the start of the string.
As #jamiec pointed out in the comments, if you are using this to parse/modify HTML, just stop right now. It's mathematically impossible to parse a CFG with a regular grammar (even with enhanced JS regexps you will have a bad time trying to achieve that.)

If you can make any assumptions about the structure of your document, you may be better off using an approach that operates on DOM elements directly rather than parsing the whole document with a regex.
Parsing HTML with a regex has certain problems that can be painful to deal with.
var element = document.querySelector('em');
element.innerHTML = element.innerHTML.replace('us', '<b>us</b>');
<div class=use><em>use</em>
</div>

I would first look for any character other than the equals sign [^=] and separate it by parentheses so that I can use it again in my replacement. Then another set of parentheses around the two characters us ought to do it:
var re = /([^=]|^)(us)/
That will give you two capture groups to work with (inside the parentheses), which you can represent with $1 and $2 in your replacement string.
str.replace( /([^=|^])(us)/, '$1<b>$2</b>' );

Related

How do i allow only one (dash or dot or underscore) in a user form input using regular expression in javascript?

I'm trying to implement a username form validation in javascript where the username
can't start with numbers
can't have whitespaces
can't have any symbols but only One dot or One underscore or One dash
example of a valid username: the_user-one.123
example of invalid username: 1----- user
i've been trying to implement this for awhile but i couldn't figure out how to have only one of each allowed symbol:-
const usernameValidation = /(?=^[\w.-]+$)^\D/g
console.log(usernameValidation.test('1username')) //false
console.log(usernameValidation.test('username-One')) //true
How about using a negative lookahead at the start:
^(?!\d|.*?([_.-]).*\1)[\w.-]+$
This will check if the string
neither starts with digit
nor contains two [_.-] by use of capture and backreference
See this demo at regex101 (more explanation on the right side)
Preface: Due to my severe carelessness, I assumed the context was usage of the HTML pattern attribute instead of JavaScript input validation. I leave this answer here for posterity in case anyone really wants to do this with regex.
Although regex does have functionality to represent a pattern occuring consecutively within a certain number of times (via {<lower-bound>,<upper-bound>}), I'm not aware of regex having "elegant" functionality to enforce a set of patterns each occuring within a range of number of times but in any order and with other patterns possibly in between.
Some workarounds I can think of:
Make a regex that allows for one of each permutation of ordering of special characters (note: newlines added for readability):
^(?:
(?:(?:(?:[A-Za-z][A-Za-z0-9]*\.?)|\.)[A-Za-z0-9]*-?[A-Za-z0-9]*_?)|
(?:(?:(?:[A-Za-z][A-Za-z0-9]*\.?)|\.)[A-Za-z0-9]*_?[A-Za-z0-9]*-?)|
(?:(?:(?:[A-Za-z][A-Za-z0-9]*-?)|-)[A-Za-z0-9]*\.?[A-Za-z0-9]*_?)|
(?:(?:(?:[A-Za-z][A-Za-z0-9]*-?)|-)[A-Za-z0-9]*_?[A-Za-z0-9]*\.?)|
(?:(?:(?:[A-Za-z][A-Za-z0-9]*_?)|_)[A-Za-z0-9]*\.?[A-Za-z0-9]*-?)|
(?:(?:(?:[A-Za-z][A-Za-z0-9]*_?)|_)[A-Za-z0-9]*-?[A-Za-z0-9]*\.?)
)[A-Za-z0-9]*$
Note that the above regex can be simplified if you don't want usernames to start with special characters either.
Friendly reminder to also make sure you use the HTML attributes to enforce a minimum and maximum input character length where appropriate.
If you feel that regex isn't well suited to your use-case, know that you can do custom validation logic using javascript, which gives you much more control and can be much more readable compared to regex, but may require more lines of code to implement. Seeing the regex above, I would personally seriously consider the custom javascript route.
Note: I find https://regex101.com/ very helpful in learning, writing, and testing regex. Make sure to set the "flavour" to "JavaScript" in your case.
I have to admit that Bobble bubble's solution is the better fit. Here ia a comparison of the different cases:
console.log("Comparison between mine and Bobble Bubble's solution:\n\nusername mine,BobbleBubble");
["valid-usrId1","1nvalidUsrId","An0therVal1d-One","inva-lid.userId","anot-her.one","test.-case"].forEach(u=>console.log(u.padEnd(20," "),chck(u)));
function chck(s){
return [!!s.match(/^[a-zA-Z][a-zA-Z0-9._-]*$/) && ( s.match(/[._-]/g) || []).length<2, // mine
!!s.match(/^(?!\d|.*?([_.-]).*\1)[\w.-]+$/)].join(","); // Bobble bulle
}
The differences can be seen in the last three test cases.

Get an example matched text from a regex pattern [duplicate]

Is there any way of generating random text which satisfies provided regular expression.
I am looking for a function which works like below
var reg = Some Regular Expression
var str = RandString(reg)
I have seen fairly good solutions in perl and ruby on github, but I think there are technical issues that make a complete solution impossible. For example, /[0-9]+/ has an infinite upper bound, which is not practical for selecting random numbers from.
Never seen it in JavaScript, but you could translate.
EDIT: After googling for a few seconds...
https://github.com/fent/randexp.js
if you know what the regular expression is, you can just generate random strings, then use a function that references the index of the letters and changes them as needed. Regex expressions vary widely, so it will be difficult to find one in particular that satisfies all possible regex.
Your question is pretty open so hopefully this steers you to the right solution. Get the current time (in seconds), MD5 it, check it against a REGEX, return the match.
Running Example: http://jsfiddle.net/MattLo/3gKrb/
Usage: RandString(/([A-Za-z])/ig); // expected to be a string
For JavaScript, the following modules can generate a random match to a regex:
pxeger
randexp.js
regexgen

Why would the replace with regex not work even though the regex does?

There may be a very simple answer to this, probably because of my familiarity (or possibly lack thereof) of the replace method and how it works with regex.
Let's say I have the following string: abcdefHellowxyz
I just want to strip the first six characters and the last four, to return Hello, using regex... Yes, I know there may be other ways, but I'm trying to explore the boundaries of what these methods are capable of doing...
Anyway, I've tinkered on http://regex101.com and got the following Regex worked out:
/^(.{6}).+(.{4})$/
Which seems to pass the string well and shows that abcdef is captured as group 1, and wxyz captured as group 2. But when I try to run the following:
"abcdefHellowxyz".replace(/^(.{6}).+(.{4})$/,"")
to replace those captured groups with "" I receive an empty string as my final output... Am I doing something wrong with this syntax? And if so, how does one correct it, keeping my original stance on wanting to use Regex in this manner...
Thanks so much everyone in advance...
The code below works well as you wish
"abcdefHellowxyz".replace(/^.{6}(.+).{4}$/,"$1")
I think that only use ()to capture the text you want, and in the second parameter of replace(), you can use $1 $2 ... to represent the group1 group2.
Also you can pass a function to the second parameter of replace,and transform the captured text to whatever you want in this function.
For more detail, as #Akxe recommend , you can find document on https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace.
You are replacing any substring that matches /^(.{6}).+(.{4})$/, with this line of code:
"abcdefHellowxyz".replace(/^(.{6}).+(.{4})$/,"")
The regex matches the whole string "abcdefHellowxyz"; thus, the whole string is replaced. Instead, if you are strictly stripping by the lengths of the extraneous substrings, you could simply use substring or substr.
Edit
The answer you're probably looking for is capturing the middle token, instead of the outer ones:
var str = "abcdefHellowxyz";
var matches = str.match(/^.{6}(.+).{4}$/);
str = matches[1]; // index 0 is entire match
console.log(str);

regex replace on JSON is removing an Object from Array

I'm trying to improve my understanding of Regex, but this one has me quite mystified.
I started with some text defined as:
var txt = "{\"columns\":[{\"text\":\"A\",\"value\":80},{\"text\":\"B\",\"renderer\":\"gbpFormat\",\"value\":80},{\"text\":\"C\",\"value\":80}]}";
and do a replace as follows:
txt.replace(/\"renderer\"\:(.*)(?:,)/g,"\"renderer\"\:gbpFormat\,");
which results in:
"{"columns":[{"text":"A","value":80},{"text":"B","renderer":gbpFormat,"value":80}]}"
What I expected was for the renderer attribute value to have it's quotes removed; which has happened, but also the C column is completely missing! I'd really love for someone to explain how my Regex has removed column C?
As an extra bonus, if you could explain how to remove the quotes around any value for renderer (i.e. so I don't have to hard-code the value gbpFormat in the regex) that'd be fantastic.
You are using a greedy operator while you need a lazy one. Change this:
"renderer":(.*)(?:,)
^---- add here the '?' to make it lazy
To
"renderer":(.*?)(?:,)
Working demo
Your code should be:
txt.replace(/\"renderer\"\:(.*?)(?:,)/g,"\"renderer\"\:gbpFormat\,");
If you are learning regex, take a look at this documentation to know more about greedyness. A nice extract to understand this is:
Watch Out for The Greediness!
Suppose you want to use a regex to match an HTML tag. You know that
the input will be a valid HTML file, so the regular expression does
not need to exclude any invalid use of sharp brackets. If it sits
between sharp brackets, it is an HTML tag.
Most people new to regular expressions will attempt to use <.+>. They
will be surprised when they test it on a string like This is a
first test. You might expect the regex to match and when
continuing after that match, .
But it does not. The regex will match first. Obviously not
what we wanted. The reason is that the plus is greedy. That is, the
plus causes the regex engine to repeat the preceding token as often as
possible. Only if that causes the entire regex to fail, will the regex
engine backtrack. That is, it will go back to the plus, make it give
up the last iteration, and proceed with the remainder of the regex.
Like the plus, the star and the repetition using curly braces are
greedy.
Try like this:
txt = txt.replace(/"renderer":"(.*?)"/g,'"renderer":$1');
The issue in the expression you were using was this part:
(.*)(?:,)
By default, the * quantifier is greedy by default, which means that it gobbles up as much as it can, so it will run up to the last comma in your string. The easiest solution would be to turn that in to a non-greedy quantifier, by adding a question mark after the asterisk and change that part of your expression to look like this
(.*?)(?:,)
For the solution I proposed at the top of this answer, I also removed the part matching the comma, because I think it's easier just to match everything between quotes. As for your bonus question, to replace the matched value instead of having to hardcode gbpFormat, I used a backreference ($1), which will insert the first matched group into the replacement string.
Don't manipulate JSON with regexp. It's too likely that you will break it, as you have found, and more importantly there's no need to.
In addition, once you have changed
'{"columns": [..."renderer": "gbpFormat", ...]}'
into
'{"columns": [..."renderer": gbpFormat, ...]}' // remove quotes from gbpFormat
then this is no longer valid JSON. (JSON requires that property values be numbers, quoted strings, objects, or arrays.) So you will not be able to parse it, or send it anywhere and have it interpreted correctly.
Therefore you should parse it to start with, then manipulate the resulting actual JS object:
var object = JSON.parse(txt);
object.columns.forEach(function(column) {
column.renderer = ghpFormat;
});
If you want to replace any quoted value of the renderer property with the value itself, then you could try
column.renderer = window[column.renderer];
Assuming that the value is available in the global namespace.
This question falls into the category of "I need a regexp, or I wrote one and it's not working, and I'm not really sure why it has to be a regexp, but I heard they can do all kinds of things, so that's just what I imagined I must need." People use regexps to try to do far too many complex matching, splitting, scanning, replacement, and validation tasks, including on complex languages such as HTML, or in this case JSON. There is almost always a better way.
The only time I can imagine wanting to manipulate JSON with regexps is if the JSON is broken somehow, perhaps due to a bug in server code, and it needs to be fixed up in order to be parseable.

Getting this regex expression to work in javascript

I have an html checkbox element with the following name:
type_config[selected_licenses][CC BY-NC-ND 3.0]
I would like to break this name apart as follows and returned as part of an array:
["type_config", "[selected_licenses]", "[CC BY-NC-ND 3.0]", "[selected_licenses][CC BY-NC-ND 3.0]"]
I thought I could do this by using a regular expression in javascript. Here is the expression that I am using:
matches = /([a-zA-Z0-9_]*)((\[[a-zA-Z0-9_\.\s]*\])+)*/.exec(element_name);
but this is the result I am getting in my matches variable:
["type_config[selected_licenses]", "type_config", "[selected_licenses]", "[selected_licenses]", index: 0, input: "type_config[selected_licenses][CC BY-ND 3.0]"]
I am half way there. What am I doing wrong in my regular expression? I guess I should also ask if it is possible to accomplish what I want with a regex?
Thanks.
The problem with this kind of goal is that there's no simple way to achieve this with regular expression, i.e. a simple match call. In short, even if you put a quantifier after a capturing group, the captured string will always be just one.
You'll have to rely on something more specific, like breaking the string with a repeated use of indexOf, or something like
name.split(/(?=\[)/);
Maybe you want to be sure that name is formally correct.
This is a very ugly problem. I don't know how repeatable this is, but I can do it:
Regex
^(\w+)(?<firstbracket>\[(?<secondbracket>[^]]*)\]\[(.*?)\])$
Replacement
["$1", "[$3]", "[$4]", "$2"]
Demo
http://regex101.com/r/eD9mH8

Categories

Resources