find regex for validating terms (keyword) input

Unfortunately I'm poor at regex! Can you guide me in writing a JavaScript regex that validates my terms input box? A user should enter terms in this format:
#(all alphanumeric chars + blank + dash + quotation )
for example:
#keyword1#key word2#keyword3#key-word4#key'word5
and these inputs should be illegal:
#####
##keyword1#key2#
# #keyword
#!%^&

As you wrote, a term is specified by:
/#[a-zA-Z0-9 '-]+/
Repeat that pattern, and anchor it to the start and end of the string with ^ and $:
/^(#[a-zA-Z0-9 '-]+)+$/
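A quick way to check the anchored pattern is to run it against the examples from the question. The sketch below does that. Note that because a blank is part of the character class, "# #keyword" passes this pattern even though the question lists it as illegal. The variant strictTermsPattern is a hypothetical tightening (an assumption, not part of the answer) that requires a letter or digit right after each "#".

```javascript
// Pattern from the answer above.
const termsPattern = /^(#[a-zA-Z0-9 '-]+)+$/;

// Hypothetical stricter variant (assumption): the first character
// after each "#" must be a letter or digit, so "# " is rejected.
const strictTermsPattern = /^(#[a-zA-Z0-9][a-zA-Z0-9 '-]*)+$/;

const samples = [
  "#keyword1#key word2#keyword3#key-word4#key'word5", // legal example
  "#####",
  "##keyword1#key2#",
  "# #keyword",
  "#!%^&",
];

// Print how each pattern judges each sample.
for (const sample of samples) {
  console.log(
    JSON.stringify(sample),
    "answer pattern:", termsPattern.test(sample),
    "strict variant:", strictTermsPattern.test(sample)
  );
}
```

If the single-character keyword case or leading dashes matter for your input box, adjust the first character class accordingly.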

/#[a-zA-Z0-9][a-zA-Z0-9 '-]+/
When you said "# #keyword" should be invalid, I've assumed you mean "# " should be invalid and "#keyword" should be extracted from that string. The first bracket expression means a keyword will always begin with a lowercase letter, uppercase letter, or digit. If that's too restrictive and you want to allow, for example, "#-keyword", just add a dash before the first closing square bracket, like so:
/#[a-zA-Z0-9-][a-zA-Z0-9 '-]+/
And to return an array of results in JavaScript, apply it to the string using the "global" modifier ('g' after the second slash):
arrayOfKeywords = keywordString.match(/#[a-zA-Z0-9-][a-zA-Z0-9 '-]+/g);
You may wish to see this code at my test page. Regular-expressions.info is a useful site to learn more about regular expressions. They also have an interactive page to test regexes on, which can be useful when playing around.
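As a small illustration of the extraction approach, the sketch below wraps the match call in a helper. The name extractKeywords is hypothetical, and the fallback to an empty array is an added assumption, since match() returns null when nothing matches.

```javascript
// Extraction pattern from the answer (first character must be a letter or digit).
const keywordPattern = /#[a-zA-Z0-9][a-zA-Z0-9 '-]+/g;

// Hypothetical helper: returns every keyword found, or [] when there is none.
function extractKeywords(keywordString) {
  return keywordString.match(keywordPattern) || [];
}

console.log(extractKeywords("#keyword1#key word2#key-word4"));
console.log(extractKeywords("# #keyword"));
console.log(extractKeywords("#####"));
```

Because the pattern needs one leading character plus at least one more, a one-character keyword such as "#a" is not extracted.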

Related

Regex in Google Apps Script practical issue. Forms doesn't read regex as it should

I hope it's just something I'm not doing right.
I've been using a simple script to create a form from a spreadsheet. The script seems to be working fine. The resulting form will collect input from third parties so I can analyze it in my consulting work.
Creating the form was not a big deal; the structure is good to go. However, once the form-creator script was working, I started on its validations, and that's where I'm stuck.
For text validations, I need specific regexes. Many of the inputs my clients give me will be names of places and/or people, so I should only allow A-Z, single spaces, apostrophes and dashes.
My resulting regexes are:
//Regex allowing a **single name** with the first letter capitalized and the occasional use of "apostrophes" or "dashes".
const reg1stName = /^[A-Z]([a-z\'\-])+/
//Should allow (a single name/surname) like Paul, D'urso, Mac'arthur, Saint-Germaine etc.
//Regex allowing **composite names and places names** with the first letter capitalized and the occasional use of "apostrophes" or "dashes". It must avoid double spaces, however.
const regNamesPlaces = /^[^\s]([A-Z]|[a-z]|\b[\'\- ])+[^\s]$/
//This should allow (names/surnames/places' names) like Giulius Ceasar, Joanne D'arc, Cosimo de'Medici, Cosimo de Medici, Jean-jacques Rousseau, Firenze, Friuli Venezia-giulia, L'aquila etc.
Further in the script, these regexes are used as validation patterns for the form's text items, according to each case.
//Validation for single names
var val1stName = FormApp.createTextValidation()
  .setHelpText("Only the person First Name Here! Use only (A-Z), a single apostrophe (') or a single dash (-).")
  .requireTextMatchesPattern(reg1stName)
  .build();
//Validation for composite names and places names
var valNamesPlaces = FormApp.createTextValidation()
  .setHelpText("Careful with double spaces, ok? Use only (A-Z), a single apostrophe (') or a single dash (-).")
  .requireTextMatchesPattern(regNamesPlaces)
  .build();
Further on, I have a "for" loop that creates the form based on the spreadsheet's fields. Up to this point, things are working just fine.
for (var i = 0; i < numberRows; i++) {
  var questionType = data[i][0];
  if (questionType == '') {
    continue;
  }
  else if (questionType == 'TEXTNamesPlaces') {
    form.addTextItem()
      .setTitle(data[i][1])
      .setHelpText(data[i][2])
      .setValidation(valNamesPlaces)
      .setRequired(false);
  }
  else if (questionType == 'TEXT1stName') {
    form.addTextItem()
      .setTitle(data[i][1])
      .setHelpText(data[i][2])
      .setValidation(val1stName)
      .setRequired(false);
  }
}
The problem appears when I run the script and test the resulting form.
Both validation types are imported just fine (as can be seen in the form's edit mode), but when I test the form in preview mode I get an error, as if the regex weren't matching (sorry, the error message is in Portuguese; I forgot to translate it as I did with the code above):
A screenshot of the form in edit mode
A screenshot of the form in preview mode
However, if I manually remove the slashes ("/.../") from this regex, it starts working!
A screenshot of the form in edit mode, regex without slashes
A screenshot of the form in preview mode, regex without slashes
What am I doing wrong? I'm no professional dev, but in my understanding it makes no sense to write a regex without slashes.
If this is how Google Forms reads regexes, I still need all of this to be read by the Apps Script that creates the form. If I try to pass the regex without the slashes there, the script cannot parse it.
const reg1stName = ^[A-Z]([a-z\'])+
const regNamesPlaces = ^[^\s]([A-Z]|[a-z]|\b[\'\- ])+[^\s]$
//Can't even be saved. Returns: SyntaxError: Unexpected token '^' (line 29, file "Code.gs")
Entering all the validations manually is not an option. Can anybody help me?
Thanks so much

This
/^[A-Z]([a-z\'\-])+/
will not work because the pattern is read as plain text, so the parser tries to match your / delimiters as literal characters.
This
^[A-Z]([a-z\'\-])+
also will not work, because if the name is hyphenated, you will only match up to the hyphen. This will match the 'Some-' in 'Some-Name', for example. Perhaps you also want a name like 'Saint John' to pass?
I recommend the following :)
^[A-Z][a-z]*[-\.' ]?[A-Z]?[a-z]*
^ anchors to the start of the string
[A-Z] matches exactly 1 capital letter
[a-z]* matches zero or more lowercase letters (this enables you to match a name like D'Urso)
[-\.' ]? matches zero or 1 instances of - (hyphen), . (period), ' (apostrophe) or a single space (inside a character class the period is already literal, so the backslash is optional but harmless)
[A-Z]? matches zero or 1 capital letter (in case there's a second capital in the name, like D'Urso, St John, Saint-Germaine)
[a-z]* matches the remaining lowercase letters, if any
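To make the slash issue concrete, here is a minimal sketch in plain JavaScript. It shows that converting a regex literal to text keeps the surrounding slashes, while its source property does not, and it tries the recommended pattern on a few names. It assumes that the form validation expects the pattern as text without delimiters (which is what the question's manual test suggests). The trailing $ in namePattern is an addition made here so that test() has to cover the whole input.

```javascript
// Regex literal as written in the question.
const reg1stName = /^[A-Z]([a-z\'\-])+/;

// Converting the literal to text keeps the slashes; .source drops them.
console.log(String(reg1stName));
console.log(reg1stName.source);

// In Apps Script one could then pass the slash-free text, e.g.
// .requireTextMatchesPattern(reg1stName.source)  (assumption: pattern passed as text)

// Recommended pattern from the answer, stored as plain text.
// The final "$" is an assumption added for whole-input checking.
const namePattern = "^[A-Z][a-z]*[-\\.' ]?[A-Z]?[a-z]*$";
const nameRegex = new RegExp(namePattern);

for (const name of ["Paul", "D'Urso", "Saint-Germaine", "St John", "paul", "Jean  Paul"]) {
  console.log(JSON.stringify(name), nameRegex.test(name));
}
```

Since the pattern allows only one separator, names with two or more parts after the first (for example a first name plus a hyphenated surname) would need a repeated group instead.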

Problem of HTML recovery between brackets with regex

I come to you today because of a small problem that has been blocking me for quite some time.
To automate a digitized management contract, I use custom variables in an HTML string produced by a WYSIWYG editor, written in the form "[[[NAME OF VARIABLE]]]". I then retrieve these variables and turn them into text fields so that the user can fill them in via a form.
I am currently using Angular 5.
My concern is that the editor has mangled one of these variables, which comes back like this when retrieved in JS:
'[[[blablabla</span><span style="background-color: rgb(255, 255, 255);">]]]',
So, when assigning the values submitted by the user, it is not taken into account. What I intended to do was write a regex that replaces the HTML code with an empty string, but that is where I'm stuck. After some testing I got this far:
/\[\[\[(<[^>]+>)\]\]\]/gi
but I confess that regex and I don't get along :D
Can anyone help me get unstuck? If you need more detail, no problem!
Thank you in advance!
(Sorry for my English, I'm French.)

You may match all substrings between [[[ and ]]] and then only replace substrings matching the <[^<>]+> pattern inside those matches:
s = s.replace(/\[\[\[[\s\S]*?]]]/g, function (m) {
  return m.replace(/<[^<>]+>/g, '');
});
The first pattern matches
\[\[\[ - a [[[ substring
[\s\S]*? - any 0+ chars, as few as possible
]]] - a ]]] substring.
The g modifier finds all occurrences of the pattern in the input string. Each match is passed to the callback (m is the matched text), and .replace(/<[^<>]+>/g, '') is applied only to m: every substring made of <, then 1+ chars other than < and >, then > is replaced with an empty string, and the cleaned text is returned.
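The nested replace can be tried on the string from the question. In this sketch the function name stripTagsInsidePlaceholders is hypothetical; the two patterns are the ones from the answer.

```javascript
// Hypothetical helper: removes HTML tags only inside [[[ ... ]]] placeholders.
function stripTagsInsidePlaceholders(html) {
  return html.replace(/\[\[\[[\s\S]*?]]]/g, (m) => m.replace(/<[^<>]+>/g, ""));
}

// Mangled variable from the question.
const raw = '[[[blablabla</span><span style="background-color: rgb(255, 255, 255);">]]]';
console.log(stripTagsInsidePlaceholders(raw)); // [[[blablabla]]]

// Tags outside the placeholders are left untouched.
console.log(stripTagsInsidePlaceholders("<p>[[[NAME]]]</p>")); // <p>[[[NAME]]]</p>
```

Note that removing tags this way can leave the surrounding markup unbalanced (an opening or closing span may now be missing its partner), so the result is best used for extracting variable names rather than for re-rendering.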

Regex get all quoted words that are not also single quoted

Would it be possible to get all quoted text with a single regex?
Example text from regexr:
Edit the "Expression" & Text to see matches. Roll over "matches" or the expression for details.
Undo mistakes with ctrl-z.
Save 'Favorites & "Share" expressions' with friends or the Community. "Explore" your results with Tools. A full Reference & Help is available in the Library, or watch the video Tutorial.
In this case I would like to capture Expression, matches and Explore but not Share since 'Favorites & "Share" expressions' is single quoted.

You can't build a regex that matches only the parts you want in JavaScript. However, you can build a pattern that matches the whole string without gaps and use a capture group to extract the part you want:
/[^"']*(?:'[^']*'[^"']*)*"([^"]*)"/g
#^----------------------^ all that isn't content between double quotes
Since your string may end with something like abcd 'efgh "ijkl" mnop' qrst (in short, without the part you want but with a double-quoted part inside a single-quoted substring), it's safer to change the pattern to:
/[^"']*(?:'[^']*(?:'[^"']*|$))*(?:"([^"]*)"|$)/g
and to discard the last match.
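A minimal sketch of how the second pattern might be applied, using a shortened version of the example text. The helper name doubleQuotedOutsideSingles is hypothetical. Instead of dropping the last match by position, this version skips any match whose capture group is undefined, which is an assumption made here to cover the trailing matches produced by the $ alternative.

```javascript
// Safer pattern from the answer above.
const quotePattern = /[^"']*(?:'[^']*(?:'[^"']*|$))*(?:"([^"]*)"|$)/g;

// Hypothetical helper: collects double-quoted text that is not inside single quotes.
function doubleQuotedOutsideSingles(input) {
  const found = [];
  for (const match of input.matchAll(quotePattern)) {
    // Matches that ended on "$" have no captured group; skip them.
    if (match[1] !== undefined) found.push(match[1]);
  }
  return found;
}

const sampleText =
  'Edit the "Expression" & Text to see matches. Roll over "matches" or the expression for details. ' +
  "Save 'Favorites & \"Share\" expressions' with friends or the Community. " +
  '"Explore" your results with Tools.';

console.log(doubleQuotedOutsideSingles(sampleText));
console.log(doubleQuotedOutsideSingles("abcd 'efgh \"ijkl\" mnop' qrst"));
```

String.prototype.matchAll requires a reasonably recent JavaScript engine; an exec() loop works the same way on older ones.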

Without special regex pattern:
var mystr = "Edit the \"Expression\" & Text to see matches. Roll over \"matches\" or the expression for details. Undo mistakes with ctrl-z. Save 'Favorites & \"Share\" expressions' with friends or the Community. \"Explore\" your results with Tools. A full Reference & Help is available in the Library, or watch the video Tutorial."
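var myarr = mystr.split(/\"/g);
var opening = false;
for (var i = 1; i < myarr.length; i = i + 2) {
  // An odd number of ' in the preceding segment toggles the single-quoted state.
  if ((myarr[i-1].length - myarr[i-1].replace(/'/g, "").length) % 2 === 1) { opening = !opening; }
  if (!opening) { console.log(myarr[i]); }
}
How it works:
split the text by "
each odd index holds a string that was wrapped in "
if the segment before that index contains an odd number of ', the single-quoted state flips; while inside single quotes, the item is wrapped by ' and should not be considered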

RegEx match only final domain name from any email address

I want to match only parent domain name from an email address, which might or might not have a subdomain.
So far I have tried this:
new RegExp(/.+#(:?.+\..+)/);
The results:
Input: abc#subdomain.maindomain.com
Output: ["abc#subdomain.maindomain.com", "subdomain.maindomain.com"]
Input: abc#maindomain.com
Output: ["abc#maindomain.com", "maindomain.com"]
I am interested in the second match (the group).
My objective is that in both cases, I want the group to match and give me only maindomain.com
Note: before the down vote, please note that neither have I been able to use existing answers, nor the question matches existing ones.

One simple regex you can use to get only the last 2 parts of the domain name is
/[^.]+\.[^.]+$/
It matches a sequence of non-period characters, followed by a period and another sequence of non-periods, all at the end of the string. This regex doesn't ensure that the domain name comes after a "#". If you want a regex that also does that, you could use lazy matching with "*?":
/#.*?([^.]+\.[^.]+)$/
However, I think that trying to do everything at once tends to make regexes more complicated and harder to read. For this problem I would prefer to do things in two steps: first check that the email has a "#" in it, then take the part after the "#" and pass it to the simple regex, which will extract the domain name.
One advantage of separating things is that some changes are easier. For example, if you want to make sure that your email has only a single "#" in it, that is very easy to do in a separate step but would be tricky to achieve in the "do everything" regex.
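The two-step approach could look roughly like the sketch below. The function name parentDomain is hypothetical, and the separator is written as "#" only because that is how the examples in this thread are written; swap in whatever separator your addresses actually use.

```javascript
// Hypothetical helper: returns the last two dot-separated labels of the
// domain part, or null when the input does not have exactly one separator.
function parentDomain(address) {
  // Step 1: split on the separator and require exactly one occurrence.
  const parts = address.split("#");
  if (parts.length !== 2) return null;

  // Step 2: apply the simple end-anchored regex to the domain part only.
  const match = parts[1].match(/[^.]+\.[^.]+$/);
  return match ? match[0] : null;
}

console.log(parentDomain("abc#subdomain.maindomain.com")); // maindomain.com
console.log(parentDomain("abc#maindomain.com"));           // maindomain.com
console.log(parentDomain("abc#one#two.com"));              // null
```

Keeping the separator check outside the regex makes it easy to add further rules (for instance a non-empty local part) without touching the pattern.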

You can use this regex:
/#(?:[^.\s]+\.)*([^.\s]+\.[^.\s]+)$/gm
Use captured group #1 for your result.
It matches # followed by 0 or more instances of non-dot text and a dot, i.e. (?:[^.\s]+\.)*.
Then ([^.\s]+\.[^.\s]+)$ matches and captures the last 2 components separated by a dot.
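Because the pattern carries the g and m flags, it can process several addresses at once, one per line. With the g flag, match() would return only the full matches, so this sketch uses matchAll() to reach capture group 1. The helper name parentDomains is hypothetical.

```javascript
// Pattern from the answer above (global + multiline).
const domainPattern = /#(?:[^.\s]+\.)*([^.\s]+\.[^.\s]+)$/gm;

// Hypothetical helper: returns capture group 1 for every line that matches.
function parentDomains(text) {
  return Array.from(text.matchAll(domainPattern), (match) => match[1]);
}

const lines = "abc#subdomain.maindomain.com\nabc#maindomain.com";
console.log(parentDomains(lines));
```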

With the following, maindomain should always hold the maindomain.com part of the string.
var pattern = new RegExp(/(?:[\.#])(\w[\w-]*\w\.\w*)$/);
var str = "abc#subdomain.maindomain.com";
var maindomain = str.match(pattern)[1];
http://codepen.io/anon/pen/RRvWkr
EDIT: tweaked to disallow domains starting with a hyphen i.e - '-yahoo.com'

Do not allow special characters except the allowed characters javascript regex

I have the following javascript code for my password strength indicator:
if (password.match(/([!,#,#,$,%])/))
{
strength += 2
}
What this does is: if the password contains one of these allowed characters (!,#,#,$,%), it adds a value to the strength indicator.
My problem is that I also want to decrease the password strength indicator when other special characters are present in the password, for example: ^,`,~,<,>
To avoid confusion: basically I don't want any special characters other than the ones listed above (!,#,#,$,%). So I hard-coded it, writing out all the special characters that I don't want.
I tried using this:
if (password.match(/([^,`,~,<,>])/))
{
strength -= 2
}
I also want to disallow ", ' and the comma, but if I add them to my if condition, I get an error saying there is a syntax error in the regular expression. I understand this because I know " starts a string which must be closed. Can I do something about it? Thanks in advance!

You don't need to separate your individual characters with commas, nor do you need to wrap the only term in parentheses.
This should work:
/[`^~<>,"']/
Note that the caret (^) is not at the front; it has a special meaning (negation) when placed at the start of the [] block.
Also, you should use test() because you only want a boolean "contains" result:
/[`^~<>,"']/.test(password)
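Putting the two checks together, a strength adjustment might be sketched as follows. The function name specialCharScore and the starting value of 0 are assumptions for illustration; the allowed set is the one listed in the question, written without the commas.

```javascript
// Allowed specials as listed in the question (commas dropped).
const allowedSpecials = /[!#$%]/;
// Disallowed specials from the answer above; the caret is not first,
// so it is a literal character rather than a negation.
const disallowedSpecials = /[`^~<>,"']/;

// Hypothetical helper: returns the adjustment contributed by special characters.
function specialCharScore(password) {
  let strength = 0;
  if (allowedSpecials.test(password)) strength += 2;
  if (disallowedSpecials.test(password)) strength -= 2;
  return strength;
}

console.log(specialCharScore("secret!"));   // 2
console.log(specialCharScore("secret^"));   // -2
console.log(specialCharScore("sec'ret!"));  // 0
```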

What you want to do is escape each of ", ', and , using a \. The regex you're looking for is:
/([\^\`\~\<\,\>\"\'])/
I actually generated that using the JSVerbalExpressions library. I highly recommend you check it out! To show you how awesome it is, the code to generate the above regex is:
var tester = VerEx()
.anyOf("^,`'\"~<>");
console.log(tester); // /([\^\`\~\<\,\>\"\'])/

Include these special characters in square brackets without commas and see if it works.
You can try it out here - http://jsfiddle.net/BCn7h/
E.g.:
if (password.match(/["',]/))
{
strength -= 2
}
