How to match one, but not two characters using regular expressions - javascript

Using javascript regular expressions, how do you match one character while ignoring any other characters that also match?
Example 1: I want to match $, but not $$ or $$$.
Example 2: I want to match $$, but not $$$.
A typical string that is being tested is, "$ $$ $$$ asian italian"
From a user experience perspective, the user selects, or deselects, a checkbox whose value matches tags found in a list of items. All the tags must be matched (checked) for the item to show.
function filterResults(){
// Make an array of the checked inputs
var aInputs = $('.listings-inputs input:checked').toArray();
// alert(aInputs);
// Turn that array into a new array made from each items value.
var aValues = $.map(aInputs, function(i){
// alert($(i).val());
return $(i).val();
});
// alert(aValues);
// Create new variable, set the value to the joined array set to lower case.
// Use this variable as the string to test
var sValues = aValues.join(' ').toLowerCase();
// alert(sValues);
// sValues = sValues.replace(/\$/ig,'\\$');
// alert(sValues);
// this examines each the '.tags' of each item
$('.listings .tags').each(function(){
var sTags = $(this).text();
// alert(sTags);
sSplitTags = sTags.split(' \267 '); // JavaScript uses octal encoding for special characters
// alert(sSplitTags);
// sSplitTags = sTags.split(' \u00B7 '); // This also works
var show = true;
$.each(sSplitTags, function(i,tag){
if(tag.charAt(0) == '$'){
// alert(tag);
// alert('It begins with a $');
// You have to escape special characters for the RegEx
tag = tag.replace(/\$/ig,'\\$');
// alert(tag);
}
tag = '\\b' + tag + '\\b';
var re = new RegExp(tag,'i');
if(!(re.test(sValues))){
alert(tag);
show = false;
alert('no match');
return false;
}
else{
alert(tag);
show = true;
alert('match');
}
});
if(show == false){
$(this).parent().hide();
}
else{
$(this).parent().show();
}
});
// call the swizzleRows function in the listings.js
swizzleList();
}
Thanks in advance!

Normally, with regex, you can use (?<!x)x(?!x) to match an x that is neither preceded nor followed by another x.
With the modern ECMAScript 2018+ compliant JS engines, you may use lookbehind based regex:
(?<!\$)\$(?!\$)
See the JS demo (run it only in browsers that support lookbehind; their number is growing):
const str ="$ $$ $$$ asian italian";
const regex = /(?<!\$)\$(?!\$)/g;
console.log( str.match(regex).length ); // Count the single $ occurrences
console.log( str.replace(regex, '<span>$&</span>') ); // Enclose single $ occurrences with tags
console.log( str.split(regex) ); // Split with single $ occurrences
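The same lookaround idea extends to Example 2 from the question ($$ but not $$$). Below is a minimal sketch, assuming an engine with lookbehind support; hasExactDollarRun is a hypothetical helper name, not part of the original answer.

```javascript
// Hypothetical helper: returns true when `text` contains a run of
// exactly `count` dollar signs (no $ directly before or after the run).
function hasExactDollarRun(text, count) {
  const run = '\\$'.repeat(count); // e.g. \$\$ for count = 2
  const re = new RegExp('(?<!\\$)' + run + '(?!\\$)');
  return re.test(text);
}

console.log(hasExactDollarRun('$ $$ $$$ asian italian', 2)); // true
console.log(hasExactDollarRun('$ $$$ asian', 2));            // false
```

Building the pattern from a count keeps one function for $, $$ and $$$ instead of three hand-written expressions.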

\bx\b
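Explanation: Matches x between two word boundaries (for more on word boundaries, look at this tutorial). \b also matches at the start or end of the string when the adjacent character is a word character. Note that \b only helps when x is itself a word character (a letter, digit, or underscore). For a non-word character such as $, there is no word boundary between a space and $, so \b\$\b does not match a standalone $.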
I'm taking advantage of the space delimiting in your question. If that is not there, then you will need a more complex expression like (^x$|^x[^x]|[^x]x[^x]|[^x]x$) to match different positions possibly at the start and/or end of the string. This would limit it to single character matching, whereas the first pattern matches entire tokens.
The alternative is just to tokenize the string (split it at spaces) and construct an object from the tokens which you can just look up to see if a given string matched one of the tokens. This should be much faster per-lookup than regex.
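The tokenizing alternative could be sketched as follows. This is only an illustration under assumptions: buildTokenSet and itemMatches are hypothetical names, and the checked values and item tags are assumed to be available as plain arrays of strings.

```javascript
// Hypothetical helper: turn the checked checkbox values into a Set
// so each lookup is a constant-time membership test.
function buildTokenSet(checkedValues) {
  return new Set(checkedValues.map(v => v.toLowerCase()));
}

// Hypothetical helper: an item is shown only if every one of its tags is checked.
function itemMatches(itemTags, tokenSet) {
  return itemTags.every(tag => tokenSet.has(tag.toLowerCase()));
}

const checked = buildTokenSet(['$', '$$$', 'Asian']);
console.log(itemMatches(['$', 'asian'], checked));  // true
console.log(itemMatches(['$$', 'asian'], checked)); // false
```

Because whole tokens are compared, "$" never matches inside "$$" or "$$$", and no escaping of regex metacharacters is needed.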

Something like this (Python):
>>> import re
>>> q = re.match(r"""(x{2})($|[^x])""", 'xx')
>>> q.groups()
('xx', '')
>>> q = re.match(r"""(x{2})($|[^x])""", 'xxx')
>>> q is None
True

Related

Javascript remove all characters by regex rules

Who can help me with the following?
I created a rule with a regex and I want to remove all characters from the string that are not allowed.
I tried something myself, but I don't get the result I want:
document.getElementById('item_price').onkeydown = function() {
var regex = /^(\d+[,]+\d{2})$/;
if (regex.test(this.value) == false ) {
this.value = this.value.replace(regex, "");
}
}
The allowed characters are digits and one comma.
Remove all letters, special characters, and double commas.
If the user types k12.40, the code must turn this string into 1240.
Who can point me in the right direction?
This completely removes double occurrences of commas using regex, but keeps single ones.
// This should end up as 1,23243,09
let test = 'k1,23.2,,43d,0.9';
let replaced = test.replace(/[^\d,]|,{2}/g, ''); // inside [...] the characters ( | ) are literals, so they are left out
console.log(replaced);
I don't believe there's an easy way to have a single Regex behave like you want. You can use a function to determine what to replace each character with, though:
// This should end up as 12,324309 - allows one comma and any digits
let test = 'k12,3.2,,43,d0.9';
let foundComma = false;
let replaced = test.replace(/(,,)|[^\d]/g, function (item) {
if (item === ',' && !foundComma) {
foundComma = true;
return ',';
} else {
return '';
}
})
console.log(replaced);
This runs the callback for each non-digit character and each double comma. If the match is a single comma and it is the first one seen in the string, it is kept. Otherwise the match is either another comma or some other non-digit, and it is removed. Double commas are also replaced with nothing, even if they are the first commas in the string; if you want them to collapse to a single comma instead, remove the (,,) from the regex.
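A two-step variant without a stateful callback is sketched below. It is an alternative technique, not the answer's method: sanitizePrice is a hypothetical name, and unlike the snippet above it keeps the first comma even when that comma is part of a double comma.

```javascript
// Hypothetical helper: keep digits and at most one comma (the first one).
function sanitizePrice(input) {
  // Step 1: drop everything that is neither a digit nor a comma.
  const cleaned = input.replace(/[^\d,]/g, '');
  const firstComma = cleaned.indexOf(',');
  if (firstComma === -1) return cleaned;
  // Step 2: keep the text up to and including the first comma,
  // then strip every later comma.
  return cleaned.slice(0, firstComma + 1) +
         cleaned.slice(firstComma + 1).replace(/,/g, '');
}

console.log(sanitizePrice('k12.40'));            // "1240"
console.log(sanitizePrice('k12,3.2,,43,d0.9'));  // "12,324309"
```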

Regex match cookie value and remove hyphens

I'm trying to extract out a group of words from a larger string/cookie that are separated by hyphens. I would like to replace the hyphens with a space and set to a variable. Javascript or jQuery.
As an example, the larger string has a name and value like this within it:
facility=34222%7CConner-Department-Store;
(notice the leading "C")
So first, I need to match()/find facility=34222%7CConner-Department-Store; with regex. Then break it down to "Conner Department Store"
var cookie = document.cookie;
var facilityValue = cookie.match( REGEX ); ??
var test = "store=874635%7Csomethingelse;facility=34222%7CConner-Department-Store;store=874635%7Csomethingelse;";
var test2 = test.replace(/^(.*)facility=([^;]+)(.*)$/, function(matchedString, match1, match2, match3){
return decodeURIComponent(match2);
});
console.log( test2 );
console.log( test2.split('|')[1].replace(/[-]/g, ' ') );
If I understood it correctly, you want to make a phrase by getting all the words between hyphens and disallowing two successive Uppercase letters in a word, so I'd prefer using Regex in that case.
This is a Regex solution, that works dynamically with any cookies in the same format and extract the wanted sentence from it:
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Demo:
var str = "facility=34222%7CConner-Department-Store;";
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Explanation:
Use the regex /([A-Z][a-z]+)-?/g to match the words separated by -.
Remove any - occurrence from the matched words.
Then join the array of matches with a space.
Ok,
first, you should decode this string as follows:
var str = "facility=34222%7CConner-Department-Store;"
var decoded = decodeURIComponent(str);
// decoded = "facility=34222|Conner-Department-Store;"
Then you have multiple possibilities to split up this string.
The easiest way is to use substring()
var solution1 = decoded.substring(decoded.indexOf('|') + 1, decoded.length)
// solution1 = "Conner-Department-Store;"
solution1 = solution1.replace(/-/g, ' '); // a string pattern would replace only the first hyphen
// solution1 = "Conner Department Store;"
As you can see, substring(arg1, arg2) returns the part of the string starting at index arg1 and ending just before index arg2. See the full documentation for details.
If you want to cut the last ; just set decoded.length - 1 as arg2 in the snippet above.
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1)
//returns "Conner-Department-Store"
or all of the above in just one line:
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1).replace(/-/g, ' ')
If you want still to use a regular Expression to retrieve (perhaps more) data out of the string, you could use something similar to this snippet:
var solution2 = "";
var regEx= /([A-Za-z]*)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/;
if (regEx.test(decoded)) {
solution2 = decoded.match(regEx);
/* returns
[0:"facility=34222|Conner-Department-Store",
1:"facility",
2:"34222",
3:"Conner-Department-Store",
index:0,
input:"facility=34222|Conner-Department-Store;"
length:4] */
solution2 = solution2[3].replace(/-/g, ' ');
// "Conner Department Store"
}
I have applied some rules for the regex to work; feel free to modify them according to your needs.
facility can be any Word built with alphabetical characters lower and uppercase (no other chars) at any length
= needs to be the char =
34222 can be any number but no other characters
| needs to be the char |
Conner-Department-Store can be any characters except one of the following (reserved delimiters): :/?#[]#;,'
Hope this helps :)
edit: to find only the part
facility=34222%7CConner-Department-Store; just modify the regex to
match facility= instead of ([A-Za-z]*)=:
/(facility)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/
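Putting the steps together (find the facility entry, decode it, take the part after the pipe, replace the hyphens), a minimal sketch could look like this. The function name facilityNameFromCookie is hypothetical, and the cookie string is assumed to be in the name=value; format shown in the question.

```javascript
// Hypothetical helper: extracts "Conner Department Store" from a cookie string.
function facilityNameFromCookie(cookieString) {
  // Find the facility entry at the start or after a ";" separator.
  const match = /(?:^|;\s*)facility=([^;]*)/.exec(cookieString);
  if (!match) return null;
  const decoded = decodeURIComponent(match[1]); // %7C becomes |
  const pipe = decoded.indexOf('|');
  const name = pipe === -1 ? decoded : decoded.slice(pipe + 1);
  return name.replace(/-/g, ' '); // the g flag replaces every hyphen
}

const sample = 'store=874635%7Csomethingelse;facility=34222%7CConner-Department-Store;';
console.log(facilityNameFromCookie(sample)); // "Conner Department Store"
```

In a browser, the same function could be called with document.cookie as its argument.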
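You can use cookies.js, a small cookie helper published on MDN (Mozilla Developer Network).
Simply include the cookies.js file in your application, and read the cookie by its name:
docCookies.getItem("facility");
This returns the cookie's value; you still need to take the part after the | and replace the hyphens.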

What's the JS RegExp for this specific string?

I have a rather isolated situation in an inventory management program where our shelf locations have a specific format, which is always Letter: Number-Letter-Number, such as Y: 1-E-4. Most of us coworkers just type in "y1e4" and are done with it, but that obviously creates issues with inconsistent formats in a database. Are JS RegExp's the ideal way to automatically detect and format these alphanumeric strings? I'm slowly wrapping my head around JavaScript's Perl syntax, but what's a simple example of formatting one of these strings?
spec: detect string format of either "W: D-W-D" or "WDWD" and return "W: D-W-D"
This function accepts either format. It returns the formatted string if the input matches and undefined if it doesn't.
function validateInventoryCode(input) {
var regexp = /^([a-zA-Z]+)(?:\:\s*)?(\d+)-?(\w+)-?(\d+)$/
var r = regexp.exec(input);
if(r != null) {
return `${r[1]}: ${r[2]}-${r[3]}-${r[4]}`;
}
}
var possibles = ["y1e1", "y:1e1", "Y: 1r3", "y: 32e4", "1:e3e"];
possibles.forEach(function(possibility) {
console.log(`input(${possibility}), result(${validateInventoryCode(possibility)})`);
})
I understand the question as "convert LetterNumberLetterNumber to Letter: Number-Letter-Number".
You may use
/^([a-z])(\d+)([a-z])(\d+)$/i
and replace with $1: $2-$3-$4
Details:
^ - start of string
([a-z]) - Group 1 (referenced with $1 from the replacement pattern) capturing any ASCII letter (as /i makes the pattern case-insensitive)
(\d+) - Group 2 capturing 1 or more digits
([a-z]) - Group 3, a letter
(\d+) - Group 4, a number (1 or more digits)
$ - end of string.
See the regex demo.
var re = /^([a-z])(\d+)([a-z])(\d+)$/i;
var s = 'y1e2';
var result = s.replace(re, '$1: $2-$3-$4');
console.log(result);
OR - if the letters must be turned to upper case:
var re = /^([a-z])(\d+)([a-z])(\d+)$/i;
var s = 'y1e2';
var result = s.replace(re,
(m,g1,g2,g3,g4)=>`${g1.toUpperCase()}: ${g2}-${g3.toUpperCase()}-${g4}`
);
console.log(result);
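To cover the stated spec (accept either "W: D-W-D" or "WDWD" and always return "W: D-W-D"), the two ideas above can be combined. This is a sketch under assumptions: normalizeShelfLocation is a hypothetical name, single letters and one or more digits are assumed, and optional whitespace is tolerated around the separators.

```javascript
// Hypothetical helper: normalizes "y1e4" or "Y: 1-E-4" to "Y: 1-E-4".
// Returns null when the input matches neither format.
function normalizeShelfLocation(input) {
  const re = /^\s*([a-z])\s*:?\s*(\d+)\s*-?\s*([a-z])\s*-?\s*(\d+)\s*$/i;
  const m = re.exec(input);
  if (!m) return null;
  return `${m[1].toUpperCase()}: ${m[2]}-${m[3].toUpperCase()}-${m[4]}`;
}

console.log(normalizeShelfLocation('y1e4'));     // "Y: 1-E-4"
console.log(normalizeShelfLocation('Y: 1-E-4')); // "Y: 1-E-4"
console.log(normalizeShelfLocation('1:e3e'));    // null
```

Running already-formatted values through the same function makes it safe to apply on every save, not only on fresh input.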
This function matches and replaces the pattern:
function findAndFormat(text){
var splittedText=text.split(' ');
for(var i=0, textLength=splittedText.length; i<textLength; i++){
var analyzed=splittedText[i].match(/[A-Za-z]\d[A-Za-z]\d$/); // [A-z] would also match [ \ ] ^ _ and the backtick
if(analyzed){
var formattedString=analyzed[0][0].toUpperCase()+': '+analyzed[0][1]+'-'+analyzed[0][2].toUpperCase()+'-'+analyzed[0][3];
text=text.replace(splittedText[i],formattedString);
}
}
return text;
}
I think it's just as it reads:
y1e4
Letter, number, letter, number:
/([A-Za-z][0-9][A-Za-z][0-9])/g
And yes, it's fine to use regex in this case, as for form validation and the like. There are just some cases where overusing regular expressions hurts performance (intensive data processing, for example).
Example
"HelloY1E4world".replace(/([A-Za-z][0-9][A-Za-z][0-9])/g, ' ');
should return: "Hello world"
regxr.com always comes in handy

Extract string when preceding number or combo of preceding characters is unknown

Here's an example string:
++++#foo+bar+baz++#yikes
I need to extract foo and only foo from there or a similar scenario.
The + and the # are the only characters I need to worry about.
However, regardless of what precedes foo, it needs to be stripped or ignored. Everything else after it needs to as well.
try this:
/\++#(\w+)/
and catch the capturing group one.
You can simply use the match() method.
var str = "++++#foo+bar+baz++#yikes";
var res = str.match(/\w+/g);
console.log(res[0]); // foo
console.log(res); // foo,bar,baz,yikes
Or use exec
var str = "++++#foo+bar+baz++#yikes";
var match = /(\w+)/.exec(str);
alert(match[1]); // foo
Using exec with a g modifier (global) is meant to be used in a loop getting all sub matches.
var str = "++++#foo+bar+baz++#yikes";
var re = /\w+/g;
var match;
while (match = re.exec(str)) {
// In array form, match is now your next match..
}
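As an illustration of what the loop body might do, the sketch below collects every match together with its position. The found array is an assumption added for the example; the regex and the input string are the ones used above.

```javascript
const str = '++++#foo+bar+baz++#yikes';
const re = /\w+/g;
const found = [];
let match;
// With the g flag, each exec call resumes from re.lastIndex,
// and returns null once there are no more matches.
while ((match = re.exec(str)) !== null) {
  found.push({ text: match[0], index: match.index });
}
console.log(found.map(f => f.text)); // [ 'foo', 'bar', 'baz', 'yikes' ]
```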
How exactly do + and # play a role in identifying foo? If you just want any string that follows # and is terminated by + that's as simple as:
var foostring = '++++#foo+bar+baz++#yikes';
var matches = /#([^+]+)\+/.exec(foostring);
if (matches && matches.length > 1) { // exec returns null when nothing matches
// matches[0] is the full match; the capture groups are in elements 1 .. length - 1
alert('found ' + matches[1] + '!'); // alerts 'found foo!'
}
To help you more specifically, please provide information about the possible variations of your data and how you would go about identifying the token you want to extract even in cases of differing lengths and characters.
If you are just looking for the first segment of text preceded and followed by any combination of + and #, then use:
var foostring = '++++#foo+bar+baz++#yikes';
var result = foostring.match(/[^+#]+/);
// will be the single-element array, ['foo'], or null.
Depending on your data, using \w may be too restrictive, as it is equivalent to [a-zA-Z0-9_]. Does your data have anything else, such as punctuation, dashes, parentheses, or any other characters that you do want to include in the match? The negated character class I suggest will catch every token that does not contain a + or a #.
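Wrapped as a function with a null check, the negated-character-class approach could look like the sketch below. firstToken is a hypothetical name; the second call shows a token with punctuation that \w+ would cut short.

```javascript
// Hypothetical helper: returns the first run of characters
// that contains neither + nor #, or null if there is none.
function firstToken(input) {
  const m = input.match(/[^+#]+/);
  return m ? m[0] : null;
}

console.log(firstToken('++++#foo+bar+baz++#yikes')); // "foo"
console.log(firstToken('++#foo-bar(1)+baz'));        // "foo-bar(1)"
console.log(firstToken('+++#'));                     // null
```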

How to match with javascript and regex?

I have the following HTML:
<span id="UnitCost5">$3,079.95 to $3,479.95</span>
And I want to use JavaScript and regex to get all number matches.
So I want my script function to return: 3,079.95 AND 3,479.95
Note that the text may differ, so I need the solution to be as generic as possible; it may look like this:
<span id="UnitCost5">$3,079.95 And Price $3,479.95</span>
All the numbers would be matched by:
\.?\d[\d.,]*
This assumes the numbers you look for can start with a decimal dot. If they cannot, this would work (and maybe produce less false positives):
\d[\d.,]*
Be aware that different local customs exist in number formatting.
I assume that you use appropriate means to get hold of the text value of the HTML nodes you wish to process, and that HTML parsing is not part of the exercise.
You don't want to capture all numbers, otherwise you would get the 5 in the id, too. I would guess, what you're looking for is numbers looking like this: $#,###.##
Here goes the expression for that:
/\$[0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?/
\$ The dollar sign
[0-9]{1,3} One to three digits
(,[0-9]{3})* [Optional]: Digit triplets, preceded by a comma
(\.[0-9]+)? [Optional]: Even more digits, preceded by a period
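A short usage sketch for this expression follows. It adds the g flag to find every price and converts the matches to numbers; parseDollarAmounts is a hypothetical name, and US-style formatting (comma for thousands, period for decimals) is assumed.

```javascript
// Hypothetical helper: finds all $#,###.## amounts and returns them as numbers.
function parseDollarAmounts(text) {
  const re = /\$[0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?/g;
  const matches = text.match(re) || []; // match returns null when nothing is found
  // Strip the dollar sign and thousands separators before converting.
  return matches.map(s => Number(s.replace(/[$,]/g, '')));
}

const html = '<span id="UnitCost5">$3,079.95 to $3,479.95</span>';
console.log(parseDollarAmounts(html)); // [ 3079.95, 3479.95 ]
```

Because the pattern requires a leading $, the 5 in the id attribute is not picked up.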
/(?:\d{1,3},)*\d{1,3}(?:\.\d+)?/g;
Let's break that into parts for explanations:
(?:\d{1,3},)* - Match any numbers separated by a thousand-divider
\d{1,3} - Match the numbers before the decimal point
(?:\.\d+)? - Optionally match a decimal point followed by one or more decimal digits
Flag 'g' - Make a global search to find all matches in the string
You can use it like this:
var regex = /(?:\d{1,3},)*\d{1,3}(?:\.\d+)?/g;
var numbers = "$3,079.95 And Price $3,479.95".match(regex);
// numbers[0] = 3,079.95
// numbers[1] = 3,479.95
A very simple solution is the following one. Note that it will also match some invalid number strings like $..44,.777.
\$[0-9,.]+
(function () {
var reg = /\$([\d\.,]+)\s[\w\W]+\s\$([\d\.,]+)$/;
// this function used to clean inner html
function trim(str) {
var str = str.replace(/^\s\s*/, ''),
ws = /\s/,
i = str.length;
while (ws.test(str.charAt(--i)));
return str.slice(0, i + 1);
}
function getNumbersFromElement(elementId) {
var el = document.getElementById(elementId),
text = trim(el.innerHTML),
firstPrice,
secondPrice,
result;
result = reg.exec(text);
if (result && result[1] && result[2]) { // exec returns null when nothing matches
// inside this block we have valid prices
firstPrice = result[1];
secondPrice = result[2];
// do whatever you need
return firstPrice + ' AND ' + secondPrice;
} else {
return null; // otherwise
}
}
// usage:
getNumbersFromElement('UnitCost5');
})();
The following will return an array of all prices found in the string
function getPrices(str) {
var reg = /\$([\d,.]+)/g;
var prices =[];
var price;
while((price = reg.exec(str))!=null) {
prices.push(price[1]); // keep only the captured amount, without the $
}
return prices;
}
edit: note that the regex itself may return some false positives
