Regex match cookie value and remove hyphens - javascript

I'm trying to extract a group of hyphen-separated words from a larger string (a cookie). I would like to replace the hyphens with spaces and store the result in a variable, using JavaScript or jQuery.
As an example, the larger string has a name and value like this within it:
facility=34222%7CConner-Department-Store;
(notice the leading "C")
So first, I need to match()/find facility=34222%7CConner-Department-Store; with regex. Then break it down to "Conner Department Store"
var cookie = document.cookie;
var facilityValue = cookie.match( REGEX ); ??

var test = "store=874635%7Csomethingelse;facility=34222%7CConner-Department-Store;store=874635%7Csomethingelse;";
var test2 = test.replace(/^(.*)facility=([^;]+)(.*)$/, function(matchedString, match1, match2, match3){
return decodeURIComponent(match2);
});
console.log( test2 );
console.log( test2.split('|')[1].replace(/[-]/g, ' ') );

If I understood it correctly, you want to make a phrase by getting all the words between hyphens and disallowing two successive Uppercase letters in a word, so I'd prefer using Regex in that case.
This is a regex solution that works with any cookie in the same format and extracts the wanted phrase from it:
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Demo:
var str = "facility=34222%7CConner-Department-Store;";
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Explanation:
Use the regex /([A-Z][a-z]+)-?/g to match the words between hyphens.
Remove any - occurrence from the matched words.
Then join the array of matches with a space.
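Because the pattern above runs over the whole string, capitalized words in other cookies would be collected as well. A minimal sketch that isolates the facility cookie first; the function name facilityName and the sample string are made up for illustration, and the value is assumed to have the form number%7CHyphenated-Name:

```javascript
// Hypothetical helper: pull the display name out of the "facility" cookie.
// Assumes the value looks like "<number>%7C<Hyphenated-Name>".
function facilityName(cookieString) {
  // Capture everything between "facility=" and the next ";" (or end of string).
  var match = /(?:^|;\s*)facility=([^;]*)/.exec(cookieString);
  if (!match) {
    return null; // no facility cookie present
  }
  // "%7C" decodes to "|"; the name is the part after it.
  var parts = decodeURIComponent(match[1]).split('|');
  if (parts.length < 2) {
    return null; // value is not in the assumed "id|name" form
  }
  return parts[1].replace(/-/g, ' ');
}

var sample = "store=874635%7CSomething-Else; facility=34222%7CConner-Department-Store; x=1";
console.log(facilityName(sample)); // "Conner Department Store"
```

In a browser, the same function could be called with document.cookie as its argument.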

Ok,
first, you should decode this string as follows:
var str = "facility=34222%7CConner-Department-Store;"
var decoded = decodeURIComponent(str);
// decoded = "facility=34222|Conner-Department-Store;"
Then you have multiple possibilities to split up this string.
The easiest way is to use substring()
var solution1 = decoded.substring(decoded.indexOf('|') + 1, decoded.length)
// solution1 = "Conner-Department-Store;"
solution1 = solution1.replace(/-/g, ' '); // the g flag replaces every hyphen, not just the first
// solution1 = "Conner Department Store;"
As you can see, substring(arg1, arg2) returns the part of the string that starts at index arg1 and ends just before index arg2.
If you want to cut the last ; just set decoded.length - 1 as arg2 in the snippet above.
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1)
//returns "Conner-Department-Store"
or all above in just one line:
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1).replace(/-/g, ' ')
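One detail to keep in mind with replace(): when the first argument is a string, only the first occurrence is replaced, so a regex with the g flag is needed to turn every hyphen into a space. A short sketch of the difference (the variable names are made up):

```javascript
var storeName = "Conner-Department-Store";

// String pattern: only the first hyphen is replaced.
var firstOnly = storeName.replace('-', ' ');

// Regex with the g flag: every hyphen is replaced.
var allHyphens = storeName.replace(/-/g, ' ');

console.log(firstOnly);  // "Conner Department-Store"
console.log(allHyphens); // "Conner Department Store"
```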
If you still want to use a regular expression to retrieve (perhaps more) data from the string, you could use something similar to this snippet:
var solution2 = "";
var regEx= /([A-Za-z]*)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/;
if (regEx.test(decoded)) {
solution2 = decoded.match(regEx);
/* returns
[0:"facility=34222|Conner-Department-Store",
1:"facility",
2:"34222",
3:"Conner-Department-Store",
index:0,
input:"facility=34222|Conner-Department-Store;"
length:4] */
solution2 = solution2[3].replace(/-/g, ' ');
// "Conner Department Store"
}
I have applied some rules for the regex to work; feel free to modify them according to your needs.
facility can be any word of any length built from lowercase and uppercase letters (no other characters)
= needs to be the char =
34222 can be any number but no other characters
| needs to be the char |
Conner-Department-Store can be any characters except one of the following (reserved delimiters): :/?#[]#;,'
Hope this helps :)
Edit: to find only the part
facility=34222%7CConner-Department-Store; just modify the regex to
match facility= instead of ([A-Za-z]*)=:
/(facility)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/

You can use cookies.js, a mini framework from MDN (Mozilla Developer Network).
Simply include the cookies.js file in your application, and write:
docCookies.getItem("facility"); // pass the cookie name; the decoded value is returned

Related

How to replace numbers with an empty char

I need to replace the phone number in a string with a newline (\n).
My string: Jhony Jhons,jhon#gmail.com,380967574366
I tried this:
var str = 'Jhony Jhons,jhon#gmail.com,380967574366'
var regex = /[0-9]/g;
var rec = str.trim().replace(regex, '\n').split(','); //Jhony Jhons,jhon#gmail.com,
The number is replaced with \n, but an extra comma is left after the e-mail address and I need to remove it.
Finally my string should look like this:
Jhony Jhons,jhon#gmail.com\n
You can try this:
var str = 'Jhony Jhons,jhon#gmail.com,380967574366';
var regex = /,[0-9]+/g;
str.replace(regex, '\n');
The snippet above may output what you want, i.e. Jhony Jhons,jhon#gmail.com\n
There are many ways to do that; here is a simple one:
var str = 'Jhony Jhons,jhon#gmail.com,380967574366';
var splitted = str.split(","); //split them by comma
splitted.pop(); //removes the last element
var rec = splitted.join() + '\n'; //join them
You need a regex to select the complete phone number and also the preceding comma. Your current regex selects each digit and replaces each one with an "\n", resulting in a lot of "\n" in the result. Also the regex does not match the comma.
Use the following regex:
var str = 'Jhony Jhons,jhon#gmail.com,380967574366'
var regex = /,[0-9]+$/;
// it replaces all consecutive digits with the condition at least one digit exists (the "[0-9]+" part)
// placed at the end of the string (the "$" part)
// and also the digits must be preceded by a comma (the "," part in the beginning);
// also no need for global flag (/g) because of the $ symbol (the end of the string) which can be matched only once
var rec = str.trim().replace(regex, '\n'); //the result will be this string: Jhony Jhons,jhon#gmail.com\n
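As a sketch, the anchored regex can be wrapped in a small function. The name stripTrailingNumber is made up here, and the input is assumed to end with a comma followed by digits; input without such a suffix is returned unchanged (apart from trimming):

```javascript
// Hypothetical helper: replace a trailing ",<digits>" with a newline.
function stripTrailingNumber(line) {
  return line.trim().replace(/,[0-9]+$/, '\n');
}

var cleaned = stripTrailingNumber('Jhony Jhons,jhon#gmail.com,380967574366');
// JSON.stringify makes the newline visible as \n in the console.
console.log(JSON.stringify(cleaned)); // "Jhony Jhons,jhon#gmail.com\n"
```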
var str = "Jhony Jhons,jhon#gmail.com,380967574366";
var result = str.replace(/,\d+/g,'\\n');
console.log(result)

What's the JS RegExp for this specific string?

I have a rather isolated situation in an inventory management program where our shelf locations have a specific format, which is always Letter: Number-Letter-Number, such as Y: 1-E-4. Most of us coworkers just type in "y1e4" and are done with it, but that obviously creates issues with inconsistent formats in a database. Are JS RegExp's the ideal way to automatically detect and format these alphanumeric strings? I'm slowly wrapping my head around JavaScript's Perl syntax, but what's a simple example of formatting one of these strings?
spec: detect string format of either "W: D-W-D" or "WDWD" and return "W: D-W-D"
This function accepts either format. It returns the formatted string if the input matches and undefined if it doesn't.
function validateInventoryCode(input) {
var regexp = /^([a-zA-Z]+)(?:\:\s*)?(\d+)-?(\w+)-?(\d+)$/
var r = regexp.exec(input);
if(r != null) {
return `${r[1]}: ${r[2]}-${r[3]}-${r[4]}`;
}
}
var possibles = ["y1e1", "y:1e1", "Y: 1r3", "y: 32e4", "1:e3e"];
possibles.forEach(function(possibility) {
console.log(`input(${possibility}), result(${validateInventoryCode(possibility)})`);
})
I understand the question as "convert LetterNumberLetterNumber to Letter: Number-Letter-Number".
You may use
/^([a-z])(\d+)([a-z])(\d+)$/i
and replace with $1: $2-$3-$4
Details:
^ - start of string
([a-z]) - Group 1 (referenced with $1 from the replacement pattern) capturing any ASCII letter (as /i makes the pattern case-insensitive)
(\d+) - Group 2 capturing 1 or more digits
([a-z]) - Group 3, a letter
(\d+) - Group 4, a number (1 or more digits)
$ - end of string.
See the regex demo.
var re = /^([a-z])(\d+)([a-z])(\d+)$/i;
var s = 'y1e2';
var result = s.replace(re, '$1: $2-$3-$4');
console.log(result);
OR - if the letters must be turned to upper case:
var re = /^([a-z])(\d+)([a-z])(\d+)$/i;
var s = 'y1e2';
var result = s.replace(re,
(m,g1,g2,g3,g4)=>`${g1.toUpperCase()}: ${g2}-${g3.toUpperCase()}-${g4}`
);
console.log(result);
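A minimal sketch that accepts either the shorthand (y1e4) or the already formatted version (Y: 1-E-4) and returns the canonical form. The function name normalizeShelfLocation is hypothetical, and returning null for unrecognized input is an assumption about the desired behavior:

```javascript
// Hypothetical helper: normalize a shelf location to "L: N-L-N".
function normalizeShelfLocation(input) {
  // letter, optional ":" and spaces, digits, optional "-", letter, optional "-", digits
  var re = /^([a-z])\s*:?\s*(\d+)\s*-?\s*([a-z])\s*-?\s*(\d+)$/i;
  var m = re.exec(String(input).trim());
  if (!m) {
    return null; // not a recognizable shelf location
  }
  return m[1].toUpperCase() + ': ' + m[2] + '-' + m[3].toUpperCase() + '-' + m[4];
}

console.log(normalizeShelfLocation('y1e4'));     // "Y: 1-E-4"
console.log(normalizeShelfLocation('Y: 1-E-4')); // "Y: 1-E-4"
console.log(normalizeShelfLocation('1:e3e'));    // null
```

Because already formatted input passes through unchanged, the function could be applied to every value before it is saved.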
This function finds the pattern inside a longer text and replaces it with the formatted version:
function findAndFormat(text){
var splittedText=text.split(' ');
for(var i=0, textLength=splittedText.length; i<textLength; i++){
var analyzed=splittedText[i].match(/[A-Za-z]\d[A-Za-z]\d$/); // [A-z] would also match [ \ ] ^ _ and `
if(analyzed){
var formattedString=analyzed[0][0].toUpperCase()+': '+analyzed[0][1]+'-'+analyzed[0][2].toUpperCase()+'-'+analyzed[0][3];
text=text.replace(splittedText[i],formattedString);
}
}
return text;
}
I think it's just as it reads:
y1e4
Letter, number, letter, number:
/([A-Za-z][0-9][A-Za-z][0-9])/g
And yes, it's fine to use regex in this case, as in form validation and similar tasks. There are only some cases, such as intensive data processing, where overusing regular expressions hurts performance.
Example
"HelloY1E4world".replace(/([A-Za-z][0-9][A-Za-z][0-9])/g, ' ');
should return: "Hello world"
regxr.com always comes in handy

Extract word between '=' and '('

I have the following string
234234=AWORDHERE('sdf.'aa')
where I need to extract AWORDHERE.
Sometimes there can be space in between.
234234= AWORDHERE('sdf.'aa')
Can I do this with a regular expression?
Or should I do it manually by finding indexes?
The datasets are huge, so it's important to do it as fast as possible.
Try this regex:
\d+=\s?(\w+)\(
Check Demo
In JavaScript it would look like this:
var myString = "234234=AWORDHERE('sdf.'aa')";// or 234234= AWORDHERE('sdf.'aa')
var myRegexp = /\d+=\s?(\w+)\(/g;
var match = myRegexp.exec(myString);
console.log(match[1]); // AWORDHERE
You could do this at least three ways. You need to benchmark to see what's fastest.
Substring w/ indexes
function extract(from) {
var ixEq = from.indexOf("=");
var ixParen = from.indexOf("(");
return from.substring(ixEq + 1, ixParen);
}
Splits
function extract(from) {
var spEq = from.split("=");
var spParen = spEq[1].split("(");
return spParen[0];
}
Regex (demo)
Here is some sample regex you could use
/[^=]+=([^(]+).*/g
This says
[^=]+ - One or more character which is not an =
= - The = itself
( - creates a matching group so you can access your match in code
[^(]+ - One or more character which is not a (
) - closes the matching group
.* - Matches the rest of the line
The /g on the end tells it to find every match in the input instead of stopping at the first one.
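Since this answer recommends benchmarking, here is a rough sketch of how the three approaches could be compared. The function names, the sample line, and the iteration count are all made up for illustration, trim() is added to handle the optional space, and timings from a loop like this are only indicative:

```javascript
// Three hypothetical variants of the same extraction.
function bySubstring(s) {
  return s.substring(s.indexOf('=') + 1, s.indexOf('(')).trim();
}

function bySplit(s) {
  return s.split('=')[1].split('(')[0].trim();
}

function byRegex(s) {
  var m = /=\s*([^(]+)\(/.exec(s);
  return m ? m[1].trim() : null;
}

// Crude timer: runs fn repeatedly and prints the elapsed milliseconds.
function timeIt(label, fn, input, iterations) {
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    fn(input);
  }
  console.log(label + ': ' + (Date.now() - start) + ' ms');
}

var line = "234234= AWORDHERE('sdf.'aa')";
timeIt('substring', bySubstring, line, 100000);
timeIt('split', bySplit, line, 100000);
timeIt('regex', byRegex, line, 100000);
```

Results will vary between engines and input shapes, so it is worth running such a comparison on a sample of the real data.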
Using look around you can search for string preceded by = and followed by ( as following.
Regex: (?<==)[A-Z ]+(?=\()
Explanation:
(?<==) checks if [A-Z ] is preceded by an =.
[A-Z ]+ matches your pattern.
(?=\() checks if matched pattern is followed by a (.
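Two caveats with this pattern: the class [A-Z ] also matches the optional space, so the result may need trimming, and lookbehind requires an ES2018+ engine. A sketch of a variant that moves the optional whitespace into the lookbehind, assuming the wanted word consists of \w characters (extractWord is a made-up name):

```javascript
// Lookbehind: preceded by "=" and optional whitespace.
// Lookahead: followed by "(".
var wordAfterEquals = /(?<==\s*)\w+(?=\()/;

// Hypothetical helper: returns the word, or null if nothing matches.
function extractWord(s) {
  var m = s.match(wordAfterEquals);
  return m ? m[0] : null;
}

console.log(extractWord("234234=AWORDHERE('sdf.'aa')"));  // "AWORDHERE"
console.log(extractWord("234234= AWORDHERE('sdf.'aa')")); // "AWORDHERE"
```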
Regex101 Demo
var str = "234234= AWORDHERE('sdf.'aa')";
var regexp = /.*=\s*(\w+)\(.*\)/g; // \s* so that the space after = is optional
var match = regexp.exec(str);
alert( match[1] );
I made my solution for this just a little more general than you asked for, but I don't think it takes much more time to execute. I didn't measure. If you need greater efficiency than this provides, comment and I or someone else can help you with that.
Here's what I did, using the command prompt of node:
> var s = "234234= AWORDHERE('sdf.'aa')"
undefined
> var a = s.match(/(\w+)=\s*(\w+)\s*\(.*/)
undefined
> a
[ '234234= AWORDHERE(\'sdf.\'aa\')',
'234234',
'AWORDHERE',
index: 0,
input: '234234= AWORDHERE(\'sdf.\'aa\')' ]
>
As you can see, this matches the number before the = in a[1], and it matches the AWORDHERE name as you requested in a[2]. This will work with any number (including zero) of spaces after the = and before the (.

Extract string when preceding number or combo of preceding characters is unknown

Here's an example string:
++++#foo+bar+baz++#yikes
I need to extract foo and only foo from there or a similar scenario.
The + and the # are the only characters I need to worry about.
However, regardless of what precedes foo, it needs to be stripped or ignored. Everything else after it needs to as well.
try this:
/\++#(\w+)/
and catch the capturing group one.
You can simply use the match() method.
var str = "++++#foo+bar+baz++#yikes";
var res = str.match(/\w+/g);
console.log(res[0]); // foo
console.log(res); // foo,bar,baz,yikes
Or use exec
var str = "++++#foo+bar+baz++#yikes";
var match = /(\w+)/.exec(str);
alert(match[1]); // foo
Using exec with a g modifier (global) is meant to be used in a loop getting all sub matches.
var str = "++++#foo+bar+baz++#yikes";
var re = /\w+/g;
var match;
while (match = re.exec(str)) {
// In array form, match is now your next match..
}
How exactly do + and # play a role in identifying foo? If you just want any string that follows # and is terminated by + that's as simple as:
var foostring = '++++#foo+bar+baz++#yikes';
var matches = (/\#([^+]+)\+/g).exec(foostring);
if (matches && matches.length > 1) {
// exec returns null when nothing matches; otherwise the captured group is in element 1 of the matches array
alert('found ' + matches[1] + '!'); // alerts 'found foo!'
}
To help you more specifically, please provide information about the possible variations of your data and how you would go about identifying the token you want to extract even in cases of differing lengths and characters.
If you are just looking for the first segment of text preceded and followed by any combination of + and #, then use:
var foostring = '++++#foo+bar+baz++#yikes';
var result = foostring.match(/[^+#]+/);
// will be the single-element array, ['foo'], or null.
Depending on your data, using \w may be too restrictive as it is equivalent to [a-zA-Z0-9_]. Does your data have anything else such as punctuation, dashes, parentheses, or any other characters that you do want to include in the match? Using the negated character class I suggest will catch every token that does not contain a + or a #.
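A small sketch wrapping the negated-character-class approach in a function with a null check; firstToken is a made-up name:

```javascript
// Hypothetical helper: first run of characters that are neither "+" nor "#".
function firstToken(s) {
  var m = s.match(/[^+#]+/);
  return m ? m[0] : null; // match() returns null when there is no token at all
}

console.log(firstToken('++++#foo+bar+baz++#yikes')); // "foo"
console.log(firstToken('+++###'));                   // null
```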

How to match one, but not two characters using regular expressions

Using javascript regular expressions, how do you match one character while ignoring any other characters that also match?
Example 1: I want to match $, but not $$ or $$$.
Example 2: I want to match $$, but not $$$.
A typical string that is being tested is, "$ $$ $$$ asian italian"
From a user experience perspective, the user selects, or deselects, a checkbox whose value matches tags found in a list of items. All the tags must be matched (checked) for the item to show.
function filterResults(){
// Make an array of the checked inputs
var aInputs = $('.listings-inputs input:checked').toArray();
// alert(aInputs);
// Turn that array into a new array made from each items value.
var aValues = $.map(aInputs, function(i){
// alert($(i).val());
return $(i).val();
});
// alert(aValues);
// Create new variable, set the value to the joined array set to lower case.
// Use this variable as the string to test
var sValues = aValues.join(' ').toLowerCase();
// alert(sValues);
// sValues = sValues.replace(/\$/ig,'\\$');
// alert(sValues);
// this examines the '.tags' of each item
$('.listings .tags').each(function(){
var sTags = $(this).text();
// alert(sTags);
var sSplitTags = sTags.split(' \267 '); // '\267' is an octal escape for the middle dot (U+00B7)
// alert(sSplitTags);
// sSplitTags = sTags.split(' \u00B7 '); // This also works
var show = true;
$.each(sSplitTags, function(i,tag){
if(tag.charAt(0) == '$'){
// alert(tag);
// alert('It begins with a $');
// You have to escape special characters for the RegEx
tag = tag.replace(/\$/ig,'\\$');
// alert(tag);
}
tag = '\\b' + tag + '\\b';
var re = new RegExp(tag,'i');
if(!(re.test(sValues))){
alert(tag);
show = false;
alert('no match');
return false;
}
else{
alert(tag);
show = true;
alert('match');
}
});
if(show == false){
$(this).parent().hide();
}
else{
$(this).parent().show();
}
});
// call the swizzleRows function in the listings.js
swizzleList();
}
Thanks in advance!
Normally, with regex, you can use (?<!x)x(?!x) to match an x that is neither preceded nor followed by another x.
With the modern ECMAScript 2018+ compliant JS engines, you may use lookbehind based regex:
(?<!\$)\$(?!\$)
See the JS demo (run it only in browsers that support lookbehind):
const str ="$ $$ $$$ asian italian";
const regex = /(?<!\$)\$(?!\$)/g;
console.log( str.match(regex).length ); // Count the single $ occurrences
console.log( str.replace(regex, '<span>$&</span>') ); // Enclose single $ occurrences with tags
console.log( str.split(regex) ); // Split with single $ occurrences
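For engines without lookbehind support, a similar effect can be sketched by consuming the preceding character in a capture group and putting it back in the replacement. This is a substitute technique rather than the lookbehind pattern itself, and the matched text then includes that preceding character; the variable names are made up:

```javascript
// Group 1 is the start of the string or one character that is not "$";
// the lookahead rejects a "$" that is followed by another "$".
var singleDollar = /(^|[^$])\$(?!\$)/g;
var tagString = "$ $$ $$$ asian italian";

// Count the single "$" occurrences.
var singleCount = (tagString.match(singleDollar) || []).length;

// Wrap them in tags: "$1" restores the consumed character, "$$" is a literal "$".
var wrapped = tagString.replace(singleDollar, '$1<span>$$</span>');

console.log(singleCount); // 1
console.log(wrapped);     // "<span>$</span> $$ $$$ asian italian"
```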
\bx\b
Explanation: Matches x between two word boundaries (for more on word boundaries, look at this tutorial). \b includes the start or end of the string.
I'm taking advantage of the space delimiting in your question. If that is not there, then you will need a more complex expression like (^x$|^x[^x]|[^x]x[^x]|[^x]x$) to match different positions possibly at the start and/or end of the string. This would limit it to single character matching, whereas the first pattern matches entire tokens.
The alternative is just to tokenize the string (split it at spaces) and construct an object from the tokens which you can just look up to see if a given string matched one of the tokens. This should be much faster per-lookup than regex.
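The tokenizing alternative could be sketched as follows. It uses a Set rather than a plain object for the lookup, and the function name allTagsSelected and the sample data are made up; tags and selected values are assumed to be whitespace-separated and compared case-insensitively:

```javascript
// Hypothetical helper: true if every tag of an item is among the selected values.
function allTagsSelected(tags, selectedValues) {
  // Split the selected values into tokens and store them for exact lookups.
  var selected = new Set(
    selectedValues.toLowerCase().split(/\s+/).filter(Boolean)
  );
  return tags.every(function (tag) {
    return selected.has(tag.trim().toLowerCase());
  });
}

console.log(allTagsSelected(['$', 'Asian'], '$ $$ $$$ asian italian')); // true
console.log(allTagsSelected(['$$'], '$ $$$ asian'));                    // false
```

Because the lookup is an exact token comparison, "$" never matches "$$" or "$$$", and no regex escaping is needed.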
Something like that:
>>> import re
>>> q = re.match(r"""(x{2})($|[^x])""", 'xx')
>>> q.groups()
('xx', '')
>>> q = re.match(r"""(x{2})($|[^x])""", 'xxx')
>>> q is None
True
