javascript regular expression - getting value after colon, without the colon - javascript

I have tried the https://www.regex101.com/#javascript tool, as well as a similar Stack Overflow question, and still haven't been able to solve or understand this. Hopefully someone here can explain what I am doing wrong. I have made the example as detailed and step-by-step as I can.
My goal is to be able to parse custom attributes, so for example:
I wrote some jQuery code to pull in the attribute and the value, and then wanted to run a regex against the result.
Below is the html/js, the output screenshot, and the regular expression screenshot, which says my regex query should match what I am expecting.
Expected result: 'valOne'
Result: ':valOne' <-- why am I getting a ':' character?
<html>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js"></script>
<script>
$(document).ready(function() {
$('[customAttr]').each(function(){
var attrValues = $(this).attr('customAttr');
var regEx_attrVal = /[\w:]+?(?=;|$)/g;
var regEx_preColon = /[\w]+?(?=:)/g;
var regEx_postColon = /:(\w*)+?(?=;|\b)/g;
var customAttrVal = attrValues.match(regEx_attrVal);
var customAttrVal_string = customAttrVal.toString();
console.log('customAttrVal:');
console.log(customAttrVal);
console.log('customAttrVal_string: '+customAttrVal_string);
var preColon = customAttrVal_string.match(regEx_preColon);
preColon_string =preColon.toString();
console.log('preColon');
console.log(preColon);
console.log('preColon_string: '+preColon_string);
var postColon = customAttrVal_string.match(regEx_postColon);
postColon_string = postColon.toString();
console.log('postColon');
console.log(postColon);
console.log('postColon_string: '+postColon_string);
console.log('pre: '+preColon_string);
console.log('post: '+postColon_string);
});
});
</script>
</head>
<body>
<div customAttr="val1:valOne">
Test custom attr
</div>
</body>
</html>

When you use String#match() with a regex that has the global modifier, all the capture groups defined in the pattern are discarded (these are the values shown as Group 1 and higher in the 'MATCH INFORMATION' pane at the bottom right of regex101.com), and you only get an array of the full matched values.
You need to remove /g from your regexps and fix them as follows:
var regEx_attrVal = /[\w:]+(?=;|$)/;
var regEx_preColon = /\w+(?=:)/;
var regEx_postColon = /:(\w+)(?=;|\b)/;
Then, when getting the regEx_postColon captured value, use
var postColon = customAttrVal_string.match(regEx_postColon);
var postColon_string = postColon !== null ? postColon[1] : "";
First, check if there is a postColon regex match, then access the captured value with postColon[1].
See the whole updated code:
$(document).ready(function() {
$('[customAttr]').each(function() {
var attrValues = $(this).attr('customAttr');
var regEx_attrVal = /[\w:]+(?=;|$)/;
var regEx_preColon = /\w+(?=:)/;
var regEx_postColon = /:(\w+)(?=;|\b)/;
var customAttrVal = attrValues.match(regEx_attrVal);
var customAttrVal_string = customAttrVal.toString();
console.log('customAttrVal:');
console.log(customAttrVal);
console.log('customAttrVal_string: ' + customAttrVal_string);
var preColon = customAttrVal_string.match(regEx_preColon);
var preColon_string = preColon !== null ? preColon.toString() : "";
console.log('preColon');
console.log(preColon);
console.log('preColon_string: ' + preColon_string);
var postColon = customAttrVal_string.match(regEx_postColon);
var postColon_string = postColon !== null ? postColon[1] : "";
console.log('postColon');
console.log(postColon);
console.log('postColon_string: ' + postColon_string);
console.log('pre: ' + preColon_string);
console.log('post: ' + postColon_string);
});
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div customAttr="val1:valOne">
Test custom attr
</div>
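As a minimal sketch of the difference described above, the snippet below runs without jQuery or a DOM. The sample string with a second `;`-separated pair and the helper name `parsePairs` are assumptions made for illustration, not part of the original question.

```javascript
// Hypothetical attribute value; the second pair is an assumption.
const attr = 'val1:valOne;val2:valTwo';

// With the g flag, match() returns only the full matches, colon included.
const globalMatches = attr.match(/:(\w+)/g);

// Without the g flag, index 0 is the full match and index 1 is capture group 1.
const single = attr.match(/:(\w+)/);

// To read capture groups from every match, loop over exec() with a g regex.
function parsePairs(value) {
  const re = /(\w+):(\w+)/g;
  const pairs = {};
  let m;
  while ((m = re.exec(value)) !== null) {
    pairs[m[1]] = m[2]; // m[1] is the key before the colon, m[2] the value after it
  }
  return pairs;
}

console.log(globalMatches);    // full matches only
console.log(single[1]);        // captured value without the colon
console.log(parsePairs(attr)); // object keyed by the part before each colon
```

Looping with exec() keeps the groups available for each match, which match() with the g flag does not.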

I haven't trudged through all the code, but something you need to understand about regexes is the difference between $0 and $1.
$0 is highlighted in blue. That is the entire part the regex matched.
You want $1. That's where the text captured by the parentheses is.
Read more about capture groups here.
var match = myRegexp.exec(myString);
alert(match[1]); // This accesses $1
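A small sketch of the $0 versus $1 distinction, using the sample value from the question. Note that in JavaScript replacement strings the whole match is written `$&` rather than `$0`; the variable names here are illustrative only.

```javascript
const input = 'val1:valOne';

// exec() without the g flag: index 0 is the whole match, index 1 is group 1.
const match = /:(\w+)/.exec(input);
const whole = match[0]; // includes the colon
const group = match[1]; // only what the parentheses captured

// The same distinction in a replacement string: $& is the whole match, $1 is group 1.
const wrapped = input.replace(/:(\w+)/, '[$&]');
const onlyGroup = input.replace(/^\w+:(\w+)$/, '$1');

console.log(whole, group, wrapped, onlyGroup);
```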

Use data attributes. You can store JSON strings in them and access them like objects.
HTML
<div id='div' data-custom='{"val1":"valOne","a":"b"}'></div>
jQ
$("#div").data("custom").val1; //valOne
$("#div").data("custom").a; //b
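jQuery's .data() parses a data attribute that looks like JSON for you. Without jQuery, the attribute is just a string and has to be parsed by hand. A rough sketch of that step, where `parseCustomData` is a hypothetical helper and the raw string stands in for what `element.getAttribute('data-custom')` would return:

```javascript
// Parse the raw attribute text; fall back to an empty object on bad input.
function parseCustomData(raw) {
  try {
    const parsed = JSON.parse(raw);
    return parsed !== null && typeof parsed === 'object' ? parsed : {};
  } catch (err) {
    return {}; // not valid JSON
  }
}

// Stand-in for the attribute value read from the element.
const raw = '{"val1":"valOne","a":"b"}';
const custom = parseCustomData(raw);

console.log(custom.val1);
console.log(custom.a);
```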

I guess this is the regex pattern that you're looking for:
(?!(.*?):).*
Explanation
(.*?:) Matches any characters, as few as possible, up to and including a colon (:) symbol
(?! :) Negative lookahead: the match may only start at a position where the pattern above does not match, that is, where no colon follows
( ).* Matches everything that remains after that check
You can also do the same with JavaScript's substring, which for me is the simplest way to do it, as described here:
How to substring in jquery
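A minimal sketch of the substring approach mentioned above. substring and indexOf are plain JavaScript string methods, so no jQuery is needed; `afterColon` is a hypothetical helper name, and returning an empty string when there is no colon is an assumption about the desired behavior.

```javascript
// Return everything after the first colon, or '' if there is no colon.
function afterColon(value) {
  const idx = value.indexOf(':');
  return idx === -1 ? '' : value.substring(idx + 1);
}

console.log(afterColon('val1:valOne'));
```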

Related

Finding multiple groups in one string

Consider the following string; it contains a list of HTML a elements separated by commas. How can I get a list of {href, title} for the elements that are between 'start' and 'end'?
not thisstartfoo, barendnot this
The following regex gives only the last iteration of a.
/start((?:<a href="(?<href>.*?)" title="(?<title>.*?)">.*?<\/a>(?:, )?)+)end/g
How can I get the whole list?
This should give you what you need.
https://regex101.com/r/isYIeR/1
/(?:start)*(?:<a href=(?<href>.*?)\s+title=(?<title>.*?)>.*?<\/a>)+(?:,|end)
UPDATE
This does not meet the requirement, for the reason described in:
The Returned Value for a Given Group is the Last One Captured
I do not think this can be done in one regex match. Here is a JavaScript solution that uses two regex matches to get a list of {href, title}:
var sample='startfoo, bar,barendstart<img> something end\n' +
'beginfoo, bar,barend\n'+
'startfoo again, bar again,bar2 againend';
var reg = /start((?:\s*<a href=.*?\s+title=.*?>.*?<\/a>,?)+)end/gi;
var regex2 = /href=(?<href>.*?)\s+title=(?<title>.*?)>/gi;
var step1, step2 ;
var hrefList = [];
while( (step1 = reg.exec(sample)) !== null) {
while((step2 = regex2.exec(step1[1])) !== null) {
hrefList.push({href:step2.groups["href"], title:step2.groups["title"]});
}
}
console.log(hrefList);
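The same two-step idea can also be sketched with String.prototype.matchAll (ES2020) instead of nested exec() loops. The sample string, the quoted-attribute format, and the name `extractLinks` are assumptions for illustration; the markup in the original sample is not recoverable here.

```javascript
// Hypothetical input: two links between 'start' and 'end'.
const sample =
  'not this start<a href="/foo" title="Foo">foo</a>, ' +
  '<a href="/bar" title="Bar">bar</a>end not this';

function extractLinks(text) {
  // Step 1: find each start...end block (s flag lets . match newlines).
  const block = /start(.*?)end/gs;
  // Step 2: inside a block, pull href and title out of every <a> tag.
  const link = /<a href="(?<href>[^"]*)" title="(?<title>[^"]*)">/g;
  const out = [];
  for (const outer of text.matchAll(block)) {
    for (const inner of outer[1].matchAll(link)) {
      out.push({ href: inner.groups.href, title: inner.groups.title });
    }
  }
  return out;
}

console.log(extractLinks(sample));
```

Because matchAll creates a fresh iterator per call, there is no shared lastIndex state to reset between blocks.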
If the format is constant, i.e. only href and title for each tag, you can use this regex to find every run of characters that contains no double quote and is followed by a closing " and then whitespace or >, checked with a lookahead (regex101):
const str = 'startfoo, barend';
const result = str.match(/[^"]+(?="[\s>])/gi);
console.log(result);
This regex:
<.*?>
removes all html tags
so for example
<h1>1. This is a title </h1><ul><a href='www.google.com'>2. Click here </a></ul>
After using regex you will get:
1. This is a title 2. Click here
Not sure if this answers your question though.

Regex match cookie value and remove hyphens

I'm trying to extract out a group of words from a larger string/cookie that are separated by hyphens. I would like to replace the hyphens with a space and set to a variable. Javascript or jQuery.
As an example, the larger string has a name and value like this within it:
facility=34222%7CConner-Department-Store;
(notice the leading "C")
So first, I need to match()/find facility=34222%7CConner-Department-Store; with regex. Then break it down to "Conner Department Store"
var cookie = document.cookie;
var facilityValue = cookie.match( REGEX ); ??
var test = "store=874635%7Csomethingelse;facility=34222%7CConner-Department-Store;store=874635%7Csomethingelse;";
var test2 = test.replace(/^(.*)facility=([^;]+)(.*)$/, function(matchedString, match1, match2, match3){
return decodeURIComponent(match2);
});
console.log( test2 );
console.log( test2.split('|')[1].replace(/[-]/g, ' ') );
If I understood it correctly, you want to make a phrase by getting all the words between hyphens and disallowing two successive Uppercase letters in a word, so I'd prefer using Regex in that case.
This is a Regex solution, that works dynamically with any cookies in the same format and extract the wanted sentence from it:
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Demo:
var str = "facility=34222%7CConner-Department-Store;";
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Explanation:
Use this Regex /([A-Z][a-z]+)-?/g to match the words between -.
Remove any - occurrence from the matched words.
Then just join the array of matches with white space.
Ok,
first, you should decode this string as follows:
var str = "facility=34222%7CConner-Department-Store;"
var decoded = decodeURIComponent(str);
// decoded = "facility=34222|Conner-Department-Store;"
Then you have multiple possibilities to split up this string.
The easiest way is to use substring()
var solution1 = decoded.substring(decoded.indexOf('|') + 1, decoded.length)
// solution1 = "Conner-Department-Store;"
solution1 = solution1.replace(/-/g, ' '); // a regex with the g flag replaces every hyphen; a string pattern replaces only the first
// solution1 = "Conner Department Store;"
As you can see, substring(arg1, arg2) returns the string, starting at index arg1 and ending at index arg2. See Full Documentation here
If you want to cut the last ; just set decoded.length - 1 as arg2 in the snippet above.
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1)
//returns "Conner-Department-Store"
or all above in just one line:
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1).replace(/-/g, ' ')
If you want still to use a regular Expression to retrieve (perhaps more) data out of the string, you could use something similar to this snippet:
var solution2 = "";
var regEx= /([A-Za-z]*)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/;
if (regEx.test(decoded)) {
solution2 = decoded.match(regEx);
/* returns
[0:"facility=34222|Conner-Department-Store",
1:"facility",
2:"34222",
3:"Conner-Department-Store",
index:0,
input:"facility=34222|Conner-Department-Store;"
length:4] */
solution2 = solution2[3].replace(/-/g, ' ');
// "Conner Department Store"
}
I have applied some rules for the regex to work; feel free to modify them according to your needs.
facility can be any Word built with alphabetical characters lower and uppercase (no other chars) at any length
= needs to be the char =
34222 can be any number but no other characters
| needs to be the char |
Conner-Department-Store can be any characters except one of the following (reserved delimiters): :/?#[]#;,'
Hope this helps :)
Edit: to find only the part
facility=34222%7CConner-Department-Store; just modify the regex to
match facility= instead of ([A-Za-z]*)=:
/(facility)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/
You can use cookies.js, a mini framework from MDN (Mozilla Developer Network).
Simply include the cookies.js file in your application, and write:
docCookies.getItem("facility"); // look the cookie up by its name, not by the value you expect to find
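Putting the pieces from the answers above together, here is a minimal sketch that takes a cookie string, finds the facility entry, decodes it, and replaces every hyphen. The function name `readFacilityName` and the choice to return null when the cookie is missing are assumptions; in a browser you would pass document.cookie as the argument.

```javascript
// Find "facility" in a cookie string and return its label with spaces.
function readFacilityName(cookieString) {
  for (const part of cookieString.split(';')) {
    const eq = part.indexOf('=');
    if (eq === -1) continue; // skip empty or malformed entries
    const name = part.slice(0, eq).trim();
    if (name !== 'facility') continue;
    // %7C decodes to "|", which separates the id from the label.
    const value = decodeURIComponent(part.slice(eq + 1));
    const label = value.split('|')[1];
    return label === undefined ? null : label.replace(/-/g, ' ');
  }
  return null; // no facility cookie found
}

const test =
  'store=874635%7Csomethingelse;facility=34222%7CConner-Department-Store;store=874635%7Csomethingelse;';
console.log(readFacilityName(test));
```

decodeURIComponent throws on malformed escape sequences, so real code may want a try/catch around that call.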

Using .match() on Content of Textarea

I did some searching around and found some issues similar to mine, but I was hoping for a resolution that corrects my pre-existing code.
Here's the codepen if you want to see the whole thing: http://codepen.io/JTBennett/pen/ygyZwE
$('#ST_txt_write').keyup(function(){
var childText = $('#ST_cmp_body').text();
var count = (childText.match(/hi/g) || []).length;
$('#ST_KW').html(count);
});
The issue is that I can't get the matching function to work with a textarea so the user can type whatever they want in the box and that will be what the code seeks to match. I tried making a variable that returns the text content of the box (just like the childText var you see there) but that would only match once and be done instead of counting.
Any help in the right direction would be most appreciated. I'm sorry I'm not very good with jQuery, or anything for that matter.
The example you gave in your question has important differences from your codepen. In your question, you have a regex literal that you are passing into .match(), and in your codepen, you are passing a string into .match(). What you need to do is create a new regular expression using new RegExp() and pass it the global flag.
The code would look as follows:
$('#ST_txt_write').keyup(function(){
var childText = $('#ST_cmp_body').val();
var KW = $('#ST_KW_txt').val();
var regex = new RegExp(KW, 'g'); // <--- This creates a regex from the string in KW
var count = (childText.match(regex) || []).length; // <-- Use regex here instead of string
$('#ST_KW').html(count);
});
Here's a working example:
$(function() {
$('#ST_txt_write').keyup(function() {
var childText = $('#ST_cmp_body').val();
var KW = $('#ST_KW_txt').val();
var regex = new RegExp(KW, 'g');
var count = (childText.match(regex) || []).length;
$('#ST_KW').text(count);
});
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="ST_txt_write">
<textarea id="ST_cmp_body" style="height: 200px; width: 300px;"></textarea>
</div>
<div>
Keywords:
<input type="text" id="ST_KW_txt" />
</div>
<div>
# of Keywords Found: <span id="ST_KW"></span>
</div>
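One caveat with new RegExp(KW, 'g'): whatever the user types is interpreted as a pattern, so input such as `.` or `(` either matches too much or throws. A hedged sketch of one way around that, escaping the keyword first; `escapeRegExp` and `countOccurrences` are hypothetical helper names, and treating an empty keyword as zero matches is an assumption.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Count literal, non-overlapping occurrences of keyword in text.
function countOccurrences(text, keyword) {
  if (keyword === '') return 0; // an empty pattern would match at every position
  const regex = new RegExp(escapeRegExp(keyword), 'g');
  return (text.match(regex) || []).length;
}

console.log(countOccurrences('hi there, hi.', 'hi'));
console.log(countOccurrences('a.b a.b axb', 'a.b'));
```

Inside the keyup handler, the count line could then call countOccurrences(childText, KW) instead of building the regex directly.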

Only extract digits with javascript

I need to extract digits only (at least 2 but not more than 3) from an option text in a drop-down list and show the result in an input field. I've read about regexp (http://www.javascriptkit.com/jsref/regexp.shtml) and thought I had it all figured out. But I just can't seem to get it to work.
<script>
function copyEbc() {
var a = document.getElementById("malt"); // the <select>
var onlydig = /\d{1,3}/ // regexp
var option = a.options[a.selectedIndex].text(onlydig); // regexp on options
var txt = document.getElementById("ebc").value;
txt = txt + option;
document.getElementById("ebc").value = txt;
}
</script>
Example, I only want "4" from a selected option with the text "Pilsner 4 EBC".
Am I completely lost here?
Input much appreciated, cheers
You're trying to call the text value as a function (your JS console has probably complained about this).
Match against the regular expression, instead. match[0] will contain the matched text:
var option = a.options[a.selectedIndex].text.match(onlydig)[0];
To simply extract the number(s) from a string, use match:
var numberPattern = /\d+/g;
var foo = 'Pilsner 4 EBC';
var numbers = foo.match(numberPattern);
alert(numbers); // alerts the array of numbers
You need to use the "match" method of the string, passing it the regexp. AFAIK the "text" method you are calling does not exist; text is a property of the option, not a function.
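Note that match() returns null when nothing matches, so indexing [0] directly can throw. A small sketch with a guard, using the /\d{1,3}/ pattern from the question; `extractDigits` is a hypothetical helper, and returning an empty string for text without digits is an assumption.

```javascript
// Return the first run of up to three digits, or '' if there are none.
function extractDigits(text) {
  const m = text.match(/\d{1,3}/);
  return m === null ? '' : m[0];
}

console.log(extractDigits('Pilsner 4 EBC'));
```

In the original function, this could be applied to a.options[a.selectedIndex].text before appending to the input's value.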

Cannot extract parts of a string

I have a string like this : SPList:6E5F5E0D-0CA4-426C-A523-134BA33369D7?SPWeb:C5DD2ADA-E0C4-4971-961F-233789297FE9:.
Using Javascript, I would like to extract the two IDs (which can be different) : 6E5F5E0D-0CA4-426C-A523-134BA33369D7 and C5DD2ADA-E0C4-4971-961F-233789297FE9.
I'm using this regular expression : ^SPList\:(?:[0-9A-Za-z\-]+)\?SPWeb\:(?:[0-9A-Za-z\-]+)\:$.
I expect this expression to extract into two matching groups the two IDs.
So far, my code is:
var input = "SPList:6E5F5E0D-0CA4-426C-A523-134BA33369D7?SPWeb:C5DD2ADA-E0C4-4971-961F-233789297FE9:";
var myregex = /^SPList\:(?:[0-9A-Za-z\-]+)\?SPWeb\:(?:[0-9A-Za-z\-]+)\:$/g;
var match = input.match(myregex);
var listId = match[0];
var webId = match[1];
However, this is not working as expected. The first match contains the whole string, and the second match is undefined.
What is the proper way to extract my ID's?
Here is a jsfiddle that illustrates my issue.
This should suit your needs:
var regex = /^SPList:([0-9A-F-]+)[?]SPWeb:([0-9A-F-]+):$/; // no g flag: with g, exec() keeps state in lastIndex between calls
var match = regex.exec(input);
var listId = match[1];
var webId = match[2];
I simply replaced the non-capturing groups of your initial regex by capturing groups, and used regex.exec(input) instead of input.match(regex) to get the captured data. Also, since the IDs seem to be hexadecimal values, I used A-F instead of A-Z.
try this:
var myregex = /[^\:]([0-9A-Z\-]+)[^\?|\:]/g;
var match = input.match(myregex);
alert("listID: " + match[1] + "\n" + "webID: " + match[3]);
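A compact sketch of the capturing-group approach with a null check, so that an input in an unexpected format does not throw. The function name `parseIds` and the i flag (to accept lowercase hex digits, which the question's pattern allowed) are assumptions.

```javascript
// Extract both IDs, or return null if the input does not have the expected shape.
function parseIds(input) {
  const re = /^SPList:([0-9A-F-]+)\?SPWeb:([0-9A-F-]+):$/i;
  const m = re.exec(input);
  return m === null ? null : { listId: m[1], webId: m[2] };
}

const input =
  'SPList:6E5F5E0D-0CA4-426C-A523-134BA33369D7?SPWeb:C5DD2ADA-E0C4-4971-961F-233789297FE9:';
console.log(parseIds(input));
```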
