I'm trying to get an array of JSON objects. To do that, I'm trying to make the input I have parsable, then parse it and push it to that array using a for loop. The inputs I have to work with look like this:
firstname: Chris, lastname: Cheshire, email: chris#cmdcheshire.com, viewerlink: audiencematic.com/viewer?v\u003dTESTSHOW\u0026push\u003d8A043B5A, tempid: 8A043B5A, permaid: F8tGYNx, showid: TESTSHOW
I've gotten it to the point where each loop produces something like this:
{ "firstname": First Name, "lastname": Last Name, "email": sample#gmail.com, "viewerlink": audiencematic.com/viewer?v=TESTSHOW&push=715B3074, "tempid": 715B3074, "permaid": F8tGYNx, "showid": TESTSHOW }
But I got stuck on the last bit: making the values strings. I want it to look like this, so I can use JSON.parse():
{ "firstname": "First Name", "lastname": "Last Name", "email": "sample#gmail.com", "viewerlink": "audiencematic.com/viewer?v=TESTSHOW&push=715B3074", "tempid": "715B3074", "permaid": "F8tGYNx", "showid": "TESTSHOW" }
I tried a couple of different methods I found on here, but one of the values is a URL and the period is screwing with the replace expressions. I tried using the replace function like this:
var jsonStr2 = jsonStr.replace(/(: +\w)|(:+\w)/g, function(matchedStr) {
return ':"' + matchedStr.substring(2, matchedStr.length) + '"';
});
But it just becomes this:
{ "firstname":""irst Name, "lastname":""ast Name, "email":""ample#gmail.com, "viewerlink":""udiencematic.com/viewer?v=TESTSHOW&push=715B3074, "tempid":""15B3074, "permaid":""8tGYNx, "showid":""ESTSHOW }
How should I change my replace function?
(I tried that code because I'm using
var jsonStr = string.replace(/(\w+:)|(\w+ :)/g, function(matchedStr) {
return '"' + matchedStr.substring(0, matchedStr.length - 1) + '":';
});
to put quotation marks around the keys, and that seems to work.)
FIGURED IT OUT!! SEE MY ANSWER BELOW.
One option might be to try using a deserialized version of the string, alter the values associated with the properties of the object, and then convert back to a string.
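// JSON.parse only accepts valid JSON, so the keys must be double-quoted
var person = '{"fname":"John", "lname":"Doe", "age":25}';
var obj = JSON.parse(person);
for (var x in obj) {
obj[x] = ""; // placeholder: assign whatever new value you need here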
}
var result = JSON.stringify(obj);
It's a little longer than doing a string replacement, but I find it a little easier to follow.
I figured it out! I just had to mess around in regexr to figure out what conditions I needed. Here's the working for loop code:
for (var i = 0; i < audiencelistdirty.feed.openSearch$totalResults.$t; i++) {
var string = '{ ' + audiencelistdirty.feed.entry[i].content.$t + ' }';
var jsonStr = string.replace(/(\w+:)|(\w+ :)/g, function(matchedStr) {
return '"' + matchedStr.substring(0, matchedStr.length - 1) + '":';
});
var jsonStr1 = jsonStr.replace(/(:(.*?),)|(:\s(.*?)\s)/g, function(matchedStr) {
return ':"' + matchedStr.substring(2, matchedStr.length - 1) + '",';
});
var jsonStr2 = jsonStr1.replace(/(",})/g, function(matchedStr) {
return '" }';
});
var newObj = JSON.parse(jsonStr2);
audiencelist.push(newObj);
};
It's pretty ugly but it works.
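For comparison, here is a minimal sketch that skips the intermediate JSON string and builds the object directly by splitting. It assumes that pairs are separated by a comma followed by whitespace and that no value contains that sequence; parseEntry is a hypothetical helper name, not something from the code above.

```javascript
// Hypothetical helper: turns "key: value, key: value" into a plain object.
// Assumption: pairs are separated by "," plus whitespace, and values never contain that.
function parseEntry(line) {
  const obj = {};
  line.split(/,\s+/).forEach(function (pair) {
    // Split on the first colon only, so any later colon stays in the value
    const idx = pair.indexOf(':');
    if (idx === -1) return; // skip fragments that have no key
    const key = pair.slice(0, idx).trim();
    obj[key] = pair.slice(idx + 1).trim();
  });
  return obj;
}

const sampleEntry = 'firstname: Chris, lastname: Cheshire, email: chris#cmdcheshire.com, tempid: 8A043B5A';
console.log(JSON.stringify(parseEntry(sampleEntry)));
```

Because the values are never re-parsed, periods and other punctuation in the URL need no special handling.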
EDIT: Sorry, I completely misread the question. To replace the values with quoted strings use this regex replace function:
const str =
'firstname: Chris, lastname: Cheshire, email: chris#cmdcheshire.com, viewerlink: audiencematic.com/viewer?v\u003dTESTSHOW\u0026push\u003d8A043B5A, tempid: 8A043B5A, permaid: F8tGYNx, showid: TESTSHOW'
const json = (() => {
const result = str
.replace(/\w+:\s(.*?)(?:,|$)/g, function (match, subStr) {
return match.replace(subStr, `"${subStr}"`)
})
.replace(/(\w+):/g, function (match, subStr) {
return match.replace(subStr, `"${subStr}"`)
})
return '{' + result + '}'
})()
Wrap the input string in commas, then use a regex to identify the keys (between , and :) and their associated values (between : and ,) and construct the object directly, as in the example below:
const input = ' firstname : Chris , lastname: Cheshire, email: chris#cmdcheshire.com, viewerlink: audiencematic.com/viewer?v\u003dTESTSHOW\u0026push\u003d8A043B5A, tempid: 8A043B5A, permaid: F8tGYNx, showid: TESTSHOW ';
const wrapped = `,${input},`;
const re = /,\s*([^:\s]*)\s*:\s*(.*?)\s*(?=,)/g;
const obj = {}
Array.from(wrapped.matchAll(re)).forEach((match) => obj[match[1]] = match[2]);
console.log(obj)
String.prototype.matchAll() is a newer method, and not every JavaScript engine implements it. If yours does not (or if your code has to run in older browsers), you can use the old-school way:
const input = ' firstname : Chris , lastname: Cheshire, email: chris#cmdcheshire.com, viewerlink: audiencematic.com/viewer?v\u003dTESTSHOW\u0026push\u003d8A043B5A, tempid: 8A043B5A, permaid: F8tGYNx, showid: TESTSHOW ';
const wrapped = `,${input},`;
const re = /,\s*([^:\s]*)\s*:\s*(.*?)\s*(?=,)/g;
const obj = {}
let match = re.exec(wrapped);
while (match) {
obj[match[1]] = match[2];
match = re.exec(wrapped);
}
console.log(obj);
The anatomy of the regex used above
The regular expression piece by piece:
/ # regex delimiter; not part of the regex but JavaScript syntax
, # match a comma
\s # match a white space character (space, tab, new line)
* # the previous symbol zero or more times
( # start the first capturing group; does not match anything
[ # start a character class...
^ # ... that matches any character not listed inside the class
: # ... i.e. any character but colon...
\s # ... and white space character
] # end of the character class; the entire class matches only one character
* # the previous symbol zero or more times
) # end of the first capturing group; does not match anything
\s*:\s* # zero or more spaces before and after the colon
( # start of the second capturing group
.* # any character, any number of times; this is greedy by default
? # make it not greedy
) # end of the second capturing group
\s* # zero or more spaces
(?= # lookahead positive assertion; matches but does not consume the matched substring
, # matches a comma
) # end of the assertion
/ # regex delimiter; not part of the regex but JavaScript syntax
g # regex flag; 'g' for 'global' is needed to find all matches
Read about the syntax of regular expressions in JavaScript. For a more comprehensive description of the regex patterns I recommend reading the PHP documentation of PCRE (Perl-Compatible Regular Expressions).
You can see the regex in action and play with it on regex101.com.
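To make the two capturing groups concrete, here is a small sketch that runs the same pattern over a shortened input and collects what each group captured. The names pairPattern, wrappedSample, pairs and person are only for illustration, and Object.fromEntries assumes an ES2019 engine.

```javascript
// Same pattern as above; group 1 is the key, group 2 is the value
const pairPattern = /,\s*([^:\s]*)\s*:\s*(.*?)\s*(?=,)/g;
// Shortened sample input, already wrapped in commas
const wrappedSample = ', firstname : Chris , lastname: Cheshire,';

const pairs = [];
let found;
while ((found = pairPattern.exec(wrappedSample)) !== null) {
  // found[0] is the whole match; found[1] and found[2] are the groups
  pairs.push([found[1], found[2]]);
}

// Object.fromEntries (ES2019) turns the [key, value] pairs into an object
const person = Object.fromEntries(pairs);
console.log(pairs);
console.log(JSON.stringify(person));
```

The lookahead leaves the trailing comma unconsumed, which is why the next match can start on it.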
Regex to fetch all spaces as long as they are not enclosed in braces
This is for a javascript mention system
ex: "Speak #::{Joseph Empyre}{b0268efc-0002-485b-b3b0-174fad6b87fc}, all right?"
Need to get:
[ "Speak ", "#::{Joseph Empyre}{b0268efc-0002-485b-b3b0-174fad6b87fc}", ",", "all ", "right?" ]
[Edit]
Solved in: https://codesandbox.io/s/rough-http-8sgk2
Sorry for my bad English
I interpreted your question literally, as asking to fetch all spaces as long as they are not enclosed in braces, although your example result isn't what I would expect. It contains a space after Speak, as well as a separate match for the , after the {} groups. My output below shows what I would expect for what I think you are asking: a list of strings split on just the spaces outside of braces.
const str =
"Speak #::{Joseph Empyre}{b0268efc-0002-485b-b3b0-174fad6b87fc}, all right?";
// This regex matches both pairs of {} with things inside and spaces
// It will not properly handle nested {{}}
// It does this such that instead of capturing the spaces inside the {},
// it instead captures the whole of the {} group, spaces and all,
// so we can discard those later
var re = /(?:\{[^}]*?\})|( )/g;
var match;
var matches = [];
while ((match = re.exec(str)) != null) {
matches.push(match);
}
var cutString = str;
var splitPieces = [];
for (var len=matches.length, i=len - 1; i>=0; i--) {
match = matches[i];
// Since we have matched both groups of {} and spaces, ignore the {} matches
// just look at the matches that are exactly a space
if(match[0] == ' ') {
// Note that if there is a trailing space at the end of the string,
// we will still treat it as delimiter and give an empty string
// after it as a split element
// If this is undesirable, check if match.index + 1 >= cutString.length first
splitPieces.unshift(cutString.slice(match.index + 1));
cutString = cutString.slice(0, match.index);
}
}
splitPieces.unshift(cutString);
console.log(splitPieces)
Console:
["Speak", "#::{Joseph Empyre}{b0268efc-0002-485b-b3b0-174fad6b87fc},", "all", "right?"]
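A shorter variant, sketched here as an alternative rather than as the method above, is to match the tokens directly instead of locating the spaces. It assumes braces are balanced and not nested; an unmatched { would be silently dropped. The names mention and tokens are made up for the example.

```javascript
const mention = 'Speak #::{Joseph Empyre}{b0268efc-0002-485b-b3b0-174fad6b87fc}, all right?';

// A token is a run of {...} groups and characters that are neither whitespace nor "{"
const tokens = mention.match(/(?:\{[^}]*\}|[^\s{])+/g);
console.log(tokens);
```

Spaces inside a {...} group are consumed as part of the group, so they never end a token.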
I have a rather isolated situation in an inventory management program where our shelf locations have a specific format, which is always Letter: Number-Letter-Number, such as Y: 1-E-4. Most of us coworkers just type in "y1e4" and are done with it, but that obviously creates issues with inconsistent formats in a database. Are JS RegExp's the ideal way to automatically detect and format these alphanumeric strings? I'm slowly wrapping my head around JavaScript's Perl syntax, but what's a simple example of formatting one of these strings?
spec: detect string format of either "W: D-W-D" or "WDWD" and return "W: D-W-D"
This function accepts either format. It returns the formatted string if the input matches and undefined if it doesn't.
function validateInventoryCode(input) {
var regexp = /^([a-zA-Z]+)(?:\:\s*)?(\d+)-?(\w+)-?(\d+)$/
var r = regexp.exec(input);
if(r != null) {
return `${r[1]}: ${r[2]}-${r[3]}-${r[4]}`;
}
}
var possibles = ["y1e1", "y:1e1", "Y: 1r3", "y: 32e4", "1:e3e"];
possibles.forEach(function(possibility) {
console.log(`input(${possibility}), result(${validateInventoryCode(possibility)})`);
})
I understand the question as "convert LetterNumberLetterNumber to Letter: Number-Letter-Number".
You may use
/^([a-z])(\d+)([a-z])(\d+)$/i
and replace with $1: $2-$3-$4
Details:
^ - start of string
([a-z]) - Group 1 (referenced with $1 from the replacement pattern) capturing any ASCII letter (as /i makes the pattern case-insensitive)
(\d+) - Group 2 capturing 1 or more digits
([a-z]) - Group 3, a letter
(\d+) - Group 4, a number (1 or more digits)
$ - end of string.
See the regex demo.
var re = /^([a-z])(\d+)([a-z])(\d+)$/i;
var s = 'y1e2';
var result = s.replace(re, '$1: $2-$3-$4');
console.log(result);
OR - if the letters must be turned to upper case:
var re = /^([a-z])(\d+)([a-z])(\d+)$/i;
var s = 'y1e2';
var result = s.replace(re,
(m,g1,g2,g3,g4)=>`${g1.toUpperCase()}: ${g2}-${g3.toUpperCase()}-${g4}`
);
console.log(result);
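If both the shorthand and the already-formatted spelling should be accepted, the same idea can be stretched with optional separators. This is only a sketch under the assumption that each letter part is exactly one letter; normalizeShelfLocation is a hypothetical name.

```javascript
// Hypothetical helper: accepts "y1e4", "y:1e4" or "Y: 1-E-4" and returns "Y: 1-E-4".
// Returns null when the input does not look like a shelf location.
function normalizeShelfLocation(input) {
  // The colon, the dashes and the surrounding spaces are all optional
  const m = /^([a-z])\s*:?\s*(\d+)\s*-?\s*([a-z])\s*-?\s*(\d+)$/i.exec(input.trim());
  if (m === null) return null;
  return `${m[1].toUpperCase()}: ${m[2]}-${m[3].toUpperCase()}-${m[4]}`;
}

['y1e4', 'Y: 1-E-4', ' y:32e4 ', '1:e3e'].forEach(function (code) {
  console.log(code, '->', normalizeShelfLocation(code));
});
```

Running user input through one function like this before saving keeps a single format in the database.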
This is the function to match and replace the pattern:
function findAndFormat(text){
var splittedText=text.split(' ');
for(var i=0, textLength=splittedText.length; i<textLength; i++){
var analyzed=splittedText[i].match(/[A-Za-z]\d[A-Za-z]\d$/); // [A-z] would also match [ \ ] ^ _ and the backtick
if(analyzed){
var formattedString=analyzed[0][0].toUpperCase()+': '+analyzed[0][1]+'-'+analyzed[0][2].toUpperCase()+'-'+analyzed[0][3];
text=text.replace(splittedText[i],formattedString);
}
}
return text;
}
I think it's just as it reads:
y1e4
Letter, number, letter, number:
/([A-Za-z][0-9][A-Za-z][0-9])/g
And yes, it's fine to use a regex in this case, as in form validation and the like. There are just some cases where overusing regular expressions hurts performance (intensive data processing, for example).
Example
"HelloY1E4world".replace(/([A-Za-z][0-9][A-Za-z][0-9])/g, ' ');
should return: "Hello world"
regexr.com always comes in handy
I need to parse in javascript the value entered by the user in an html text field.
That is my first regexp experience.
Here is my code :
var s = 'research library "not available" author:"Bernard Shaw"';
var tableau = s.split(/(?:[^\s"]+|"[^"]*")/);
for (var i=0; i<tableau.length; i++) {
document.write("tableau[" + i + "] = " + tableau[i] + "<BR>");
}
I am expecting to see something like this:
tableau[0] = research
tableau[1] = library
tableau[2] = "not available"
tableau[3] = author:
tableau[4] = "Bernard Shaw"
But instead I got this:
tableau[0] =
tableau[1] =
tableau[2] =
tableau[3] =
tableau[4] =
tableau[5] =
Actually, what I really need is to split this value :
research library "not available" author:"Bernard Shaw"
into this array :
tableau[0] = research
tableau[1] = library
tableau[2] = "not available"
tableau[3] = author:"Bernard Shaw"
But I think there is a problem with positive lookbehind in javascript or something like this.
I did many tries without more success:
How do I split a string with multiple separators in javascript?
Regex split string preserving quotes
Positive look behind in JavaScript regular expression
javascript split string by space, but ignore space in quotes (notice not to split by the colon too)
I think I really need some help...
It seems like you want to split on the whitespace outside the double-quotes. In that case you can try this regex:
var tableau = s.split(/\s(?=(?:[^"]*"[^"]*")*[^"]*$)/);
This will split on whitespace that is followed by an even number of double quotes.
Explanation:
\s # Split on whitespace
(?= # Followed by
(?: # Non-capture group with 2 quotes
[^"]* # 0 or more non-quote characters
" # 1 quote
[^"]* # 0 or more non-quote characters
" # 1 quote
)* # 0 or more repetition of previous group(multiple of 2 quotes will be even)
[^"]* # Finally 0 or more non-quotes
$ # Till the end (This is necessary)
)
This will give you your final desired output:
tableau[0] = research
tableau[1] = library
tableau[2] = "not available"
tableau[3] = author:"Bernard Shaw"
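Here is the split as a small runnable sketch. It assumes the quotes in the input are balanced; with an odd number of quotes the lookahead counts differently and the split points shift. The variable names are only for illustration.

```javascript
const query = 'research library "not available" author:"Bernard Shaw"';

// Whitespace followed by an even number of quotes up to the end of the string,
// i.e. whitespace that is outside any quoted section
const outsideQuotes = /\s(?=(?:[^"]*"[^"]*")*[^"]*$)/;

const terms = query.split(outsideQuotes);
terms.forEach(function (term, i) {
  console.log('tableau[' + i + '] = ' + term);
});
```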
Regex might not be the way to go. Instead, you might write a tiny parser that marches along a character at a time and builds an array. Something like this (http://jsfiddle.net/WTMct/1):
function parse(str) {
var arr = [];
var quote = false; // true means we're inside a quoted field
// iterate over each character, keep track of current field index (i)
for (var i = 0, c = 0; c < str.length; c++) {
var cc = str[c], nc = str[c+1]; // current character, next character
arr[i] = arr[i] || ''; // create a new array value (start with empty string) if necessary
// If it's just one quotation mark, begin/end quoted field
if (cc == '"') { quote = !quote; continue; }
// If it's a space, and we're not in a quoted field, move on to the next field
if (cc == ' ' && !quote) { ++i; continue; }
// Otherwise, append the current character to the current field
arr[i] += cc;
}
return arr;
}
Then
parse('research library "not available" author:"Bernard Shaw"')
returns ["research", "library", "not available", "author:Bernard Shaw"].
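You can also match the tokens instead of splitting. Letting quoted and unquoted runs repeat keeps author:"Bernard Shaw" together as one token:
var output=s.match(/(?:[^\s"]+|"[^"]*")+/g);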
I have the following array of data named cityList:
var cityList = [
"Anaa, French Polynesia (AAA)",
"Arrabury, Australia (AAB)",
"Al Arish, Egypt (AAC)",
"Ad-Dabbah, Sudan (AAD)",
"Annaba, Algeria (AAE)",
"Apalachicola, United States (AAF)",
"Arapoti, Brazil (AAG)",
"Aachen, Germany (AAH)",
"Arraias, Brazil (AAI)",
"Awaradam, Suriname (AAJ)",
"Aranuka, Kiribati (AAK)",
"Aalborg, Denmark (AAL)"
];
I want to first search the city name starting at the beginning of the string.
Next I want to search the code portion of the string: AAA, AAB, AAC, etc...
I want to apply a search pattern as a javascript regular expression, first to the city name, and second to the city code.
Here are my regular expressions:
// this regular expression used for search city name
var matcher = new RegExp("^" + re, "i");
// this regular expression used for search city code
var matcher = new RegExp("([(*)])" + re, "i");
How do I combine these two regular expressions into a single regex that works as described?
I suggest this:
var myregexp = /^([^,]+),[^(]*\(([^()]+)\)/;
var match = myregexp.exec(subject);
if (match != null) {
city = match[1];
code = match[2];
}
Explanation:
^ # Start of string
( # Match and capture (group number 1):
[^,]+ # One or more characters except comma (alternatively insert city name)
) # End of group 1
, # Match a comma
[^(]* # Match any number of characters except an opening parenthesis
\( # Match an opening parenthesis
( # Match and capture (group number 2):
[^()]+ # One or more characters except parentheses (alt. insert city code)
) # End of group 2
\) # Match a closing parenthesis
This assumes that no city name will ever contain a comma (otherwise this regex would only capture the part before the comma), so you'd need to check your data if that's ever possible. I can't think of an example, but that's not saying anything :)
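As a quick illustration of the two groups, this sketch applies the pattern to two entries from the question's list and keeps the captured name and code. The names cityPattern, sampleCities and parsedCities are made up for the example.

```javascript
const cityPattern = /^([^,]+),[^(]*\(([^()]+)\)/;
const sampleCities = ['Anaa, French Polynesia (AAA)', 'Aachen, Germany (AAH)'];

// Group 1 is the city name, group 2 is the code between the parentheses
const parsedCities = sampleCities.map(function (entry) {
  const m = cityPattern.exec(entry);
  return m ? { city: m[1], code: m[2] } : null; // null for entries that do not fit
});

console.log(parsedCities);
```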
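$("#leavingCity").autocomplete({
source: function(req, responseFn) {
// Escape the user's input so it is matched literally
var re = $.ui.autocomplete.escapeRegex(req.term);
// Match the term at the start of the city name, or right after the opening parenthesis of the code.
// No "g" flag: a global regex keeps lastIndex between test() calls and would skip items.
var matcher = new RegExp("^" + re + "|\\(" + re, "i");
var a = $.grep(cityList, function(item) { return matcher.test(item); });
responseFn(a);
} });
Try this. It is based on the regular expression by Tim Pietzcker, with the escaped search term inserted at the city name and city code positions.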
This is the most elegant way I can do it:
var cityList = ["Anaa, French Polynesia (AAA)","Arrabury, Australia (AAB)","Al Arish, Egypt (AAC)","Ad-Dabbah, Sudan (AAD)","Annaba, Algeria (AAE)","Apalachicola, United States (AAF)","Arapoti, Brazil (AAG)","Aachen, Germany (AAH)","Arraias, Brazil (AAI)","Awaradam, Suriname (AAJ)","Aranuka, Kiribati (AAK)","Aalborg, Denmark (AAL)"];
var regex = /([a-z].+?),.+?\(([A-Z]{3,3})\)/gi, match, newList = [];
while (match = regex.exec(cityList)) {
newList.push(match[1]+" - "+match[2]);
}
alert(newList[7]);
// prints Aachen - AAH
If you don't understand how to use parentheses in your regex, I suggest you check out the site I learned from: http://www.regular-expressions.info/
Here I suggest a completely different approach (ECMA-262 standard).
As using the regex requires a linear search anyway, if you can pre-process the data, you can set up an array of city objects:
function City(name, country, code){
this.cityName = name;
this.cityCountry = country;
this.cityCode = code;
}
var cities = [];
cities.push(new City('Anaa', 'French Polynesia', 'AAA'));
// ... push the other cities
And a search function:
function GetCity(cityToSearch, cities){
var res = null;
for(var i=0;i<cities.length;i++){
// compare with ===; a single = would assign instead of compare
if(cities[i].cityName === cityToSearch)
res = cities[i];
}
return res;
}
At run time:
var codeFound = '';
var cityFound = GetCity('Arraias', cities);
if(cityFound != null)
codeFound = cityFound.cityCode;
Remark
In both cases, if you are going to fill the cities array with every city in the world, the city name is not a unique key! For instance, there are half a dozen Springfields in the USA. In that case a better approach is to use a two-field key.
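One way to sketch such a two-field key is a Map keyed on name plus country. This is only an illustration under the assumption that the pair (name, country) is unique enough for your data; cityIndex, cityKey, addCity and findCity are hypothetical names.

```javascript
// Lookup table keyed on "name|country" (lower-cased so the search is case-insensitive)
const cityIndex = new Map();

function cityKey(name, country) {
  return name.toLowerCase() + '|' + country.toLowerCase();
}

function addCity(name, country, code) {
  cityIndex.set(cityKey(name, country), { name: name, country: country, code: code });
}

function findCity(name, country) {
  // Map.get returns undefined for a missing key; normalise that to null
  return cityIndex.get(cityKey(name, country)) || null;
}

addCity('Arraias', 'Brazil', 'AAI');
addCity('Arapoti', 'Brazil', 'AAG');
console.log(findCity('arraias', 'BRAZIL'));
```

A Map lookup also avoids the linear scan over the array on every search.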
I think you want to accomplish this in a few simple steps:
Split each string in your array before and after the first parenthesis
Apply your first regex to the first part of the string. Store the result as a boolean variable, perhaps named matchOne
Apply your second regex to the second part of the string (don't forget to remove the closing parenthesis). Store the result as a boolean variable, perhaps named matchTwo.
Test if either of the two matches succeeded: return ( matchOne || matchTwo );
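The steps above could be sketched roughly as follows. It assumes the code is the only parenthesised part of each entry and that the search term should match at the start of either part; matchesCity is a hypothetical helper name.

```javascript
// Hypothetical helper implementing the steps: split, test both parts, combine.
function matchesCity(entry, term) {
  // Escape regex metacharacters so the term is matched literally
  const re = term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Step 1: split before and after the first opening parenthesis
  const open = entry.indexOf('(');
  const namePart = open === -1 ? entry : entry.slice(0, open);
  // Remove the closing parenthesis from the second part
  const codePart = open === -1 ? '' : entry.slice(open + 1).replace(/\)\s*$/, '');
  // Steps 2 and 3: apply the pattern to each part
  const matchOne = new RegExp('^' + re, 'i').test(namePart);
  const matchTwo = new RegExp('^' + re, 'i').test(codePart);
  // Step 4: either match is enough
  return matchOne || matchTwo;
}

const someCities = ['Arapoti, Brazil (AAG)', 'Aachen, Germany (AAH)'];
console.log(someCities.filter(function (c) { return matchesCity(c, 'aag'); }));
```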
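Use indexOf
It's more efficient and states the intent explicitly; a regex is unnecessary. Call it on each string, since Array.prototype.indexOf only finds elements that are exactly equal to the argument.
const isMatchX = cityList.some(city => city.indexOf('AAB') !== -1);
const isMatchY = cityList.some(city => city.indexOf('Awar') !== -1);
Alternatively, you could do something like this, but it's overkill when you can use indexOf: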
const search = (cityList, re) => {
const strRegPart1 = "¬[^¬]*" + re + "[^¬]*";
const strRegPart2 = "¬[^¬]*\\([^\\)]*" + re + "[^\\)]*\\)($|¬)";
const regSearch = RegExp("(" + strRegPart1 + "|" + strRegPart2 + ")", "gi");
const strCityListMarked = '¬' + cityList.join('¬');
const arrMatch = strCityListMarked.match(regSearch);
return arrMatch && arrMatch[0].substr(1); // with the g flag, match() returns full matches; [0] is the first one
}
text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
regex = /(.*?)\.filter\((.*?)\)/;
matches = text.match(regex);
log(matches);
// matches[1] is '#container a'
// matches[2] is '.top'
I expect to capture
matches[1] is '#container a'
matches[2] is '.top'
matches[3] is '.bottom'
matches[4] is '.middle'
One solution would be to split the string into #container a and rest. Then take rest and execute recursive exec to get item inside ().
Update: I am posting a solution that does work. However I am looking for a better solution. Don't really like the idea of splitting the string and then processing
Here is a solution that works.
matches = [];
var text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
var regex = /(.*?)\.filter\((.*?)\)/;
var match = regex.exec(text);
var firstPart = text.substring(match.index, match[1].length);
var rest = text.substring(firstPart.length, text.length); // matchLength was never defined
matches.push(firstPart);
regex = /\.filter\((.*?)\)/g;
while ((match = regex.exec(rest)) != null) {
matches.push(match[1]);
}
log(matches);
Looking for a better solution.
This will match the single example you posted:
<html>
<body>
<script type="text/javascript">
text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
matches = text.match(/^[^.]*|\.[^.)]*(?=\))/g);
document.write(matches);
</script>
</body>
</html>
which produces:
#container a,.top,.bottom,.middle
EDIT
Here's a short explanation:
^ # match the beginning of the input
[^.]* # match any character other than '.' and repeat it zero or more times
#
| # OR
#
\. # match the character '.'
[^.)]* # match any character other than '.' and ')' and repeat it zero or more times
(?= # start positive look ahead
\) # match the character ')'
) # end positive look ahead
EDIT part II
The regex looks for two types of character sequences:
zero or more characters starting from the start of the string up to the first ., the regex: ^[^.]*
or it matches a character sequence starting with a . followed by zero or more characters other than . and ), \.[^.)]*, but must have a ) ahead of it: (?=\)). This last requirement causes .filter not to match.
You have to iterate, I think.
var head, filters = [];
text.replace(/^([^.]*)(\..*)$/, function(_, h, rem) {
head = h;
rem.replace(/\.filter\(([^)]*)\)/g, function(_, f) {
filters.push(f);
});
});
console.log("head: " + head + " filters: " + filters);
The ability to use functions as the second argument to String.replace is one of my favorite things about Javascript :-)
You need to do several matches repeatedly, starting where the last match ends (see while example at https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/RegExp/exec):
If your regular expression uses the "g" flag, you can use the exec method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property. For example, assume you have this script:
var myRe = /ab*/g;
var str = "abbcdefabh";
var myArray;
while ((myArray = myRe.exec(str)) != null)
{
var msg = "Found " + myArray[0] + ". ";
msg += "Next match starts at " + myRe.lastIndex;
print(msg);
}
This script displays the following text:
Found abb. Next match starts at 3
Found ab. Next match starts at 9
However, this case would be better solved using a custom-built parser. Regular expressions are not an effective solution to this problem, if you ask me.
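A custom-built parser for this input can be quite small. The following is a minimal sketch, assuming the string is a head followed by consecutive .filter(...) calls with no nested parentheses; parseFilterChain is a hypothetical name.

```javascript
// Hypothetical parser: splits "head.filter(a).filter(b)" into the head and its filter arguments.
// Assumption: filter arguments contain no ")" and the calls follow each other directly.
function parseFilterChain(text) {
  const marker = '.filter(';
  const first = text.indexOf(marker);
  if (first === -1) return { head: text, filters: [] };

  const head = text.slice(0, first);
  const filters = [];
  let pos = first;
  // Walk the string one ".filter(...)" call at a time
  while (text.startsWith(marker, pos)) {
    const start = pos + marker.length;
    const end = text.indexOf(')', start);
    if (end === -1) break; // unterminated call: stop parsing
    filters.push(text.slice(start, end));
    pos = end + 1;
  }
  return { head: head, filters: filters };
}

const chain = parseFilterChain('#container a.filter(.top).filter(.bottom).filter(.middle)');
console.log(chain.head, chain.filters);
```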
var text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
var result = text.split('.filter');
console.log(result[0]);
console.log(result[1]);
console.log(result[2]);
console.log(result[3]);
text.split() with regex does the trick.
var text = '#container a.filter(.top).filter(.bottom).filter(.middle)';
var parts = text.split(/(\.[^.()]+)/);
var matches = [parts[0]];
for (var i = 3; i < parts.length; i += 4) {
matches.push(parts[i]);
}
console.log(matches);