I have this string:
foo = "moon is.white #small"
I want to convert it to an array using these rules: split at whitespace and before each "." and "#", and prefix the piece with "&" when there is no whitespace before the "." or "#". I got an answer in this thread: How to Split string with multiple rules in javascript, and the result is great:
fooArr[0] = ['moon','is','&.white','#small']
My question is: how do I generate other variations of the string, so that I get an array of variations that follow these rules?
A. Split at different positions: sometimes a space splits, and sometimes two or more words are considered one element, in ALL variations. Note that if I don't split before a "." or "#", then I don't add "&".
fooArr[1] = ['moon is','&.white','#small']
fooArr[2] = ['moon is.white','#small']
fooArr[3] = ['moon','is.white','#small']
etc...
B. If there is a "." or "#", then I want every ordering of those parts: ".all#is.good" can become [.all#is.good], [.all.good#is], [.good.all#is], etc. I want this combined with the variations from the first rule, such as [.all,&.good,&#is] and [.all.good,&#is], so that I have ALL COMBINATIONS OF BOTH A AND B.
Eventually I need an array combining all the combinations of A and B, which should give me a lot of variations: fooArr[0]..fooArr[X].
Where do I start?
Use a series of regular expressions to find every place where you want to split the string, and replace each one with " &". Then split on " ".
Or use a loop to move through the string and create the replacements as above, then split.
Or use a loop to move through the string and build the array directly.
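As a rough sketch of the first suggestion and of rule A only (rule B's reordering is left out), the basic split can be done with a lookahead replace, and the variations can be enumerated by deciding at each boundary whether to split or to join. The helper names splitWithRules and splitVariations are made up for illustration, and the exact tokenizing rules are an assumption based on the examples in the question:

```javascript
// Basic rule: insert " &" before a "." or "#" that has no whitespace
// in front of it, then split on whitespace.
function splitWithRules(str) {
  return str.replace(/(\S)(?=[.#])/g, '$1 &').trim().split(/\s+/);
}

// Rule A: at each boundary either split (adding "&" when the boundary had
// no whitespace) or join the pieces back with their original separator.
function splitVariations(str) {
  const pieces = [];
  let sep = '';
  const tokens = str.trim().match(/\s+|[.#][^\s.#]*|[^\s.#]+/g) || [];
  for (const tok of tokens) {
    if (/^\s/.test(tok)) { sep = tok; continue; } // remember the whitespace
    pieces.push({ text: tok, sep: sep });
    sep = '';
  }
  const result = [];
  if (pieces.length === 0) return result;
  const boundaries = pieces.length - 1;
  // Each bit of mask decides one boundary (fine for up to ~30 boundaries).
  for (let mask = 0; mask < (1 << boundaries); mask++) {
    const parts = [pieces[0].text];
    for (let i = 1; i < pieces.length; i++) {
      const p = pieces[i];
      if (mask & (1 << (i - 1))) {
        parts.push((p.sep === '' ? '&' : '') + p.text); // split here
      } else {
        parts[parts.length - 1] += p.sep + p.text;      // keep together
      }
    }
    result.push(parts);
  }
  return result;
}
```

For a string with n pieces this produces 2^(n-1) arrays, so the count grows quickly; combining it with the orderings from rule B would multiply that further.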
I am trying to parse txt files with JS and regex, and my problem is as follows:
I have multiple txt files, and inside each one I need to search for an id made up of 6 characters (digits and letters).
This is the string inside one of those files:
**IFCPROPERTYSINGLEVALUE('codice sito',$,IFCTEXT('I013FR'),$);**
I need to extract the I013FR only, and so far the closest js-regex I wrote is:
(codice sito',\$,IFCTEXT\('[a-zA-Z\d]{6})
using that, I get in return:
codice sito',$,IFCTEXT('I372TO
Now I need to "add something" at the end of the regex in order to take only the last 6 characters of the match.
Is that possible? Am I on the right track, or is there a better way to do this?
To extract the sequence of symbols, you need to put it in parentheses. This construct is called a "capturing group".
/codice sito',\$,IFCTEXT\('([a-zA-Z\d]{6})/g
Then you can get your id using the RegExp.exec() method.
const str = "**IFCPROPERTYSINGLEVALUE('codice sito',$,IFCTEXT('I013FR'),$);**";
const regex = /codice sito',\$,IFCTEXT\('([a-zA-Z\d]{6})/g;
const id = regex.exec(str)[1];
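One caveat: exec() returns null when the pattern does not match, so indexing [1] directly would throw on a file that lacks the id. A minimal guarded sketch follows; extractSiteId is a hypothetical helper name, and the g flag is dropped on the assumption that only the first match per file is needed:

```javascript
// Returns the 6-character id, or null when the text contains no match.
function extractSiteId(text) {
  const match = /codice sito',\$,IFCTEXT\('([a-zA-Z\d]{6})/.exec(text);
  return match ? match[1] : null;
}

const line = "**IFCPROPERTYSINGLEVALUE('codice sito',$,IFCTEXT('I013FR'),$);**";
const siteId = extractSiteId(line);
console.log(siteId);
```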
I'm trying to split a string by either three or more pound signs or three or more spaces.
I'm using a function that looks like this:
var produktDaten = dataMatch[0].replace(/\x03/g, '').trim().split('/[#\s]/{3,}');
console.log(produktDaten + ' is the data');
I need to clean the data up a bit, hence the replace and trim.
The output I'm getting looks like this:
##########################################################################MA-KF6###Beckhoff###EL1808 BECK.EL1808###MA-KF7###Beckhoff###EL1808 BECK.EL1808###MA-KF12###Beckhoff###EL1808 BECK.EL1808###MA-KF13###Beckhoff###EL1808 BECK.EL1808###MA-KF14###Beckhoff###EL1808 BECK.EL1808###MA-KF15###Beckhoff###EL1808 BECK.EL1808###MA-KF16###Beckhoff###EL1808 BECK.EL1808###MA-KF19###Beckhoff###EL1808 BECK.EL1808 is the data
How is this possible? Irrespective of the input, shouldn't the pound and multiple spaces be deleted by the split?
You passed a string to split, and the input does not contain that literal string, so nothing was split. I think you wanted to use
/[#\s]{3,}/
like here:
var produktDaten = "##########################################################################MA-KF6###Beckhoff###EL1808 BECK.EL1808###MA-KF7###Beckhoff###EL1808 BECK.EL1808###MA-KF12###Beckhoff###EL1808 BECK.EL1808###MA-KF13###Beckhoff###EL1808 BECK.EL1808###MA-KF14###Beckhoff###EL1808 BECK.EL1808###MA-KF15###Beckhoff###EL1808 BECK.EL1808###MA-KF16###Beckhoff###EL1808 BECK.EL1808###MA-KF19###Beckhoff###EL1808 BECK.EL1808";
console.log(produktDaten.replace(/\x03/g, '').trim().split(/[#\s]{3,}/));
This /[#\s]{3,}/ regex matches 3 or more chars that are either # or whitespace.
NOTE: just removing the ' around it won't fix the issue, because the unescaped / before {3,} closes the regex literal and leaves the quantifier outside the pattern. You actually need to quantify the character class, [#\s].
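To make the difference concrete, here is a small comparison on a shortened, made-up sample of the data (the variable names are only for illustration). The string separator is searched for literally and never found, while the regex separator splits as intended:

```javascript
const sample = "###MA-KF6###Beckhoff###EL1808 BECK.EL1808";

// A string separator is matched literally, so the input comes back whole.
const withString = sample.split('/[#\\s]/{3,}');

// A regex separator splits on runs of 3 or more "#" or whitespace characters.
const withRegex = sample.split(/[#\s]{3,}/);

// The input starts with a separator, so the first element is an empty
// string; filter(Boolean) drops empty entries.
const cleaned = withRegex.filter(Boolean);

console.log(withString.length, cleaned);
```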
Say I have a string like
"item:(one|two|three), item2:(x|y)"
Is there a single regex that could "factor" it into
"item:one, item:two, item:three, item2:x, item2:y"
Or must I resort to splitting and looping?
If I must split it up then how do I even turn
"item:(one|two|three)"
into
"item:one, item:two, item:three"
if the amount of things between the parentheses is variable? Are regexes useless for such a problem?
You could do it with a callback function:
str = str.replace(/(\w+):\(([^)]*)\)/gi, function (match, item, values) {
  return item + ':' + values.split("|").join(', ' + item + ':');
});
For every item, the first set of parentheses in the regex captures the item's name (i.e. item), and the second set of (unescaped) parentheses captures the string of all values (i.e. one|two|three). The values are then split at | and joined back together with ", itemname:" between them, and the item name is prepended to the result.
This is probably the easiest way: use a regex to find your data, then split and join to build the new string. It cannot be made much simpler, because you cannot capture an arbitrary number of consecutive values (one|two|three) in separate capturing groups. If you tried to capture them individually, you would only get the last one.
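Both points can be checked with a short sketch. The first part shows that a capturing group inside a repetition keeps only its final value; the second wraps the callback approach in a function (factor is a hypothetical name) and runs it on the string from the question:

```javascript
// A repeated capturing group remembers only the last repetition.
const m = /^(?:(\w+)\|?)+$/.exec("one|two|three");
const lastOnly = m[1];

// The callback approach: expand each "name:(a|b|c)" into "name:a, name:b, name:c".
function factor(str) {
  return str.replace(/(\w+):\(([^)]*)\)/g, (match, item, values) =>
    values.split('|').map(v => item + ':' + v).join(', '));
}

const factored = factor("item:(one|two|three), item2:(x|y)");
console.log(lastOnly, factored);
```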
I am trying to extract the first character after the last underscore in a string. The number of '_' characters in the string is unknown, but in my case there will always be at least one, because I added it in another step of the process.
What I tried is this. I also tried the regex by itself to extract from the name, but my result was empty.
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var string = match(/[^_]*$/)[1]
string.charAt(0)
So the final desired result is 'D'. If the RegEx can only get me what is behind the last '_' that is fine because I know I can use the charAt like currently shown. However, if the regex can do the whole thing, even better.
If you know there will always be at least one underscore you can do this:
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var firstCharAfterUnderscore = s.charAt(s.lastIndexOf("_") + 1);
// OR, with regex
var firstCharAfterUnderscore = s.match(/_([^_])[^_]*$/)[1]
With the regex, you can extract just the one letter by using parentheses to capture that part of the match. But I think the .lastIndexOf() version is easier to read.
Either way if there's a possibility of no underscores in the input you'd need to add some additional logic.
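A minimal sketch of that additional logic, assuming null is an acceptable "not found" value (the function name is made up for illustration). Without a guard, lastIndexOf() returns -1, so charAt(-1 + 1) silently returns the first character of the string, while the regex version throws because match() returns null:

```javascript
// Returns the character right after the last "_", or null when there is
// no underscore or nothing follows it.
function firstCharAfterLastUnderscore(s) {
  const idx = s.lastIndexOf('_');
  if (idx === -1 || idx === s.length - 1) return null;
  return s.charAt(idx + 1);
}

console.log(firstCharAfterLastUnderscore("XXXX-XXXX_XX_DigitalF.pdf"));
```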
Say I have a load of strings that follow the same sort of structure as this:
Outcome 1: - Be able to create 2D animations for use as part of an interactive media product.
I want to get everything before the '-' and put it into one variable, and everything after the '-' and put it into another variable. So output is as so:
$1 = "Outcome 1";
$2 = "Be able to create 2D animations for use as part of an interactive media product.";
Thanks
Also, does anyone know how I would then remove the title attribute from the elements matched by the following selector?
$$('span[title]').each(function(element) {
});
You can split a string using regular expressions. In your case, you want to:
Get rid of the colon (:)
Get rid of the extra space surrounding the dash (-)
So:
var tokens = s.split(/:\s*-\s*/);
// tokens[0] will be the first part
// tokens[1] the second
var string = "Outcome 1: - Be able to create 2D animations for use as part of an interactive media product."
var strArr = string.split("-");
RESULTS:
strArr[0] == "Outcome 1: "
strArr[1] == " Be able to create 2D animations for use as part of an interactive media product."
Fiddle: http://jsfiddle.net/maniator/VqcPJ/
This regex will remove the trailing colon on the first element and any whitespace surrounding the dash as well:
var parts = str.split(/\s*:\s*-\s*/);
parts; // => ['Outcome 1', 'Be able to create...']
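If the description itself may contain a hyphen (an assumption; the sample line does not), a plain split("-") would cut it into more than two parts. One way around that is to match only the first separator and capture both sides. In this sketch splitOutcome is a hypothetical helper name:

```javascript
// Captures the text before the first ": -" separator and everything after it.
// Returns null when the line does not contain the separator.
function splitOutcome(line) {
  const m = /^(.*?)\s*:\s*-\s*(.*)$/.exec(line);
  return m ? [m[1], m[2]] : null;
}

const outcome = splitOutcome("Outcome 1: - Be able to create 2D animations for use as part of an interactive media product.");
console.log(outcome);
```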