Splitting string with javascript using '>' character - javascript

I realize this question has probably been asked many times before, and I have searched all over StackOverflow for a solution, but so far nothing has worked for me.
I want to split a string, but it's not working properly: I get individual characters instead of the items of an array. The string from my CMS uses ">" characters as separators, and I am using a regex to replace the 'greater than' symbol with a comma, which works. I took that approach from "Regex that detects greater than ">" and less than "<" in a string".
However, the array is still not formed correctly, as if the split() function does not work at all:
var myString = "TEST Public Libraries Connect > News Blog > A new item"
var regEx = /<|>/g;
var myNewString = (myString.replace(regEx,","))
alert(myNewString);
myNewString.split(",");
alert(myNewString[0]);
alert(myNewString[1]);
alert(myNewString[2]);
I've put it up in a Fiddle as well. I'm just confused as to why the split won't work properly. Is it because there are spaces in the string?

This should work:
var myNewString = myString.split(">");
https://jsfiddle.net/2j56cva0/3/
In your fiddle, you were splitting myNewString instead of the actual string.

myNewString.split(",");
You need to assign the result of the split to something. It does not just change the string itself into an array.
var parts = myNewString.split(",");
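Putting the two answers together, here is a minimal sketch. It assumes the goal is an array of the three breadcrumb labels. The trim() step is an addition for illustration and was not part of the original fiddle. Indexing the string itself (myNewString[0]) returns single characters, which is why the alerts in the question showed one letter each.

```javascript
// split() returns a new array and leaves the original string untouched,
// so the result has to be stored in a variable.
var myString = "TEST Public Libraries Connect > News Blog > A new item";

// Split directly on ">" and trim the spaces around each piece.
var parts = myString.split(">").map(function (part) {
  return part.trim();
});

console.log(parts[0]); // "TEST Public Libraries Connect"
console.log(parts[1]); // "News Blog"
console.log(parts[2]); // "A new item"
```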

Related

Javascript regex to find a particular string which could be broken across multiple lines

I'm trying to write a regex which gets a particular substring from a string. However, the substring could be broken across multiple lines.
I've tried using the multiline flag, like this:
"foo\nbar".match(/foobar/m)
But that returns null.
I've also seen a number of posts suggesting I use [\S\s]. However, as far as I can tell, this only works if you know where the break line will be, like this:
'foo\nbar'.match(/foo[\S\s]bar/m)
Is there a way to find all instances of foobar in a string when the line break could be anywhere in the string?
Remove all line breaks from the subject string before comparing it with your pattern.
See this simple demo:
const arr = ["foo\nbar", "\nfoobar", "fo\nobar", "foobar\n", "foobar"];
const val = 'foobar';
arr.forEach(function(el) {
console.log(el.replace(/\n/g, '') === val) // the g flag removes every line break, not just the first
});
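To find every instance inside a longer text rather than compare whole strings, the same idea can be sketched as a small helper. The function name countIgnoringLineBreaks is hypothetical, and the sketch assumes a non-empty search term and that only \n or \r\n line breaks need to be ignored.

```javascript
// Hypothetical helper: count occurrences of `needle` in `text`, ignoring line breaks.
function countIgnoringLineBreaks(text, needle) {
  // Remove every line break first; the g flag makes replace() global.
  var flattened = text.replace(/\r?\n/g, "");
  // split() with a string argument searches literally, so no regex escaping is needed.
  return flattened.split(needle).length - 1;
}

console.log(countIgnoringLineBreaks("fo\nobar and foo\nbar\nand foobar", "foobar")); // 3
```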

why isn't this javascript regex split function working?

I'm trying to split a string by either three or more pound signs or three or more spaces.
I'm using a function that looks like this:
var produktDaten = dataMatch[0].replace(/\x03/g, '').trim().split('/[#\s]/{3,}');
console.log(produktDaten + ' is the data');
I need to clean the data up a bit, hence the replace and trim.
The output I'm getting looks like this:
##########################################################################MA-KF6###Beckhoff###EL1808 BECK.EL1808###MA-KF7###Beckhoff###EL1808 BECK.EL1808###MA-KF12###Beckhoff###EL1808 BECK.EL1808###MA-KF13###Beckhoff###EL1808 BECK.EL1808###MA-KF14###Beckhoff###EL1808 BECK.EL1808###MA-KF15###Beckhoff###EL1808 BECK.EL1808###MA-KF16###Beckhoff###EL1808 BECK.EL1808###MA-KF19###Beckhoff###EL1808 BECK.EL1808 is the data
How is this possible? Irrespective of the input, shouldn't the pound and multiple spaces be deleted by the split?
You passed a string to split(), and the input does not contain that literal string, so nothing gets split. I think you wanted to use the regex
/[#\s]{3,}/
like here:
var produktDaten = "##########################################################################MA-KF6###Beckhoff###EL1808 BECK.EL1808###MA-KF7###Beckhoff###EL1808 BECK.EL1808###MA-KF12###Beckhoff###EL1808 BECK.EL1808###MA-KF13###Beckhoff###EL1808 BECK.EL1808###MA-KF14###Beckhoff###EL1808 BECK.EL1808###MA-KF15###Beckhoff###EL1808 BECK.EL1808###MA-KF16###Beckhoff###EL1808 BECK.EL1808###MA-KF19###Beckhoff###EL1808 BECK.EL1808";
console.log(produktDaten.replace(/\x03/g, '').trim().split(/[#\s]{3,}/));
This /[#\s]{3,}/ regex matches 3 or more chars that are either # or whitespace.
NOTE: just removing the ' quotes around it won't fix the issue, because the unescaped / would end the regex literal before the {3,} quantifier. You need to quantify the character class itself, [#\s].
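The difference between the two calls can be seen on a shortened input. This is only a sketch: the raw value below is an abbreviated stand-in for the real data, and the filter(Boolean) step is an addition that drops the empty first element produced by the leading run of # characters.

```javascript
// Abbreviated stand-in for the data in the question.
var raw = "######MA-KF6###Beckhoff###EL1808 BECK.EL1808###MA-KF7";

// A string argument is searched for literally. It never occurs in the input,
// so split() returns a one-element array holding the whole string.
var asString = raw.split('/[#\s]/{3,}');

// A regex literal splits on runs of three or more "#" or whitespace characters.
// The single space inside "EL1808 BECK.EL1808" is too short to match.
var asRegex = raw.split(/[#\s]{3,}/).filter(Boolean);

console.log(asString.length); // 1
console.log(asRegex); // ["MA-KF6", "Beckhoff", "EL1808 BECK.EL1808", "MA-KF7"]
```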

Regex to capture everything but consecutive newlines

What is the best way to capture everything except when faced with two or more new lines?
ex:
name1
address1
zipcode
name2
address2
zipcode
name3
address3
zipcode
One regex I considered was /[^\n\n]*\s*/g. But this stops when it is faced with a single \n character.
Another way I considered was /((?:.*(?=\n\n)))\s*/g. But this seems to only capture the last line ignoring the previous lines.
What is the best way to handle a situation like this?
UPDATE
You can consider replacing the variable length separator with some known fixed length string not appearing in your processed text and then split. For instance:
> var s = "Hi\n\n\nBye\nCiao";
> var x = s.replace(/\n{2,}/g, "#");
> x.split("#");
["Hi", "Bye
Ciao"]
I think it is an elegant solution. You could also use the following somewhat contrived regex
> s.match(/((?!\n{2,})[\s\S])+/g);
["Hi", "
Bye
Ciao"]
and then process the resulting array by applying the trim() string method to its members in order to get rid of any \n at the beginning/end of every string in the array.
((.+)\n?)*
(You probably want to make the groups non-capturing; I left them as is for readability.)
The inner part (.+)\n? means "non-empty line" (at least one non-newline character as . does not match newlines unless the appropriate flag is set, followed by an optional newline)
Then, that is repeated an arbitrary number of times (matching an entire block of non-blank lines).
However, depending on what you are doing, regexp probably is not the answer you are looking for. Are you sure just splitting the string by \n\n won't do what you want?
Do you have to use regex? The solution is simple without it.
var data = 'name1...';
var matches = data.split('\n\n');
To access an individual sub section split it by \n again.
//the first section's name
var name = matches[0].split('\n')[0];
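If the records can be separated by more than one blank line, split() also accepts a regex, which avoids the placeholder character. A minimal sketch, assuming the records in the question are separated by one or more blank lines (the sample data below is made up to match that assumption):

```javascript
// Assumed input: three records separated by one or more blank lines.
var data = "name1\naddress1\nzipcode\n\nname2\naddress2\nzipcode\n\n\nname3\naddress3\nzipcode";

// Split on runs of two or more newlines, then split each block into its lines.
var records = data.split(/\n{2,}/).map(function (block) {
  return block.split("\n");
});

console.log(records.length); // 3
console.log(records[1][0]);  // "name2"
console.log(records[2][2]);  // "zipcode"
```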

javascript regex to extract the first character after the last specified character

I am trying to extract the first character after the last underscore in a string. The string may contain an unknown number of '_' characters, but there will always be at least one, because I add it in another step of the process.
What I tried is this. I also tried the regex by itself to extract from the name, but my result was empty.
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var string = match(/[^_]*$/)[1]
string.charAt(0)
So the final desired result is 'D'. If the RegEx can only get me what is behind the last '_' that is fine because I know I can use the charAt like currently shown. However, if the regex can do the whole thing, even better.
If you know there will always be at least one underscore you can do this:
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var firstCharAfterUnderscore = s.charAt(s.lastIndexOf("_") + 1);
// OR, with regex
var firstCharAfterUnderscore = s.match(/_([^_])[^_]*$/)[1]
With the regex, you can extract just the one letter by using parentheses to capture that part of the match. But I think the .lastIndexOf() version is easier to read.
Either way, if there's a possibility of no underscores in the input, you'd need to add some additional logic.
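One way that additional logic might look is sketched below. The helper name and the choice to return null are assumptions for illustration, not something the question specified.

```javascript
// Hypothetical helper: returns the character right after the last "_",
// or null when there is no underscore or nothing follows it.
function firstCharAfterLastUnderscore(s) {
  var index = s.lastIndexOf("_");
  // lastIndexOf() returns -1 when the string contains no underscore.
  if (index === -1 || index === s.length - 1) {
    return null;
  }
  return s.charAt(index + 1);
}

console.log(firstCharAfterLastUnderscore("XXXX-XXXX_XX_DigitalF.pdf")); // "D"
console.log(firstCharAfterLastUnderscore("no-underscore.pdf"));         // null
```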

how to extract this kind of data and put them into a nice array?

I got a string like this one:
var tweet ="#fadil good:))RT #finnyajja: what a nice day RT #fadielfirsta: how are you? #finnyajja yay";
What kind of code would extract every word that starts with the # character and also remove any special characters at the end of those words? The result should be an array like this:
(#fadil, #finnyajja, #fadielfirsta, #finnyajja);
I have tried the following code:
var users = $.grep(tweet.split(" "), function(a){return /^#/.test(a)});
It returns this:
(#fadil, #finnyajja:, #fadielfirsta:, #finnyajja)
There is still a colon ':' at the end of some words. What should I do? Any solutions? Thanks.
Here is code that is more straightforward than trying to use split:
var tweet_text ="#fadil good:))RT #finnyajja: what a nice day RT #fadielfirsta: how are you? #finnyajja yay";
var result = tweet_text.match(/#\w+/g);
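The easiest way without changing your current code too much would be to remove all colons prior to calling split. Note that replace() with a string pattern only replaces the first occurrence, so a regex with the g flag is needed:
var users = $.grep(tweet_text.replace(/:/g,"").split(" "), function(a){return /^#/.test(a)});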
You could also write a regex to do all the work for you using match. Something like this:
var regex = /#[a-z0-9]+/gi;
var matches = tweet.match(regex);
This assumes that you only want letters and numbers; if certain other characters are allowed, this regex will need to be modified.
http://jsfiddle.net/YHM87/
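One detail worth guarding against with the match() approach: with the g flag, match() returns null rather than an empty array when nothing matches. A minimal sketch without jQuery, where the helper name extractTags is hypothetical:

```javascript
// Hypothetical helper: return every "#word" in the text as an array.
function extractTags(text) {
  // match() with the g flag returns all matches, or null when there are none,
  // so fall back to an empty array.
  return text.match(/#\w+/g) || [];
}

var tweet = "#fadil good:))RT #finnyajja: what a nice day RT #fadielfirsta: how are you? #finnyajja yay";

console.log(extractTags(tweet)); // ["#fadil", "#finnyajja", "#fadielfirsta", "#finnyajja"]
console.log(extractTags("no tags here")); // []
```

Because \w matches only letters, digits and the underscore, the trailing colons are left out of each match.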
