How do I get a list of strings ending in a newline or ending in the end of the string in javascript regex? - javascript

I'm pretty frustrated with regex right now. Given:
var text = "This is a sentence.\nThis is another sentence\n\nThis is the last sentence!"
I want regex to return to me:
{"This is a sentence.\n", "This is another sentence\n\n", "This is the last sentence!"}
I think I should use
var matches = text.match(/.+[\n+\Z]/)
but \Z doesn't seem to work. Does JavaScript have an end-of-string matcher?

You can use the following regex.
var matches = text.match(/.+\n*/g);
Or you could match a newline sequence "one or more" times or the end of the string.
var matches = text.match(/.+(?:\n+|$)/g);
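On the \Z part of the question: JavaScript regular expressions have no \Z anchor (without the u flag it is simply read as a literal Z). Without the m flag, $ matches only at the very end of the input, which is what the second pattern relies on. A quick check of both patterns against the sample text from the question, as a minimal sketch:

```javascript
const text = "This is a sentence.\nThis is another sentence\n\nThis is the last sentence!";

// One or more non-newline characters, followed by any trailing newlines.
const greedy = text.match(/.+\n*/g);

// Same idea, but each match must end in newlines or at the end of the input.
// Without the m flag, $ matches only at the end of the whole string.
const anchored = text.match(/.+(?:\n+|$)/g);

console.log(JSON.stringify(greedy));
console.log(JSON.stringify(anchored));
// Both print:
// ["This is a sentence.\n","This is another sentence\n\n","This is the last sentence!"]
```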

Try this one: /(.+\n*)/g
See it here: http://regex101.com/r/wK8oX3/1

If you wanted an array and didn't want to keep the "\n" around you could do...
var strings = text.split("\n");
which would yield
["This is a sentence.", "This is another sentence", "", "This is the last sentence!"]
If you wanted to get rid of that empty string, chain a filter onto the split...
var strings = text.split("\n").filter(function(s){ return s !== ""; });
This may not be what you want, though, and it is probably not as efficient as the regex options already proposed.
Edit: as torazaburo pointed out, using Boolean as the filter function is cleaner than a callback.
var strings = text.split("\n").filter(Boolean);
Edit again: I keep getting one-upped; splitting on the /\n+/ expression is even neater.
var strings = text.split(/\n+/);
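One difference between the last two variants is worth checking: if the text itself ends in a newline, split(/\n+/) leaves an empty string at the end of the array, while filter(Boolean) removes it. A small sketch using the question's sample text (the trailing-newline variant is an added assumption for illustration):

```javascript
const text = "This is a sentence.\nThis is another sentence\n\nThis is the last sentence!";

// Splitting on runs of newlines gives three clean entries.
const byRun = text.split(/\n+/);

// Hypothetical variant of the input that ends with a newline.
const withTrailing = text + "\n";

// The regex split keeps an empty string after the final newline...
const trailing = withTrailing.split(/\n+/);

// ...while filtering with Boolean drops every empty string.
const filtered = withTrailing.split("\n").filter(Boolean);

console.log(byRun.length);    // 3
console.log(trailing.length); // 4 (last entry is "")
console.log(filtered.length); // 3
```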

To get an array of sentences:
var matches = text.match(/.+?(?:\n+|$)/g);

You can try this if you do not need to keep the trailing newlines:
text.match(/.+/g)

Why is JavaScript's split() method not splitting on ":"?

So to start off, a bit of context. I am pulling data from the following url: "https://webster.cs.washington.edu/pokedex/pokedex.php?pokedex=all" using a GET method. The data returned is a series of Pokemon names and image names in the following format.
Name1:name1.png
Name2:name2.png
...
The list is 151 items long. When I check it with typeof, "string" is returned, so I am fairly certain it is a string I am dealing with here. What I would like to do is split the string on the delimiters "\n" and ":".
What I would like:
Name1,name1.png,Name2,name2.png...
After some experimentation with Regex, I found that the Regex to do this was "\n|:". Using this I wrote the following line to split the String apart. I tested this Regex on https://regex101.com and it seems to work properly there.
var splitData = data.split("\n|:");
("data" is the String I receive from the url.)
But instead of splitting the String and placing the substrings into an array it doesn't do anything. (At least as far as I can see.) As such my next idea was to try replacing the characters that were giving me trouble with another character and then splitting on that new character.
data = data.replace("\n", " ");
data = data.replace("/:/g", " ");
var splitData = data.split(" ");
The first line that replaces new line characters does work, but the second line to replace the ":" does not seem to do anything. So I end up with an array that is filled with Strings that look like this.
Name1:name1.png
I can split these strings by calling their index and then splitting the substring stored within, which only confuses me more.
data = data.replace("\n", " ");
var splitData = data.split(" ");
alert(splitData[0].split(":")[1]);
The above code returns "name1.png".
Am I missing something regarding the split() method? Is my Regex wrong? Is there a better way to achieve what I am attempting to do?
Right now you are splitting on the string literal "\n|:", which is searched for as exact text. To split on a regex, pass a regex literal instead: data.split(/[:\n]/).
The MDN page shows two ways to build a Regex:
var regex1 = /\w+/;
var regex2 = new RegExp('\\w+');
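To make the difference concrete, here is a small sketch comparing string arguments with regex arguments for both split and replace. The three-line sample data is made up to mirror the format in the question:

```javascript
// Hypothetical sample in the same "Name:file" format as the question.
const data = "Name1:name1.png\nName2:name2.png\nName3:name3.png";

// A string separator is matched as exact text, so nothing is split here.
const asString = data.split("\n|:");

// A regex literal splits on either a newline or a colon.
const asRegex = data.split(/\n|:/);

// "/:/g" in quotes is just a four-character string that never occurs in data.
const unchanged = data.replace("/:/g", " ");

// A string pattern in replace() only replaces the first occurrence.
const firstOnly = data.replace("\n", " ");

// A regex with the g flag replaces every occurrence.
const allColons = data.replace(/:/g, " ");

console.log(asString.length); // 1
console.log(asRegex.length);  // 6
console.log(unchanged === data); // true
console.log(firstOnly.includes("\n")); // true
console.log(allColons.includes(":"));  // false
```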
The following test script worked for me. I decided to use the regex in the split instead of trying to replace tokens in the string.
let testResponse = `Abra:abra.png
Aerodactyl:aerodactyl.png`;
let dataArray = testResponse.split(/\n|:/);
let commaSeparated = dataArray.join(',');
console.log(commaSeparated);
So you can simply use a regex by leaving out the quotes altogether.
You can look at the documentation here for regular expressions. They give the following examples:
var re = /ab+c/;
var re = new RegExp('ab+c');
See below for your expected output:
var data = `Name1:name1.png
Name2:name2.png`;
var splitData = data.split(/[\n:]/);
console.log(splitData);
//Join them by a comma to get all results
console.log(splitData.join(','));
//For some nice key value pairs, you can reduce the array into an object:
var kvps = data.split("\n").reduce((res, line) => {
  var split = line.split(':');
  return {
    ...res,
    [split[0]]: split[1]
  };
}, {});
console.log(kvps);
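As a possible alternative to the reduce above, Object.fromEntries (ES2019) can build the same kind of object from an array of [key, value] pairs. This is only a sketch on the same two-line sample and assumes each line contains exactly one colon:

```javascript
const data = "Name1:name1.png\nName2:name2.png";

// Each line becomes a [name, file] pair; fromEntries turns the pairs into an object.
const pairs = Object.fromEntries(
  data.split("\n").map((line) => line.split(":"))
);

console.log(pairs.Name1); // name1.png
console.log(pairs.Name2); // name2.png
```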
I tried this and it works well.
str.split(/[:\n]/)

Javascript string after an expression

The expression str.substring(0, str.indexOf("begin:")).trim() gets me the string before begin: but if I want the string that comes after begin:, what do I do?
Or use the same substring function, but pass the index just past "begin:" as the first parameter (the indexOf result plus 6, the length of "begin:") and nothing for the second:
str.substring(str.indexOf("begin:")+6).trim()
As the docs for substring state: "If indexEnd is omitted, substring() extracts characters to the end of the string."
Just split the string at begin: and take the next portion:
var strPortion=str.split("begin:");
var desiredString=strPortion[1].trim();
So if the string is "Now we begin: it is better to try splitting the string",
the above code will give "it is better to try splitting the string".
Not completely sure, but try:
str.substring(str.indexOf("begin:"),str.length).trim()
I'm not sure if you want 'begin' included or discarded.
var needle = "begin:";
var haystack = "begin: This is a story all about how my life got flipped turned upside down";
var phrase = haystack.substring(haystack.indexOf(needle) + needle.length).trim();
That will give you what you are looking for. I am only posting this answer because of others' comments about the ambiguity and modularity of the code.
Check it out and change the needle if you'd like. Code example shows how to get before the needle as well. https://jsfiddle.net/jL6zxcu3/1/
var needle = "begin:"
str.substring(str.indexOf(needle)+ needle.length, str.length).trim()
Or
var needle = "begin:"
str.substring(str.indexOf(needle)+ needle.length).trim()
The second parameter is not needed if you want everything up to the end of the string.
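None of the snippets above say what happens when the marker is missing. In that case indexOf returns -1, so adding needle.length silently yields text from the wrong position, and the split variant throws because strPortion[1] is undefined. A minimal sketch of a guarded helper follows; the name textAfter and the choice to return null are assumptions for illustration, not part of the answers:

```javascript
// Hypothetical helper: returns the trimmed text after the first occurrence
// of needle, or null when needle does not occur in str.
function textAfter(str, needle) {
  const index = str.indexOf(needle);
  if (index === -1) {
    return null;
  }
  // With the second argument omitted, substring() runs to the end of the string.
  return str.substring(index + needle.length).trim();
}

console.log(textAfter("Now we begin: it is better to try splitting the string", "begin:"));
// it is better to try splitting the string
console.log(textAfter("no marker here", "begin:"));
// null
```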

JS RegExp matched entries count

Is it possible to get the number of matches of a RegExp in JS?
Let's take the simplest example:
var pattern =/\bstring\b/g;
var str = "My the best string comes here. Do you want another one? That will be a string too!";
So how do I get the count? If I try the standard exec() method:
pattern.exec(str);
...it gives me an array containing a single matched entry:
["string"]
The length of this array is 1, but in reality there are 2 places where a match was found.
You can achieve this using .match():
var pattern =/\bstring\b/g;
var str = "My the best string comes here. Do you want another one? That will be a string too!";
alert(str.match(pattern).length);
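One caveat with this approach: match() returns null when nothing matches, so reading .length directly would throw. A sketch of a null-safe count, plus the exec() loop that explains why a single exec() call reports only one match (with the g flag, each call resumes from the regex's lastIndex):

```javascript
const str = "My the best string comes here. Do you want another one? That will be a string too!";

// Fall back to an empty array so .length is always safe to read.
const count = (str.match(/\bstring\b/g) || []).length;
const none = ("no hits here".match(/\bstring\b/g) || []).length;

// exec() returns one match per call; with the g flag it continues from
// lastIndex, so it has to be called in a loop to visit every match.
const pattern = /\bstring\b/g;
let viaExec = 0;
while (pattern.exec(str) !== null) {
  viaExec += 1;
}

console.log(count);   // 2
console.log(none);    // 0
console.log(viaExec); // 2
```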

how to extract this kind of data and put them into a nice array?

I got a string like this one:
var tweet ="#fadil good:))RT #finnyajja: what a nice day RT #fadielfirsta: how are you? #finnyajja yay";
What kind of code would extract every word that starts with the # character and also remove any special characters at the end of those words? The result should be an array like this:
(#fadil, #finnyajja, #fadielfirsta, #finnyajja);
I have tried the following code:
var users = $.grep(tweet.split(" "), function(a){return /^#/.test(a)});
It returns this:
(#fadil, #finnyajja:, #fadielfirsta:, #finnyajja)
There is still a colon ':' character at the end of some words. What should I do? Any solutions? Thanks.
Here is code that is more straightforward than trying to use split:
var tweet_text ="#fadil good:))RT #finnyajja: what a nice day RT #fadielfirsta: how are you? #finnyajja yay";
var result = tweet_text.match(/#\w+/g);
The easiest way without changing your current code too much would be to just remove all colons prior to calling split. Note the g flag: a plain string pattern such as ":" would remove only the first colon.
var users = $.grep(tweet.replace(/:/g, "").split(" "), function(a){return /^#/.test(a)});
You could also write a regex to do all the work for you using match. Something like this:
var regex = /#[a-z0-9]+/gi;
var matches = tweet.match(regex);
This assumes that you only want letters and numbers; if certain other characters are allowed, this regex will need to be modified.
http://jsfiddle.net/YHM87/
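The two patterns suggested above differ only in whether an underscore counts as part of a tag: \w includes it, [a-z0-9] does not. A short sketch on the tweet from the question, with a made-up second sample to show the difference:

```javascript
const tweet = "#fadil good:))RT #finnyajja: what a nice day RT #fadielfirsta: how are you? #finnyajja yay";

// match() returns null when nothing matches, so fall back to an empty array.
const tags = tweet.match(/#\w+/g) || [];
console.log(tags);
// [ '#fadil', '#finnyajja', '#fadielfirsta', '#finnyajja' ]

// Hypothetical sample containing an underscore.
const sample = "#foo_bar!";
console.log(sample.match(/#\w+/g));        // [ '#foo_bar' ]
console.log(sample.match(/#[a-z0-9]+/gi)); // [ '#foo' ]
```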

How to detect a series of characters in a string?

For example, I have a string:
"This is the ### example"
I would like to extract the ### from the above string.
The number of Hash keys may vary, so I would like to find out and replace the ### pattern with, say, 001 for example.
Can anybody help?
You can also do a replace. I am familiar with the C# version of this:
string stringValue = "This is the ### example";
stringValue = stringValue.Replace("###", "");
This would remove ### completely from the above string. Again, you would have to know the exact string.
In JavaScript, it's similar - .replace (with a lowercase r) is used. So:
var stringValue = "This is the ### example";
var replacedValue = stringValue.replace('###', '');
You'll want to investigate either "Regular Expressions" for this, or, if you know the precise position and length of the characters you are interested in, you can simply use String's .substring method.
If you want to capture multiple # characters, then you'll need regular expressions:
var myString = "This is #### the example";
var result = myString.replace(/#+/g, '');
If you want to remove the space too, you can use the regex /#+\s|\s#+|#+/.
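For the case in the question, where a run of # of varying length should become a number such as 001, replace() also accepts a callback that receives the matched run. A minimal sketch, assuming the number should be zero-padded to the width of the run; the helper name fillHashes is made up for illustration:

```javascript
// Hypothetical helper: replaces each run of # with n, zero-padded to the
// length of that run (padStart is available from ES2017 onward).
function fillHashes(str, n) {
  return str.replace(/#+/g, (run) => String(n).padStart(run.length, "0"));
}

console.log(fillHashes("This is the ### example", 1)); // This is the 001 example
console.log(fillHashes("Item #####", 42));             // Item 00042
```

If the number has more digits than the run has # characters, padStart leaves it unpadded rather than truncating it.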
If the rest of the string is known, just get the part that you need:
var example = str.substr(12, str.length - 20);
The JavaScript match method will return an array of substrings matching a regular expression. You can use this to determine the number of matching characters to be replaced. Assuming you want to replace each octothorpe with a random digit, you could use code like this:
var exampleStr = "This is the ### example";
var swapThese = exampleStr.match(/#/g);
if (swapThese) {
  for (var i = 0; i < swapThese.length; i++) {
    // Each pass replaces the first remaining # with a random digit from 0 to 9.
    var swapThis = new RegExp(swapThese[i]);
    exampleStr = exampleStr.replace(swapThis, Math.floor(Math.random() * 10));
  }
}
alert(exampleStr); // or whatever you want to do with it
Note that the loop only runs if the array is present: if (swapThese) {
This check is necessary because when the match method finds no matches, it returns null rather than an empty array, and reading swapThese.length on null would throw a TypeError.
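A shorter variant of the same idea, offered as a sketch rather than a replacement for the loop: passing a function to replace() with a global regex calls it once per match, and no null check is needed because a string with no matches is returned unchanged.

```javascript
const exampleStr = "This is the ### example";

// The callback runs once for each #, producing a random digit from 0 to 9.
const randomized = exampleStr.replace(/#/g, () => Math.floor(Math.random() * 10));

// With no # in the input, replace() simply returns the original string.
const untouched = "Nothing to swap".replace(/#/g, () => Math.floor(Math.random() * 10));

console.log(randomized);
console.log(untouched); // Nothing to swap
```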
