javascript - split without losing the separator [duplicate] - javascript

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
JavaScript Split without losing character
I have a string:
"<foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar>"
I want to separate all instances of "abcdefg" into an array like this:
["<foo>abcdefg</bar>", "<foo>abcdefg</bar>", "<foo>abcdefg</bar>", "<foo>abcdefg</bar>"];
I try:
var str="<foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar>";
var Array_Of_FooBars = str.split("</bar>");
alert(Array_Of_FooBars);
But it returns:
["<foo>abcdefg", "<foo>abcdefg", "<foo>abcdefg", "<foo>abcdefg", ""]
It is removing the separator "</bar>". I don't want that.
How can I use split and not lose the separators from the string?
Thanks.
Ken

Try this. It's not a perfect solution, but it should work in most cases.
str.split(/(?=<foo>)/)
That is, split it in the position before each opening tag.
EDIT: You could also do it with match(), like so:
str.match(/<foo>.*?<\/bar>/g)
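A minimal sketch comparing the two approaches above on a shortened copy of the question's string. The third variant, splitting on a capturing group, is not from the answers; it is included only to show that split() returns captured separators as separate elements. All variable names are illustrative.

```javascript
// Shortened version of the string from the question.
const input = "<foo>abcdefg</bar><foo>abcdefg</bar>";

// Lookahead: split at the empty position just before each "<foo>".
const byLookahead = input.split(/(?=<foo>)/);

// match(): collect every complete "<foo>...</bar>" run instead of splitting.
const byMatch = input.match(/<foo>.*?<\/bar>/g);

// Capturing group: split() also returns the captured separators,
// each as its own array element.
const withSeparators = input.split(/(<\/bar>)/);

console.log(byLookahead);    // ["<foo>abcdefg</bar>", "<foo>abcdefg</bar>"]
console.log(byMatch);        // ["<foo>abcdefg</bar>", "<foo>abcdefg</bar>"]
console.log(withSeparators); // ["<foo>abcdefg", "</bar>", "<foo>abcdefg", "</bar>", ""]
```

The lookahead and match() variants give the array the question asks for; the capturing-group variant would still need its pieces glued back together.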

It seems that you would most likely want to use match:
var s = "<foo>abcd1efg</bar><foo>abc2defg</bar><foo>abc3defg</bar><foo>abc4defg</bar>"
s.match(/(<foo>.+?<\/bar>)/g)
// =>["<foo>abcd1efg</bar>", "<foo>abc2defg</bar>", "<foo>abc3defg</bar>", "<foo>abc4defg</bar>"]

You could just iterate over a simple regular expression and build the array that way:
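var x = new RegExp('<foo>(.*?)</bar>', 'ig'),
    s = "<foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar>",
    matches = [],
    i; // current match; declared here so it does not leak as a global
// exec() returns null once there are no more matches
while ((i = x.exec(s)) !== null) {
    matches.push(i[0]); // i[0] is the whole match, i[1] the text inside the tags
}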
Just realized using String.match() would be better; this code would be more useful for matching the contents inside the tags.
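As a rough illustration of that last point, the same exec() loop can collect only the text inside the tags by reading the first capture group. The sample string and variable names here are made up for the example.

```javascript
// The g flag makes exec() resume from the end of the previous match.
const tagPattern = /<foo>(.*?)<\/bar>/g;
const source = "<foo>one</bar><foo>two</bar>";

const contents = [];
let current;
while ((current = tagPattern.exec(source)) !== null) {
  // current[0] is the whole match, current[1] is the first capture group.
  contents.push(current[1]);
}

console.log(contents); // ["one", "two"]
```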

Use positive lookahead so that the regular expression asserts that the special character exists, but does not actually match it:
string.split(/<br \/>(?=&#?[a-zA-Z0-9]+;)/g);
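A small sketch of how that pattern behaves, using an invented sample string: only a "<br />" that is directly followed by an HTML entity acts as a separator, and the entity itself stays in the output because the lookahead does not consume it.

```javascript
// Hypothetical input: the first <br /> precedes an entity, the second does not.
const sample = "a<br />&amp;b<br />c";

// The g flag from the answer is omitted; split() handles every match either way.
const parts = sample.split(/<br \/>(?=&#?[a-zA-Z0-9]+;)/);

console.log(parts); // ["a", "&amp;b<br />c"]
```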

Related

How do I insert something at a specific character with Regex in Javascript [duplicate]

This question already has answers here:
Simple javascript find and replace
(6 answers)
Closed 5 years ago.
I have the string "foo?bar" and I want to insert "baz" at the ?. The ? may not always be at index 3, so I want to insert a string right after the ? character, wherever it is, to get "foo?bazbar".
The String.prototype.replace method is perfect for this.
Example
let result = "foo?bar".replace(/\?/, '?baz');
alert(result);
I have used a RegEx in this example as requested, although you could do it without RegEx too.
Additional notes.
If you expect the string "foo?bar?boo" to result in "foo?bazbar?boo" the above code works as-is
If you expect the string "foo?bar?boo" to result in "foo?bazbar?bazboo" you can change the call to .replace(/\?/g, '?baz')
You don't need a regular expression, since you're not matching a pattern, just ordinary string replacement.
string = 'foo?bar';
newString = string.replace('?', '?baz');
console.log(newString);
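To make the difference between the variants concrete, here is a short sketch on the two-question-mark string from the notes above. The split/join line is an extra alternative that is not in the answers; variable names are illustrative.

```javascript
const text = "foo?bar?boo";

// A string pattern replaces only the first occurrence.
const firstOnly = text.replace("?", "?baz");

// A regex with the g flag replaces every occurrence (the ? must be escaped).
const allWithRegex = text.replace(/\?/g, "?baz");

// Without a regex: split on the character and join with the new text.
const allWithoutRegex = text.split("?").join("?baz");

console.log(firstOnly);       // "foo?bazbar?boo"
console.log(allWithRegex);    // "foo?bazbar?bazboo"
console.log(allWithoutRegex); // "foo?bazbar?bazboo"
```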

Cut off extension of filename [duplicate]

This question already has answers here:
Regex for everything before last forward or backward slash
(3 answers)
Closed 5 years ago.
I have a list of filenames like
index.min.html
index.dev.html
index.min.js
index.dev.js
There.are.also.files.with.multiple.dots.and.other.extension
I want to cut off the extensions of the filenames, but the problem is that I can only use match for this task.
I tried many regular expressions looking like "index.min.html".match( /^((?!:(\.[^\.]+$)).+)/gi ); to select the filename without the last dot and extension, but they selected either the whole filename, nothing, or the part before the first dot. Is there a way to select only the filename without the extension?
Why regex? Simple substring expressions make this a lot simpler:
var filename = 'index.something.js.html';
alert(filename.substr(0, filename.lastIndexOf(".")));
I'd go for
/(.+)\..+$/mi
demo # regex101
See the demo, especially the matches. It only gives you the filename without the last . and the characters afterwards.
How about this one: (.*)\.[^\.]+
See http://regex101.com/r/xI6qM0
A simpler solution would be to just slice off the last element:
var a = "index.min.html";
var b = a.split('.').slice(0, -1).join('.');
Or, even better, using JavaScript's String function substr:
var b = a.substr(0, a.lastIndexOf("."));
Why do you have to use match?
Could do the trick, too:
function baseName(str) {
if (typeof str !== 'string') return;
var frags = str.split('.')
return frags.splice(0,frags.length-1).join('.');
}
Repl:http://repl.it/OvI
jsPerf: http://jsperf.com/string-extension-splits
Result:
substr is the fastest of all options in this thread. Kudos to the other guys.
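Since the question is restricted to match(), here is one possible sketch that stays within that constraint. The helper name stripExtension is hypothetical, and returning the name unchanged when there is no extension is an assumption about the desired behavior.

```javascript
// Hypothetical helper: returns the name without its last ".ext" part.
function stripExtension(name) {
  // Greedy (.+) takes everything up to the LAST dot; [^.]+ is the extension.
  const result = name.match(/^(.+)\.[^.]+$/);
  // No match means there is no extension to remove (assumed: return as-is).
  return result ? result[1] : name;
}

console.log(stripExtension("index.min.html")); // "index.min"
console.log(stripExtension("index.dev.js"));   // "index.dev"
console.log(stripExtension("README"));         // "README"
```

Because (.+) needs at least one character before the dot, a name such as ".gitignore" is also returned unchanged.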

replace '\n' in javascript [duplicate]

This question already has answers here:
How do I replace all occurrences of a string in JavaScript?
(78 answers)
Fastest method to replace all instances of a character in a string [duplicate]
(14 answers)
Closed 1 year ago.
I'm trying to do replace in JavaScript using:
r = "I\nam\nhere";
s = r.replace("\n"," ");
But instead of giving me
I am here
as the value of s,
It returns the same.
Where's the problem??
As the others stated, you need a regular expression with the global flag. The correct expression should be something like what the others gave you:
var r = "I\nam\nhere";
var s = r.replace(/\n/g,' ');
I would like to point out how this differs from what you were doing at the start.
You were using the following statements:
var r = "I\nam\nhere";
var s = r.replace("\n"," ");
The statements are indeed correct, but they replace only one instance of the character \n, because replace() works differently depending on its first argument. When given a string, it looks for the first occurrence and replaces it with the string given as the second argument. When given a regular expression, it is not limited to a fixed character: the pattern can be as complicated as needed, and with the g flag every match is replaced. More on regular expressions for JavaScript can be found at w3schools.
For instance, the method could be made more general so that it handles input from several different types of files. Because operating systems differ, it is quite common to have files with \n or \r where a new line is required. To handle both, your code could be rewritten using a character class:
var r = "I\ram\nhere";
var s = r.replace(/[\n\r]/g,' ');
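One thing to keep in mind with the character class is that a Windows-style \r\n pair becomes two spaces. A possible variation, sketched here on an invented sample string, treats \r\n as a single line break by trying it first in an alternation.

```javascript
// Hypothetical input mixing Windows (\r\n), old Mac (\r) and Unix (\n) line breaks.
const mixed = "I\r\nam\rhere\nnow";

// Character class: every \r and every \n becomes its own space.
const perCharacter = mixed.replace(/[\n\r]/g, " ");

// Alternation: \r\n is tried first, so the pair becomes one space.
const perLineBreak = mixed.replace(/\r\n|\r|\n/g, " ");

console.log(JSON.stringify(perCharacter)); // "I  am here now" (two spaces after I)
console.log(JSON.stringify(perLineBreak)); // "I am here now"
```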
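If the string contains a literal backslash followed by n (two characters) rather than a real newline, use s = r.replace(/\\n/g," "); for real newline characters the pattern is /\n/g.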
Get a reference:
The "g" in the JavaScript replace code stands for "global", which means the replacement is applied to every match rather than only the first one.
The problem is that you need to use the g flag to replace all matches, as, by default, replace() only acts on the first match it finds:
var r = "I\nam\nhere",
s = r.replace(/\n/g,' ');
To use the g flag, though, you'll have to use the regular expression approach.
Incidentally, when declaring variables please use var, otherwise the variables you create are all global, which can lead to problems later on.
.replace() needs the global match flag:
s = r.replace(/\n/g, " ");
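This works for me when the string contains a literal backslash followed by n; for real newline characters, split on '\n' instead: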
var s = r.split('\\n').join(' ');
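Whether '\\n' or '\n' is the right separator depends on what the string actually contains. The sketch below uses two invented strings to show the difference: one with real newline characters and one with the two-character sequence backslash plus n.

```javascript
// Real newline characters (what the question's string literal produces).
const realNewlines = "I\nam\nhere";
// A backslash followed by the letter n, e.g. text that was escaped twice.
const escapedText = "I\\nam\\nhere";

const fixedReal = realNewlines.split("\n").join(" ");
const fixedEscaped = escapedText.split("\\n").join(" ");
// Using the wrong separator leaves the string untouched.
const unchanged = realNewlines.split("\\n").join(" ");

console.log(fixedReal);                  // "I am here"
console.log(fixedEscaped);               // "I am here"
console.log(unchanged === realNewlines); // true
```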
replaceAll() is relatively new and is not supported in older browsers:
r = "I\nam\nhere";
s = r.replaceAll("\n"," ");
You can use:
var s = r.replace(/\n/g,' ').replace(/\r/g,' ');
because different operating systems (for example Mac, Unix, and Windows) use different ways to represent a new line. After this, you can use another function to normalize whitespace.
If each line break in the string is preceded by a backslash, search for "\\\n" (a backslash followed by a newline):
r.replace("\\\n"," ");
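This solution worked perfectly for me (note that the leading = makes it match only line breaks that directly follow an equals sign; drop it to match every line break):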
r.replace(/=(\r\n|\n|\r)/gm," ");

ignore accent in regex [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to ignore acute accent in a javascript regex match?
I have some JavaScript like this:
var myString = 'préposition_preposition';
var regex = new RegExp("epo", "ig");
alert(myString.match(regex));
Is it possible to match both "épo" and "epo" if the regex contains only epo (or only épo)?
I had the same problem recently. A regex does not treat accented and unaccented letters as equivalent, so a plain e does not match é. You need to explicitly include those characters in your regex.
Use this:
var regex = /[ée]po/gi;
Hint: Don't use new RegExp() for a fixed pattern; declare the regex directly as a literal instead. This also avoids some quoting/escaping issues.
No, you cannot get this behavior automatically. A regex matches exactly the pattern you provide; the computer has no way of knowing whether you are looking for épo or epo.
But you can specify a class of characters that may be matched: new RegExp("[eé]po", "ig");
Try this:
var str = 'préposition_preposition';
str.match(/(e|é)po/gi);
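A different technique from the character-class answers above, offered only as a sketch: strip the accents from the input before matching, using String.prototype.normalize to decompose each accented letter into a base letter plus combining marks. The helper name stripAccents is hypothetical.

```javascript
// Hypothetical helper: decompose accented letters (NFD), then remove
// the combining diacritical marks in the range U+0300 to U+036F.
function stripAccents(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

const myString = "préposition_preposition";
// Matching runs against the accent-free copy, so plain "epo" finds both words.
const found = stripAccents(myString).match(/epo/gi);

console.log(found); // ["epo", "epo"]
```

Note that the matches come from the stripped copy, not the original string, and that letters without a decomposition (such as ß) are left unchanged.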

Regular Expression only returning first result found [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How can I match multiple occurrences with a regex in JavaScript similar to PHP’s preg_match_all()?
I am trying to parse an xml document like this:
var str = data.match("<string>" + "(.*?)" + "</string>");
console.log(str);
I want to get the contents of all the <string> elements in an array, but for some reason it only returns the first <string> element found. I'm not good with regular expressions, so I'm thinking this is just a small regex issue.
You want the regex to be global, so add the g flag:
var str="<string>1</string><string>2</string><string>3</string>";
var n=str.match(/<string>(.*?)<\/string>/g);
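// n is ["<string>1</string>", "<string>2</string>", "<string>3</string>"]
// (with the g flag, match() returns the whole matches, not the captured groups)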
You have to write the regex with a g flag added, like this:
/Regex/g
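With the g flag, match() returns each whole match including the tags. If only the text between the tags is wanted, one option in newer engines is String.prototype.matchAll (ES2020), which exposes the capture groups for every match. A minimal sketch with illustrative variable names:

```javascript
const xml = "<string>1</string><string>2</string><string>3</string>";

// match() with g: whole matches only, capture groups are dropped.
const wholeMatches = xml.match(/<string>(.*?)<\/string>/g);

// matchAll() requires the g flag and yields one match object per hit;
// index 1 of each match object is the first capture group.
const innerTexts = Array.from(
  xml.matchAll(/<string>(.*?)<\/string>/g),
  (m) => m[1]
);

console.log(wholeMatches); // ["<string>1</string>", "<string>2</string>", "<string>3</string>"]
console.log(innerTexts);   // ["1", "2", "3"]
```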
