I have a directory name containing a space on Unix, so the space is escaped with a backslash. I need to replace the backslash-escaped space with a semicolon. I tried multiple regexes but could not find the answer.
var str = '/test\ space/a.sh -pqr';
After the replace I am looking to get this: /test;space/a.sh -pqr
console.log("replace: ", str.replace(/\\\s+/g, ";")); //This one doesn't work, (formatting is taking out one backslash)
Your regular expression is correct.
It's your example string that is incorrect - the \ is not properly escaped:
var str = '/test\\ space/a.sh -pqr';
See the fiddle and read more about special characters in JavaScript strings.
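A minimal sketch of the difference, using the string from the question. The variable names are only illustrative, and String.raw is shown as one optional way to avoid doubling the backslash:

```javascript
// In a normal string literal, "\ " is an unneeded escape: the backslash is dropped.
const unescaped = '/test\ space/a.sh -pqr';
// Doubling the backslash keeps one literal backslash in the string.
const escaped = '/test\\ space/a.sh -pqr';
// String.raw keeps backslashes exactly as typed.
const raw = String.raw`/test\ space/a.sh -pqr`;

// One literal backslash followed by one or more whitespace characters.
const pattern = /\\\s+/g;

console.log(unescaped.includes('\\'));        // false: there is nothing to match
console.log(escaped === raw);                 // true
console.log(unescaped.replace(pattern, ';')); // /test space/a.sh -pqr (unchanged)
console.log(escaped.replace(pattern, ';'));   // /test;space/a.sh -pqr
```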
I am trying to split a string in JS on spaces except when the space is in a quote. However, an incomplete quote should be maintained. I'm not skilled in regex wizardry, and have been using the below regex:
var list = text.match(/[^\s"]+|"([^"]*)"/g)
However, if I provide input like sdfj "sdfjjk this will become ["sdfj","sdfjjk"] rather than ["sdfj",""sdfjjk"].
You can use
var re = /"([^"]*)"|\S+/g;
By using \S (=[^\s]) we just drop the " from the negated character class.
By placing the "([^"]*)" pattern before \S+, we make sure substrings in quotes are not torn if they come before. This should work if the string contains well-paired quoted substrings and the last is unpaired.
Demo:
var re = /"([^"]*)"|\S+/g;
var str = 'sdfj "sdfjjk';
document.body.innerHTML = JSON.stringify(str.match(re));
Note that to get the captured texts in-between quotes, you will need to use RegExp#exec in a loop (as String#match "drops" submatches).
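As a rough sketch of that RegExp#exec loop (the tokenize helper is a made-up name, not part of the answer), the captured group can be used when the quoted alternative matched and the whole match otherwise:

```javascript
// Collect tokens: quoted substrings (without their quotes) or runs of non-whitespace.
function tokenize(text) {
  const re = /"([^"]*)"|\S+/g; // fresh regex per call, so lastIndex starts at 0
  const tokens = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    // m[1] is defined only when the quoted alternative matched.
    tokens.push(m[1] !== undefined ? m[1] : m[0]);
  }
  return tokens;
}

console.log(tokenize('abc "def ghi" lmn "opq'));
// [ 'abc', 'def ghi', 'lmn', '"opq' ]
```

The unpaired quote at the end stays attached to its word, because only the \S+ alternative can match there.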
UPDATE
No idea what downvoter thought when downvoting, but let me guess. The quotes are usually used around word characters. If there is a "wild" quote, it is still a quote right before/after a word.
So, we can utilize word boundaries like this:
"\b[^"]*\b"|\S+
See regex demo.
Here, "\b[^"]*\b" matches a " that is followed by a word character, then matches zero or more characters other than " and then is followed with a " that is preceded with a word character.
Moving further in this direction, we can make it as far as:
\B"\b[^"\n]*\b"\B|\S+
With \B" we require that " should be preceded with a non-word character, and "\B should be followed with a non-word character.
See another regex demo
A lot depends on what specific issue you have with your specific input!
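To make the difference between the patterns concrete, here is a small comparison. The input with a stray leading quote is an invented example, chosen only because the two patterns disagree on it:

```javascript
const plain = /"([^"]*)"|\S+/g;
const bounded = /\B"\b[^"\n]*\b"\B|\S+/g;

// A stray quote followed by a space, then a properly quoted pair.
const input = '" b "c d"';

// The plain pattern pairs the stray quote with the next quote it finds.
console.log(input.match(plain));   // [ '" b "', 'c', 'd"' ]
// The bounded pattern needs a word character right inside each quote,
// so the stray quote is left alone and "c d" stays together.
console.log(input.match(bounded)); // [ '"', 'b', '"c d"' ]

// The original problem input still keeps its unpaired quote.
console.log('sdfj "sdfjjk'.match(bounded)); // [ 'sdfj', '"sdfjjk' ]
```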
Try the following:
text.match(/".*?"|[^\s]+/g).map(s => s.replace(/^"(.*)"$/, "$1"))
This repeatedly finds either properly quoted substrings (first), OR other sequences of non-whitespace. The map part is to remove the quotes around the quoted substrings.
> text = 'abc "def ghi" lmn "opq'
< ["abc", "def ghi", "lmn", ""opq"]
I'm using a tiny little JS plugin to truncate multiple lines of text on a site I'm working on.
The only problem is that the script counts things like HTML tags in the character count, which throws things off a little.
This is how the script currently excludes characters:
regex = /[!-\/:-#\[-`{-~]$/
Which basically just strips out certain punctuation characters.
I've tried changing it to this:
regex = [!-\/:-#\[-`{-~]$<[^>]*>
But, not being too familiar with regex, it didn't seem to work.
If someone could nudge me in the right direction that would be great.
In your initial regex you're looking for a single character that sits at the very end of the string you test, whether that string is a character, a word or a line. Note the dollar sign '$'.
regex = /[!-\/:-#\[-`{-~]$/
Now you want to match anything between < and >.
regex = /[!-\/:-#\[-`{-~]$|<[^>]*$/
Note that the second alternative matches an unclosed tag that runs to the end of the string you are matching against, such as <, <aaaa or <aaaa<.
greedy_regex = /[!-\/:-#\[-`{-~]$|<[^>]*/
non_greedy_regex = /[!-\/:-#\[-`{-~]$|<[^>]*?/
If you remove the second '$' (greedy_regex), the match is greedy: in a<b>c</b>d it matches <b, that is, the < and everything up to the next >. Using the ? as in non_greedy_regex, it will match the '<' only.
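A small sketch of the tag part on its own, leaving out the punctuation class. The final replace is only an assumption about what the truncation script ultimately needs (counting visible characters), and it is a naive tag pattern that would not cope with a > inside an attribute value:

```javascript
const html = 'a<b>c</b>d';

// Greedy: "<" plus as many non-">" characters as possible.
console.log(html.match(/<[^>]*/)[0]);  // "<b"
// Lazy: "<" plus as few characters as possible, which is none.
console.log(html.match(/<[^>]*?/)[0]); // "<"
// Requiring the closing ">" matches whole tags.
console.log(html.match(/<[^>]*>/g));   // [ '<b>', '</b>' ]

// Removing whole tags leaves only the visible text to count.
const visible = html.replace(/<[^>]*>/g, '');
console.log(visible, visible.length);  // acd 3
```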
I have a question about regex. I have some text that looks like this:
car,model,serie
,Mercedes,324,1,
,BMW,23423,1,
,OPEL,54322,1,
it should look like:
car,model,serie
Mercedes,324,1,
BMW,23423,1,
OPEL,54322,1,
so without the commas at the beginning of each line.
What I tried:
var str2 = str.replace(/\n|\r/g, "");
but somehow I couldn't work the comma into the regex.
Can anyone help me?
Thanks in advance.
There have been a lot of responses to this question, and for a newbie to regex that is probably a bit overwhelming.
Overall, the best response has been:
var str2 = str.replace(/^,/gm, '');
This works by using ^, to check if the first character is a comma and if it is, remove it. It also uses the g and m flags to do this for the first character of every line.
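For illustration, here it is applied to the data from the question (the csv and cleaned names are made up for this sketch), together with what happens when the m flag is left out:

```javascript
const csv = 'car,model,serie\n,Mercedes,324,1,\n,BMW,23423,1,\n,OPEL,54322,1,';

// With m, ^ matches at the start of every line, so each leading comma is removed.
const cleaned = csv.replace(/^,/gm, '');
console.log(cleaned);
// car,model,serie
// Mercedes,324,1,
// BMW,23423,1,
// OPEL,54322,1,

// Without m, ^ only matches at the very start of the string, which is "c" here.
const noMultiline = csv.replace(/^,/g, '');
console.log(noMultiline === csv); // true: nothing was replaced
```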
If you are curious about the other versions then read on:
1:
var str2 = str.replace(/^,+/gm, '');
This is a slight variant in that it will remove multiple consecutive commas at the beginning of each line, but based on your dataset this is not required.
2:
var str2 = str.replace(/\n,/g, '\n');
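This version gives the same result as the first on your data; however, it finds each newline followed by a comma with \n, and replaces it with another newline. Unlike ^, it would not remove a comma at the very start of the string, since no newline precedes it.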
3:
var str2 = str.replace(/(\n|\r),/g, '$1')
This version is the same as the previous one, except that it doesn't assume the newline is a \n. Instead it captures either a newline or a carriage return and puts it back with $1. On your example it gives the same result as the m flag with ^,.
4:
var str2 = str.replace(/\n+|\r+|,+/g,"\n")
And finally there is this one, which is a combination of all the previous regexes. It assumes that you may have a lot of mixed newlines and commas without any text between them, and that you want to collapse each run of those characters into a single newline. Note that it also replaces the commas between fields, so it is not suitable for your examples.
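A quick side-by-side of these variants on a shortened sample (invented for this sketch) that starts with a comma, which is where their behaviour diverges:

```javascript
// Two data rows only; note the comma at the very start of the string.
const sample = ',Mercedes,324,1,\n,BMW,23423,1,';

const v0 = sample.replace(/^,/gm, '');            // leading comma of every line
const v2 = sample.replace(/\n,/g, '\n');          // only commas that follow a \n
const v3 = sample.replace(/(\n|\r),/g, '$1');     // same, but also after a \r
const v4 = sample.replace(/\n+|\r+|,+/g, '\n');   // every run of commas or newlines

console.log(JSON.stringify(v0)); // "Mercedes,324,1,\nBMW,23423,1,"
console.log(JSON.stringify(v2)); // ",Mercedes,324,1,\nBMW,23423,1,"
console.log(v2 === v3);          // true
console.log(v4.includes(','));   // false: the field separators are gone too
```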
Use this syntax:
str.replace(/^,/gm, '');
You can just use multiline flag and replace leading commas:
str = str.replace(/^,+/gm, '');
RegEx Demo
Try:
var str2 = str.replace(/(\n|\r),/g, '$1')
Your comma was actually placed outside the regex pattern, so you weren't far off :)
When I use a tool like regexpal.com it lets me use regex as I am used to. So for example I want to check whether a text contains a word that is at least 3 letters long and ends with a white space, so it will match 'now ', 'noww ' and so on.
On regexpal.com the regex \w{3,}\s works and matches both of the words above.
But in JavaScript I have to add double backslashes before w and s. Like this:
var regexp = new RegExp('\\w{3,}\\s','i');
or else it does not work. I looked around for answers and searched for double backslash javascript regex but all I got was completely different topics about how to escape backslash and so on. Does someone have an explanation for this?
You could write the regex without the double backslashes, but you need to put the regex inside forward slashes as delimiters.
/^\w{3,}\s$/.test('foo ')
The anchors ^ (matches the start of the string, or of a line with the m flag) and $ (matches the end of the string or line) help to do an exact string match. You don't need an i modifier since \w matches both upper and lower case letters.
Why? Because in a string, "\" quotes the following character so "\w" is seen as "w". It essentially says "treat the next character literally and don't interpret it".
To avoid that, the "\" must be quoted too, so "\\w" is seen by the regular expression parser as "\w".
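A short sketch of that behaviour. The variable names are illustrative, and String.raw is included only as an optional alternative to doubling the backslashes:

```javascript
// In a string literal the backslash before "w" is simply dropped.
console.log('\w' === 'w');  // true
console.log('\\w'.length);  // 2: a backslash and a "w"

// The doubled form and the regex literal describe the same pattern.
const fromString = new RegExp('\\w{3,}\\s', 'i');
const fromLiteral = /\w{3,}\s/i;
console.log(fromString.source === fromLiteral.source); // true
console.log(fromString.test('now '));                  // true

// With single backslashes the constructor receives "w{3,}s".
const broken = new RegExp('\w{3,}\s');
console.log(broken.source);       // w{3,}s
console.log(broken.test('now ')); // false
console.log(broken.test('wwws')); // true: three literal w characters and an s

// String.raw passes the backslashes through untouched.
const viaRaw = new RegExp(String.raw`\w{3,}\s`);
console.log(viaRaw.test('noww ')); // true
```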
Hey. First question here, probably extremely lame, but I totally suck in regular expressions :(
I want to extract the text from a series of strings that always have only alphabetic characters before and after a hyphen:
string = "some-text"
I need to generate separate strings that include the text before AND after the hyphen. So for the example above I would need string1 = "some" and string2 = "text"
I found this and it works for the text before the hyphen, now I only need the regex for the one after the hyphen.
Thanks.
You don't need regex for that, you can just split it instead.
var myString = "some-text";
var splitWords = myString.split("-");
splitWords[0] would then be "some", and splitWords[1] will be "text".
If you actually have to use regex for whatever reason though: the $ character marks the end of a string in regex, so -(.*)$ is a regex that will match everything after the first hyphen it finds till the end of the string. That could actually be simplified to just -(.*) too, as the .* will match till the end of the string anyway.
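Both approaches side by side, as a small sketch. The combined pattern with two groups and the 'a-b-c' input are additions for illustration, not something from the question:

```javascript
const myString = 'some-text';

// Splitting on the hyphen gives both parts directly.
const [before, after] = myString.split('-');
console.log(before, after); // some text

// A regex with two groups captures the same two parts.
const parts = myString.match(/^([^-]*)-(.*)$/);
console.log(parts[1], parts[2]); // some text

// With more than one hyphen the two approaches differ.
console.log('a-b-c'.split('-'));          // [ 'a', 'b', 'c' ]
console.log('a-b-c'.match(/-(.*)/)[1]);   // b-c
```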