Getting each word of a line which is separated by tabs - javascript

let line = '1 test test#gmail.com';
let column = line.split('\t');
console.log(column[0] + '-' + column[1] + '-' + column[2]);
Output I'm getting back:
1 test test#gmail.com-undefined-undefined
Expected output:
1-test-test#gmail.com
How can we achieve this?

You can use replace or split. I would trim the line first to be sure.
Note that your line did not contain tabs. \s+ matches one or more whitespace characters:
\s Matches a single white space character, including space, tab, form feed, line feed, and other Unicode spaces. Equivalent to [ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff].
let line = ' 1 test test#gmail.com ';
let columns = line.trim().split(/\s+/) // .join("-")
console.log(columns); // in case you need them in an array
let combined = columns.join("-");
console.log(combined);
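If the real data does contain tab characters, the original split('\t') works and behaves differently from the whitespace regex. A minimal sketch, assuming a made-up sample line (tabbedLine is hypothetical, not from the question) whose second field contains a space:

```javascript
// Hypothetical sample: the fields are separated by real tab characters,
// and the second field contains a space.
const tabbedLine = '1\ttest user\ttest#gmail.com';

// Splitting on '\t' keeps the space inside "test user".
const byTab = tabbedLine.split('\t').join('-');

// Splitting on /\s+/ treats that inner space as a separator too.
const byWhitespace = tabbedLine.trim().split(/\s+/).join('-');

console.log(byTab);        // 1-test user-test#gmail.com
console.log(byWhitespace); // 1-test-user-test#gmail.com
```

So \s+ is the safer choice only when no field can contain whitespace of its own.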

No need to split and join (you could have used column.join('-') for that); simply replace using a regex. \s+ matches one or more whitespace characters. You can also use \t if that's enough in your case. The g flag makes sure it replaces all occurrences instead of only the first one.
let line = '1 test test#gmail.com';
let column = line.replace(/\s+/g, '-');
console.log(column);

Related

How to achieve this result using Regex?

Given the input below, what's the regex expression that gives the desired output in javascript? I must achieve this without using a multiline flag.
input
\n
\n
abc def.\n
\n
*\n
\n
desired output (maintain same number of rows but insert = into blank rows)
=\n
=\n
abc def.\n
=\n
*\n
=\n
actual output (using regex /[^a-zA-Z0-9.*]+\n/ replaced with =\n; it somehow removes one of two consecutive `\n`s)
=\n
abc def.=\n
*=\n
You could try a combination of replace functions like so:
str = "\n\nabc def.\n\n*\n\n";
str = str.replace(/\n/g, "=\n");
str = str.replace(/(.)=\n/g, "$1\n");
console.log(str);
Explanation -
After the first replacement/s, the output looks like:
=
=
abc def.=
=
*=
=
Then you replace any character that is followed by =\n with that same character (captured as $1) followed by a newline. Because . does not match a newline, the = on rows that were blank is left in place.
Your desired outcome is "maintain same number of rows but insert = into blank rows".
An empty ("blank") row is a row that matches the regex: ^$.
^ matches the beginning of the input string and $ matches the end of the input string. If the m ("multi-line") modifier is specified, ^ also matches at the beginning of each line and $ at the end of each line.
Your code should be as simple as:
input = "\n\nabc def.\n\n*\n\n";
output = input.replace(/^$/mg, '=');
The m modifier changes the meaning of ^ and $ as explained above. The newline characters are not matched by the regex above and consequently they do not need to be present in the replacement string.
The g modifier tells String.replace() to find and replace all the matching substrings, not only the first one (the default behaviour of String.replace()).
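One detail worth checking with the m flag: when the string ends with a newline, the empty "line" after that final newline also matches ^$. A small sketch using the question's sample (the variable names sample and filledRows are mine):

```javascript
const sample = "\n\nabc def.\n\n*\n\n";

// With m, ^$ matches every empty line, including the empty one
// that follows the final "\n".
const filledRows = sample.replace(/^$/mg, '=');

// JSON.stringify makes the newlines visible in the output.
console.log(JSON.stringify(filledRows)); // "=\n=\nabc def.\n=\n*\n=\n="
```

The result therefore has one more = than the desired output, so the trailing one would need to be handled separately if that matters.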
Read more about regular expressions in JavaScript.
This should work with two replace calls:
value.replace(/^\n/, '=\n').replace(/\n\n/g, '\n=\n')
The first replace takes care of the first line if it starts with a blank row.
The second replace takes care of the other lines: adding = to blank rows is the same as inserting = between two consecutive \n characters.
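Since the question rules out the multiline flag, another option is to match each row together with its terminating \n and decide per row in a replacer callback. This is only a sketch, assuming every row (including the last) ends with \n as in the question's input; rowsInput and rowsOutput are names I made up:

```javascript
const rowsInput = "\n\nabc def.\n\n*\n\n";

// ([^\n]*) captures the row content, the following \n is its terminator.
// Empty rows become "=", all other rows are kept unchanged.
const rowsOutput = rowsInput.replace(/([^\n]*)\n/g, (match, row) =>
  (row === "" ? "=" : row) + "\n"
);

// JSON.stringify makes the newlines visible in the output.
console.log(JSON.stringify(rowsOutput)); // "=\n=\nabc def.\n=\n*\n=\n"
```

Because each match consumes a whole row, the number of rows stays the same and consecutive blank rows are each handled on their own.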

Replace all spaces except the first & last ones

I want to replace all whitespace except the first and the last one, as shown in this image.
How to replace only the red ones?
I tried :
.replace(/\s/g, "_");
but it captures all spaces.
I suggest matching and capturing the leading/trailing whitespace that will be kept, and then matching any other whitespace, which will be replaced with _:
var s = " One Two There ";
console.log(
s.replace(/(^\s+|\s+$)|\s/g, function($0,$1) {
return $1 ? $1 : '_';
})
);
Here,
(^\s+|\s+$) - Group 1: either one or more whitespaces at the start or end of the string
| - or
\s - any other whitespace.
The $0 in the callback represents the whole match, and $1 is the argument holding the contents of Group 1. If Group 1 matched, we return its contents unchanged; otherwise we return _.
You can use ^ to check for the start of the string and $ for the end. In other words, search for a whitespace character that is neither at the very start of the string nor at the very end:
var rgx = /(?!^)(\s)(?!$)/g;
// (?!^) => not at the start of the string
// (?!$) => not at the end of the string
console.log(' One Two Three '.replace(rgx, "_"));
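The two answers above agree when there is exactly one leading and one trailing space, but they differ when there are several. A quick comparison sketch, assuming a made-up input with two spaces on each side (padded, keepEdges and keepOuterOnly are my names):

```javascript
// Hypothetical input: two leading and two trailing spaces.
const padded = "  One Two  ";

// Callback approach: the whole leading/trailing run is kept.
const keepEdges = padded.replace(/(^\s+|\s+$)|\s/g, (whole, edge) =>
  edge ? edge : "_"
);

// Lookahead approach: only the very first and very last character are kept.
const keepOuterOnly = padded.replace(/(?!^)\s(?!$)/g, "_");

console.log(JSON.stringify(keepEdges));     // "  One_Two  "
console.log(JSON.stringify(keepOuterOnly)); // " _One_Two_ "
```

Which one is right depends on whether "first and last" means the first and last characters or the whole leading and trailing runs.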

How to replace all \n with space? [duplicate]

I have a var that contains a big list of words (millions) in this format:
var words = "
car
house
home
computer
go
went
";
I want to make a function that will replace the newline between each word with space.
So the result would look something like this:
car house home computer go went
You can use the .replace() function:
words = words.replace(/\n/g, " ");
Note that you need the g flag on the regular expression to get replace to replace all the newlines with a space rather than just the first one.
Also, note that you have to assign the result of the .replace() to a variable because it returns a new string. It does not modify the existing string. Strings in Javascript are immutable (they aren't directly modified) so any modification operation on a string like .slice(), .concat(), .replace(), etc... returns a new string.
let words = "a\nb\nc\nd\ne";
console.log("Before:");
console.log(words);
words = words.replace(/\n/g, " ");
console.log("After:");
console.log(words);
If there can be multiple line breaks in a row, made up of either \r or \n, and you need to replace each run of consecutive line breaks with a single space, use
var new_words = words.replace(/[\r\n]+/g," ");
To match all Unicode line break characters and replace/remove them, add \x0B\x0C\u0085\u2028\u2029 to the above regex:
/[\r\n\x0B\x0C\u0085\u2028\u2029]+/g
The /[\r\n\x0B\x0C\u0085\u2028\u2029]+/g means:
[ - start of a positive character class matching any single char defined inside it:
\r - (\x0D) - a carriage return (CR)
\n - (\x0A) - a line feed character (LF)
\x0B - a line tabulation (vertical tab, VT)
\x0C - form feed (FF)
\u0085 - next line (NEL)
\u2028 - line separator (LS)
\u2029 - paragraph separator (PS)
] - end of the character class
+ - a quantifier that makes the regex engine match the previous atom (the character class here) one or more times (consecutive linebreaks are matched)
/g - find and replace all occurrences in the provided string.
var words = "car\r\n\r\nhouse\nhome\rcomputer\ngo\n\nwent";
document.body.innerHTML = "<pre>OLD:\n" + words + "</pre>";
var new_words = words.replace(/[\r\n\x0B\x0C\u0085\u2028\u2029]+/g," ");
document.body.innerHTML += "<pre>NEW:\n" + new_words + "</pre>";
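Note that the string in the question starts and ends with a newline, so replacing line breaks with spaces leaves a space at each end. A minimal sketch that also trims the result, assuming a shortened made-up list (rawWords and oneLine are my names):

```javascript
// Hypothetical shortened list with mixed line endings and
// a leading and trailing line break, as in the question.
const rawWords = "\ncar\r\nhouse\n\nhome\rcomputer\n";

// Collapse each run of CR/LF characters into one space,
// then drop the spaces left at both ends.
const oneLine = rawWords.replace(/[\r\n]+/g, " ").trim();

console.log(oneLine); // car house home computer
```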
Code : (FIXED)
var new_words = words.replace(/\n/g," ");
A simple solution would look like this (remember to assign the result):
words = words.replace(/(\n)/g," ");
No need for global regex, use replaceAll instead of replace
myString.replaceAll('\n', ' ')
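A short sketch of how replaceAll behaves, assuming a runtime that supports it (it was added in ES2021); shortList, spaced and threw are names I made up. With a string pattern it replaces every occurrence, while a regex pattern must still carry the g flag:

```javascript
const shortList = "car\nhouse\nhome";

// String pattern: every "\n" is replaced, no regex needed.
const spaced = shortList.replaceAll("\n", " ");
console.log(spaced); // car house home

// Regex pattern without the g flag: replaceAll throws a TypeError.
let threw = false;
try {
  shortList.replaceAll(/\n/, " ");
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw); // true
```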

Javascript regex, make remove single paragraph line breaks

I've got text in this format:
word word,
word word.
word word
word word.
It isn't specific to that two-word format; each line just breaks after a certain number of characters instead of the paragraph being one long string. I'm trying to turn each paragraph into one long string, so it should look like this:
word word, word word.
word word word word.
If I use the code text.replace(/$\n(?=.)/gm, " ") and output that to the terminal I get text that looks like:
word word, word word.
word word word word.
It's got an extra space at the start of the paragraph, but that's good enough for what I'm trying to do (although if there's also a way to remove it in one replace call, then that's good). The problem is that when I output it to a textarea it doesn't remove the \n character, and I just get text that looks like this:
word word,
word word.
word word
word word.
I'm trying to do this all client side, currently running it in Firefox.
I'm not the best with regex, so this might be really simple and I'm just ignorant on how to do it. But any help would be really appreciated. Thanks!
A carriage return is \r so you would need to use
text.replace(/$(\r|\n)(?=.)/gm, " ");
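If the text may use \r\n line endings, one way to avoid leftover \r characters is to normalize the line endings first and then join single breaks. This is a sketch of an alternative, not the regex from the answer above; crlfText, normalizedText and joinedText are hypothetical names:

```javascript
// Hypothetical input using Windows-style \r\n line endings.
const crlfText = "word word,\r\nword word.\r\n\r\nword word\r\nword word.";

// Step 1: turn \r\n and lone \r into \n.
const normalizedText = crlfText.replace(/\r\n?/g, "\n");

// Step 2: replace a \n that sits between two non-newline characters
// with a space; blank lines between paragraphs are left alone.
const joinedText = normalizedText.replace(/([^\n])\n(?=[^\n])/g, "$1 ");

console.log(JSON.stringify(joinedText));
// "word word, word word.\n\nword word word word."
```

The lookahead does not consume the next character, so back-to-back single-character lines are still joined correctly.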
Below is a snippet that does what you asked for. I've also removed the leading whitespace (caused by empty lines) by passing a callback to the replace function:
var regex = /([^.])\s+/g;
var input = 'word word,\nword word.\n\nword word\nword word.';
var result = input.replace(regex, function(all, char) {
return (char.match(/\s/)) ? char : char + ' ' ;
});
document.write('<b>INPUT</b> <xmp>' + input + '</xmp>');
document.write('<b>OUTPUT</b> <xmp>' + result + '</xmp>');
Regex breakdown
([^.]) # Select any char that is not a literal dot '.'
# and save it in group $1
\s+ # 1 or more whitespace char, remove trailing spaces (tabs too)
# and all type of newlines (\r\n, \r, \n)
NOTE
If for some reason you want to keep the leading whitespace, you can simplify the code as follows:
var regex = /([^.])\s+/g;
var replace = '$1 ';
var input = 'word word,\nword word.\n\nword word\nword word.';
var result = input.replace(regex, replace);
document.write('<b>INPUT</b> <xmp>' + input + '</xmp>');
document.write('<b>OUTPUT</b> <xmp>' + result + '</xmp>');
You probably missed some \r. Here's a way to match all sorts of newlines without leaving extra spaces:
var input = 'word word,\nword word.\n\nword word\nword word.';
// split if 2 or more new lines
var out = input.split(/(\r\n|\n|\r){2,}?/)
// split the paragraph by new lines and join the lines by a space
.map((v) => v.split(/\r\n|\n|\r/).join(' '))
// the capturing group in the split regex leaves whitespace-only entries in the array; filter them out
.filter((v) => v.trim())
// join together all paragraphs by \n
.join('\n');
$('#txt').append(out);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="txt"></textarea>

Remove empty values from comma separated string javascript

How do I remove empty values from a comma-separated string in JavaScript/jQuery?
Is there a straightforward way, or do I need to loop through it and remove them manually?
Is there a way to merge all the splits (str and str1) in JavaScript/jQuery?
CODE:
var str = '+ a + "|" + b';
var str1 = '+ a + "-" + b';
str = str.split("+").join(",").split('"|"').join(",");
str1 = str1.split("+").join(",").split('"-"').join(",");
console.log(str); //, a , , , b
console.log(str1); //, a , , , b
EXPECTED OUTPUT :
a,b
Help would be appreciated :)
As I see it, you want to remove +, "|", "-" and whitespace from the beginning and end of the string, and replace those within the string with a single comma. Here are three regexes to do that:
str = str.replace(/^(?:[\s+]|"[|-]")+/, '')
.replace(/(?:[\s+]|"[|-]")+$/, '')
.replace(/(?:[\s+]|"[|-]")+/g, ',');
The (?:[\s+]|"[|-]") matches whitespace or pluses, or "|" or "-". The + at the end repeats it one or more times. In the first expression we anchor the match to the beginning of the string and replace it with nothing (i.e. remove it). In the second expression we anchor the match to the end of the string and remove it. And in the third, there is no anchor, because all matches that are left have to be somewhere inside the string - and we replace those with ,. Note the g modifier for the last expression - without it only the first match would be replaced.
The other answer is useful, and may be exactly what you are looking for.
If, for some reason, you still want to use split, luckily that method takes a regex as separator, too:
str = str.split(/\s*\+\s*(?:"\|"\s*\+\s*)?/).slice(1).join(",");
str1 = str1.split(/\s*\+\s*(?:"-"\s*\+\s*)?/).slice(1).join(",");
Because you have a plus sign in front of the "a", you can slice the array to return only the elements after it.
Also, since you mentioned you were new to regular expressions, here is the explanation:
any amount of space
a plus sign
any amount of space
optional (because of the ? after the group, which is the parentheses): a non-capturing (that is what the ?: means) group containing:
"|"
any amount of space
another plus sign
any amount of space
Works perfectly fine:
str.split(/[ ,]+/).filter(function(v){return v!==''}).join(',')
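To handle both str and str1 with the same code, as the question asks, the steps can be wrapped in one function. A minimal sketch, assuming the only separators are + signs and the quoted "|" or "-" literals; toCsv is a hypothetical helper name, not something from the answers above:

```javascript
// Hypothetical helper: split on "+", then drop empty parts and
// the quoted "|" / "-" separator literals.
function toCsv(expr) {
  return expr
    .split(/\s*\+\s*/)                       // split on + with optional spaces
    .map((part) => part.trim())              // remove any stray whitespace
    .filter((part) => part !== "" && !/^"[|-]"$/.test(part))
    .join(",");
}

console.log(toCsv('+ a + "|" + b')); // a,b
console.log(toCsv('+ a + "-" + b')); // a,b
```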
