Changing rows to columns in JavaScript - javascript

Basically, we have a .txt-file containing a table that is tab separated like the following one (___ represents a tab):
Name___Phone___City
Person1___111-111-1111___City1
Person2___222-222-2222___City2
We want to output another .txt-file that should contain the following:
Name, Person1, Person2
Phone, 111-111-1111, 222-222-2222
City, City1, City2
I tried using string.split() but couldn't produce the desired result. How can I do this transformation with JavaScript?

I have no clue what language you're using, but here's a one-liner that works from any terminal as long as Perl is installed (it ships with most Unix-based systems and is available for Windows as a download):
perl -pi -e 's/\t/, /g' /path/to/file

The first thing I would do is get the lines using this Regex /(.+)\n?/. Then, in each line, I'd split & join on the three underscores.
function reorder(input){
var regex = /(.+)\n?/gm, m, output = '';
while (m = regex.exec(input)) output += m[1].split('___').join(', ') + '\n';
return output;
};
Not sure you can write to a file in JavaScript though; maybe in Node.js?
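Neither snippet above actually swaps rows and columns, so here is a minimal sketch of the transposition itself. It assumes the input uses real tab characters, and the name transposeTsv is just a hypothetical helper for illustration:

```javascript
// Hypothetical helper: transposes tab-separated text so that each input
// column becomes one comma-separated output line.
function transposeTsv(input) {
  // Split into lines (tolerating Windows line endings) and drop empty ones.
  const rows = input
    .split(/\r?\n/)
    .filter(line => line.length > 0)
    .map(line => line.split('\t'));

  // Use the widest row so that short rows do not truncate the result.
  const width = Math.max(0, ...rows.map(row => row.length));

  const lines = [];
  for (let col = 0; col < width; col++) {
    // Missing cells become empty strings.
    lines.push(rows.map(row => row[col] ?? '').join(', '));
  }
  return lines.join('\n');
}

const table = 'Name\tPhone\tCity\n' +
  'Person1\t111-111-1111\tCity1\n' +
  'Person2\t222-222-2222\tCity2\n';

console.log(transposeTsv(table));
// Name, Person1, Person2
// Phone, 111-111-1111, 222-222-2222
// City, City1, City2
```

In Node.js the returned string could then be written out with fs.writeFileSync. In a browser there is no direct file access, so the text would have to be offered as a download instead.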

Related

can String.prototype.split() differentiate between two instances of a character

I am working with rather large csv files (not mine, I cannot change the formatting of the files).
My script reads the files into a string and then turns it into an array, first using the .split() method to split the rows on "\n".
The delimiter for the values within each row is the comma (",").
The problem is that the csv file is written to include the commas inside some of the values like so:
Type,Class,Result\n
AA,SG26,27%\n
AC,DC747,17%\n
"FF,RF",R$%,89%\n
HE,RT,56%\n
My function treats them as separate values, since it depends on split() with a delimiter of ",", so it splits values like csv[2][Type] in this example into two.
I have tried using the replace function before splitting the string like so:
String.prototype.processCSV = function(delimiter = ","){
var str;
if(this.includes('"')){
str = this.replace(/"\s,\s"/g, "");
}
//rest of the function
}
But I do not see any results of doing that.
Is there any way to differentiate between the commas in the values and the separating commas, or any better way to read csv into arrays (please note that the array is then mapped so I can access the values by keys)?
Thank you in advance.
Edit: I should add that the project is a static page that first loads the csv files into strings with Ajax (XMLHttpRequest), not in Node.js; due to project requirements I cannot set up a Node backend.
Don't just split on ,. That's not the correct way to handle CSV files. Use a real CSV parser.
There are lots of CSV parsers on npm. Here's an example using Papaparse (npm or official home page):
var results = Papa.parse(csv, {
header: true
});
// Papa.parse returns an object; the parsed rows are in its data property
console.log(results.data[0].Type); // prints AA
console.log(results.data[0].Class); // prints SG26
console.log(results.data[0].Result); // prints 27%
The only real way to "distinguish commas at different positions" is to parse the string, and process the characters between " differently.
const input = `"1,1","2,2"`;
let pos = 0;
while (pos < input.length) {
switch (input[pos]) {
case `,`:
// handle comma
break;
case `\n`:
// handle newline
break;
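case `"`: {
// find the closing quote; bail out if the string is unterminated
const end = input.indexOf(`"`, pos + 1);
if (end === -1) throw new Error(`Unterminated quote at position ${pos}`);
// handle string
// skip processing the string
pos = end;
break;
}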
}
pos += 1;
}
But instead of writing your own parser (which is a fun exercise, though), it is probably a good idea to use an existing implementation.
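For anyone who wants to try the exercise anyway, here is a minimal sketch of a complete character-by-character parser. It assumes that double quotes wrap fields containing commas and that a doubled quote ("") inside a quoted field stands for a literal quote; parseCsv is a hypothetical name, and a real library will cover more edge cases:

```javascript
// Hypothetical helper: splits CSV text into an array of rows, each an array
// of field strings.
function parseCsv(text, delimiter = ',') {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch; // commas and newlines are kept verbatim here
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === delimiter) {
      row.push(field);
      field = '';
    } else if (ch === '\n') {
      row.push(field);
      rows.push(row);
      row = [];
      field = '';
    } else if (ch !== '\r') {
      field += ch;
    }
  }
  // Flush the last row when the text does not end with a newline.
  if (field !== '' || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

const csv = 'Type,Class,Result\nAA,SG26,27%\n"FF,RF",R$%,89%\n';
const parsed = parseCsv(csv);
console.log(parsed[2].length); // 3
console.log(parsed[2][0]);     // FF,RF
```

The inQuotes flag is what tells a comma inside a value apart from a separating comma.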

Get null, empty or undefined for file type extensions if there is no extension of file exists on the server

I have a scenario in which some of the files may not have extensions, for example:
http://www.example.com/uploads/abc
When I use split and pop, as in .split('.').pop(), it works fine for http://www.example.com/uploads/def.png and http://www.example.com/uploads/xyz.pdf and returns the correct file extensions. But for http://www.example.com/uploads/abc I get com/uploads/abc, which is not intended. I have also tried slice and regular expressions, but they did not work as intended either.
Is it possible to get null, an empty string or undefined as the file extension in such a scenario?
Thanks
You can take the part after the last slash and then split that part on the dot.
If the split yields only one item, you know the file doesn't have an extension.
Here is a simple solution. Split the string on /, reverse the result and select the first element. This will always be the file name, e.g. abc or abc.pdf. If it contains a dot, split it on ., reverse again and select the first element, which is the extension; otherwise return undefined.
let string = "http://www.example.com/uploads/abc";
let string2 = "http://www.example.com/uploads/abc.pdf";
let string3 = "http://www.example.com/uploads/abc.pdf.dot.more.dots";
function getExt(str){
let last = str.split("/").reverse()[0]
return last.includes(".") ? last.split(".").reverse()[0] : undefined;
}
console.log(getExt(string));
console.log(getExt(string2));
console.log(getExt(string3));
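If the URLs can carry a query string or fragment (for example ?v=1.2), splitting the raw string can pick up dots that are not part of the file name. A minimal sketch that parses the URL first, assuming absolute URLs and using the hypothetical name getExtension:

```javascript
// Hypothetical helper: returns the extension of the last path segment of an
// absolute URL, or undefined when that segment has no extension.
function getExtension(url) {
  // URL parsing separates the path from the host, query string and fragment.
  const { pathname } = new URL(url);
  const lastSegment = pathname.slice(pathname.lastIndexOf('/') + 1);
  const dot = lastSegment.lastIndexOf('.');
  // dot <= 0 covers "no dot" as well as dot-files such as ".htaccess".
  return dot > 0 ? lastSegment.slice(dot + 1) : undefined;
}

console.log(getExtension('http://www.example.com/uploads/abc'));           // undefined
console.log(getExtension('http://www.example.com/uploads/def.png'));       // png
console.log(getExtension('http://www.example.com/uploads/xyz.pdf?v=1.2')); // pdf
```

Note that new URL() throws on relative URLs, so this only fits when the input is always a full address like the ones in the question.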

Parsing file names with javascript

I have file names like the following:
SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT
SEM_VSE_SECURITY_555001881_181002_1559_37072093.DAT
SEM_VSE_MEDICALCONDEMERGENCIES_555001881_181002_1559_37072093.DAT
SEM_REASONS_555001881_181002_1414_37072093.DAT
SEM_PSE_NPI_SECURITY_555001881_181002_1412_37072093.DAT
and I need to strip the numbers from the end. This will happen daily and the numbers will change. I HAVE to do it in JavaScript. The problem is, I know almost nothing about JavaScript. I've looked at both split and slice and I'm not sure either will work. These files come from a government entity, which means the file names will probably not be consistent.
expected output:
SEM_VSE_SKINSHARPS
SEM_VSE_SECURITY
SEM_VSE_MEDICALCONDEMERGENCIES
SEM_REASONS
SEM_PSE_NPI_SECURITY
Any help is greatly appreciated.
This is a good use case for regular expressions. For example,
var oldFileName = 'SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT',
newFileName;
newFileName = oldFileName.replace(/[_0-9]+(?=\.DAT$)/, ''); // SEM_VSE_SKINSHARPS.DAT
This says to replace as many characters as it can from the set _ and 0-9, with the requirement that the replaced portion must be followed by .DAT and the end of the string.
If you want to strip the .DAT as well, use /[_0-9]+\.DAT$/ as the regular expression instead of the one above.
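Since the question says the names may not be consistent, a slightly more general variant could be sketched like this. It assumes the numeric groups are always separated by underscores and come right before an optional extension of any kind; stripNumericSuffix is a hypothetical name:

```javascript
// Hypothetical helper: removes trailing "_digits" groups and any extension.
function stripNumericSuffix(fileName) {
  // (?:_\d+)+       one or more "_digits" groups
  // (?:\.[^.]*)?$   an optional extension at the end of the string
  return fileName.replace(/(?:_\d+)+(?:\.[^.]*)?$/, '');
}

console.log(stripNumericSuffix('SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT'));
// SEM_VSE_SKINSHARPS
console.log(stripNumericSuffix('SEM_PSE_NPI_SECURITY_555001881_181002_1412_37072093.DAT'));
// SEM_PSE_NPI_SECURITY
```

A name without any trailing numeric group is returned unchanged, extension included.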
If all the files end in a dot plus a three-character extension (like .DAT) and follow the given pattern, this might also work:
var filename = "SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT"
filename.slice(0,-4).split("_").filter(x => !+x).join("_")
results in:
"SEM_VSE_SKINSHARPS"
This is how it works:
drop the last 4 chars (.DAT)
split by _
filter out the numbers
join what is remaining with another _
You can also create a function out of this solution (or the other ones) and use it to process all the files provided they are in an array:
var fileTrimmer = filename => filename.slice(0,-4).split("_").filter(x => !+x).join("_")
var result = array_of_filenames.map(fileTrimmer)
Below is a solution that assumes your file name strings are stored in an array. It builds a new array of formatted file names by calling Array.prototype.map on the original array. The map callback first grabs the extension so it can be appended to the file name later. Next, it splits the fileName string into an array on the _ character. Finally, the filter callback returns true for every part that contains no digit, which keeps that part in the new array; for a part that contains a digit it returns false, and that part is dropped. Note that the extension is kept, so the first result is SEM_VSE_SKINSHARPS.DAT.
var fileNames = ['SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT', 'SEM_VSE_SECURITY_555001881_181002_1559_37072093.DAT', 'SEM_VSE_MEDICALCONDEMERGENCIES_555001881_181002_1559_37072093.DAT', 'SEM_REASONS_555001881_181002_1414_37072093.DAT', 'SEM_PSE_NPI_SECURITY_555001881_181002_1412_37072093.DAT'];
var formattedFileNames = fileNames.map(fileName => {
var ext = fileName.substring(fileName.indexOf('.'), fileName.length);
var parts = fileName.split('_');
return parts.filter(part => !part.match(/[0-9]/g)).join('_') + ext;
});
console.log(formattedFileNames);

how to split list of emails with javascript split

I am having trouble with javascript split method. I would like some code to 'split' up a list of emails.
example: test#test.comfish#fish.comnone#none.com
how do you split that up?
Regardless of programming language, you will need to write (create) artificial intelligence which will recognize emails (since there is no pattern).
But since you are asking how to do it, I assume that you need a really simple solution. In that case, split the text based on .com, .net, .org ...
This is easy to do, but it will probably generate a lot of invalid emails.
UPDATE: Here is a code example for the simple solution (please note that this works only for domains whose TLD has exactly 3 letters, like .com, .net, .org, .biz):
var emails = "test#test.comfish#fish.comnone#none.com"
var emailsArray = new Array()
while (emails !== '')
{
//ensures that dot is searched after # symbol (so it can find this email as well: test.test#test.com)
//adding 4 characters makes up for dot + TLD ('.com'.length === 4)
var endOfEmail = emails.indexOf('.', emails.indexOf('#')) + 4
var tmpEmail = emails.substring(0, endOfEmail)
emails = emails.substring(endOfEmail)
emailsArray.push(tmpEmail)
}
alert(emailsArray)
This code has downsides of course:
It won't work for TLDs that are not exactly 3 characters long
It won't work if the domain has a subdomain, like test#test.test.com
But I believe that it has the best time_to_do_it/percent_of_valid_emails ratio, since it takes very little time to write.
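A variation on the same idea is to split on a list of known TLDs with a regular expression, which also copes with TLDs of other lengths and with subdomains. This is only a sketch: splitEmails and its default TLD list are hypothetical, the "#" separator follows the example above, and the result is wrong whenever a domain label starts with one of the listed TLDs (for example test#mail.community.org):

```javascript
// Hypothetical helper: splits a run of concatenated addresses, assuming each
// address ends in one of the listed top-level domains.
function splitEmails(text, tlds = ['com', 'net', 'org']) {
  // .+?#      shortest possible local part up to the separator
  // .+?\.TLD  shortest possible domain that ends in a listed TLD
  const pattern = new RegExp('.+?#.+?\\.(?:' + tlds.join('|') + ')', 'g');
  // match() returns null when nothing matches, so fall back to an empty array.
  return text.match(pattern) || [];
}

console.log(splitEmails('test#test.comfish#fish.comnone#none.com'));
// three entries: test#test.com, fish#fish.com, none#none.com
```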
Assuming you have different domains, like .com, .net etc and can't just split on .com, AND assuming your domain names and recipient names are the same like in each of your three examples, you might be able to do something crazy like this:
var emails = "test#test.comfish#fish.comnone#none.com"
// get the string between # and . to get the domain name
var domain = emails.substring(emails.lastIndexOf("#")+1,emails.lastIndexOf("."));
// split the string on the index before "domain#"
var last_email = split_on(emails, emails.indexOf( domain + "#" ) );
function split_on(value, index) {
return value.substring(0, index) + "," + value.substring(index);
}
// this gives the first emails together and splits "none#none.com"
// I'd loop through repeating this sort of process but moving in the
// index of the length of the email, so that you split the inner emails too
alert(last_email);
>>> test#test.comfish#fish.com,none#none.com

Is there any way for me to work with this 100,000 item new-line separated string of words?

I've got a 100,000+ long list of English words in plain text. I want to use split() to convert the list into an array, which I can then convert to an associative array, giving each list item a key equal to its own name, so I can very efficiently check whether or not a string is an English word.
Here's the problem:
The list is new-line separated.
aa
aah
aahed
aahing
aahs
aal
aalii
aaliis
aals
This means that var list = ' <copy/paste list> ' isn't going to work, because JavaScript quotes don't work multi-line.
Is there any way for me to work with this 100,000 item new-line separated string?
Replace the newlines with commas in any text editor before copying the list into your JS file.
One workaround would be to paste the list into Notepad++, then select all and choose Edit > Line Operations > Join Lines.
This removes the newlines and replaces them with spaces.
If you're doing this client side, you can use jQuery's get function to get the words from a text file and do the processing there:
jQuery.get('wordlist.txt', function(results){
//Do your processing on results here
});
If you're doing this in Node.js, follow the guide here to see how to read a file into memory.
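However the text ends up in a string, the lookup structure from the question can be built without editing the list by hand. A minimal sketch using a Set in place of the associative array; buildWordSet is a hypothetical name and the short sample string stands in for the full file contents:

```javascript
// Hypothetical helper: turns newline-separated text into a Set for fast
// membership checks.
function buildWordSet(text) {
  const words = text
    .split(/\r?\n/)                 // handles both \n and \r\n line endings
    .map(word => word.trim())
    .filter(word => word.length > 0);
  return new Set(words);
}

const dictionary = buildWordSet('aa\naah\naahed\naahing\n');
console.log(dictionary.has('aah')); // true
console.log(dictionary.has('ahh')); // false
```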
You can use Notepad++ or any semi-advanced text editor.
In Notepad++, press Ctrl+H to bring up the Replace dialog.
Towards the bottom, select the "Extended" search mode.
Find "\r\n" and replace it with ", ".
This removes the newlines and replaces them with commas.
This addresses the question purely from the angle of having a string and trying to work with it in JavaScript through copy and paste, specifically the statements "This means that var list = ' ' isn't going to work, because JavaScript quotes don't work multi-line." and "Is there any way for me to work with this 100,000 item new-line separated string?".
You can put the string into a comment inside a JavaScript function and read it back from the function's source text. Although counter-intuitive, this is an interesting approach. Here is the main function:
function convertComment(c) {
return c.toString().
replace(/^[^\/]+\/\*!?/, '').
replace(/\*\/[^\/]+$/, '');
}
It can be used in your situation as follows:
var s = convertComment(function() {
/*
aa
aah
aahed
aahing
aahs
aal
aalii
aaliis
aals
*/
});
At which point you may do whatever you like with s. The demo simply places it into a div for displaying.
jsFiddle Demo
Further, here is an example of taking the list of words, getting them into an array, and then referencing a single word in the array.
//previously shown code
var all = s.match(/[^\r\n]+/g);
var rand = Math.floor(Math.random() * all.length);
document.getElementById("random").innerHTML = "Random index #"+rand+": "+all[rand];
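In engines that support ES2015, a template literal (backticks) is a simpler alternative to the comment trick, because it may span multiple lines. A minimal sketch with a shortened list; it assumes the pasted text contains no backticks or ${ sequences, which would need escaping:

```javascript
// Template literals keep the line breaks exactly as pasted.
const list = `
aa
aah
aahed
`;

// Split on newlines and drop the empty entries at the start and end.
const wordList = list.split('\n').filter(word => word !== '');
console.log(wordList.length); // 3
console.log(wordList[1]);     // aah
```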
If the words are in a separate file, you can load them directly into the page and go from there. I've used a script element with a MIME type that should mean browsers ignore the content (provided it's in the head):
<script type="text/plain" id="wordlist">
aa
aah
aahed
aahing
aahs
aal
aalii
aaliis
aals
</script>
<script>
var words = (function() {
var words = '\n' + document.getElementById('wordlist').textContent + '\n';
return {
checkWord: function (word) {
return words.indexOf('\n' + word + '\n') != -1;
}
}
}());
console.log(words.checkWord('aaliis')); // true
console.log(words.checkWord('ahh')); // false
</script>
The result is an object with one method, checkWord, that has access to the word list in a closure. You could add more methods like addWord or addVariant, whatever.
Note that textContent may not be supported in all browsers; you may need to feature-detect and fall back to innerText or an alternative in some.
For variety, another solution is to put the unaltered content into
A data attribute - HTML attributes can contain newlines
or a "non-script" script - eg. <SCRIPT TYPE="text/x-wordlist">
or an HTML comment node
or another hidden element that allows content
Then the content could be read and split/parsed. Since this would be done outside of JavaScript's string literal parsing it doesn't have the issue regarding embedded newlines.
