Paste values from one sheet to another and remove duplicates - javascript

I have two worksheets in my google spreadsheet:
Input data is coming into the Get Data worksheet via the importxml function.
However, I would like to copy all values of the Get Data sheet to the Final Data sheet and, where rows are duplicated, append only the unique rows.
Here is what I tried:
function onEdit() {
//get the data from old Spreadsheet
var ss = SpreadsheetApp.openById("1bm2ia--F2b0495iTJotp4Kv1QAW-wGUGDUROwM9B-D0");
var dataRange = ss.getSheetByName("Get Data").getRange(1, 1, ss.getLastRow(), ss.getLastColumn());
var dataRangeFinalData = ss.getSheetByName("Final Data").getRange(1, 1, ss.getLastRow(), ss.getLastColumn());
var myData = dataRange.getValues();
//Open new Spreadsheet & paste the data
newSS = SpreadsheetApp.openById("1bm2ia--F2b0495iTJotp4Kv1QAW-wGUGDUROwM9B-D0");
Logger.log(newSS.getLastRow());
newSS.getSheetByName("Final Data").getRange(newSS.getLastRow()+1, 1, ss.getLastRow(), ss.getLastColumn()).setValues(myData);
//remove duplicates in the new sheet
removeDups(dataRangeFinalData)
}
function getId() {
Browser.msgBox('Spreadsheet key: ' + SpreadsheetApp.getActiveSpreadsheet().getId());
}
function removeDups(array) {
var outArray = [];
array.sort(lowerCase);
function lowerCase(a,b){
return a.toLowerCase()>b.toLowerCase() ? 1 : -1;// sort function that does not "see" letter case
}
outArray.push(array[0]);
for(var n in array){
Logger.log(outArray[outArray.length-1]+' = '+array[n]+' ?');
if(outArray[outArray.length-1].toLowerCase()!=array[n].toLowerCase()){
outArray.push(array[n]);
}
}
return outArray;
}
Below you can find the link to a sample spreadsheet:
Sample Sheet
My problem is that the data does not get pasted.
I appreciate your replies!

tl;dr: See script at bottom.
An onEdit() function is inappropriate for your use case, as cell contents modified by spreadsheet functions are not considered "edit" events. You can read more about that in this answer. If you want this to be automated, then a timed trigger function would be appropriate. Alternatively, you could manually invoke the function by a menu item, say. I'll leave that to you to decide, as the real meat of your problem is how to ensure row-level uniqueness in your final data set.
Merging unique rows
Although your original code is incomplete, it appears you were intending to first remove duplicates from the source data, utilizing case-insensitive string comparisons. I'll suggest instead that some other JavaScript magic would help here.
We're interested in uniqueness in our destination data, so we need to have a way to compare new rows to what we already have. If we had arrays of strings or numbers, then we could just use the techniques in How to merge two arrays in Javascript and de-duplicate items. However, there's a complication here, because we have an array of arrays, and arrays cannot be directly compared.
Hash
Fine - we could still compare rows element-by-element, which would require a simple loop over all columns in the rows we were comparing. Simple, but slow: comparing every new row against every existing row is what we would call an O(n²) solution (order n-squared). As the number of rows to compare increased, the number of comparison operations would grow quadratically. So, let's not do that.
Instead, we'll create a separate data structure that mirrors our destination data but is very efficient for comparisons, a hash.
In JavaScript we can quickly access the properties of an object by their name, or key. Further, that key can be any string. We can create a simple hash table then, with an object whose properties are named using strings generated from the rows of our destination data. For example, this would create a hash object, then add the array row to it:
var destHash = {};
destHash[row.join('')] = true; // could be anything
To create our key, we're joining all the values in the row array with no separator. Now, to test for uniqueness of a row, we just check for existence of an object property with an identically-formed key. Like this:
var alreadyExists = destHash.hasOwnProperty(row.join(''));
One additional consideration: since the source data can conceivably contain duplicate rows that aren't yet in the destination data, we need to continuously expand the hash table as unique rows are identified.
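One caveat with joining on an empty separator is that two different rows can produce the same key. The following is a minimal sketch of a key that keeps cell boundaries, assuming JSON.stringify is acceptable for the cell values involved; the names rowKey and uniqueRows are hypothetical and not part of the original answer.

```javascript
// Build a string key that keeps cell boundaries, so different rows cannot collide.
function rowKey(row) {
  return JSON.stringify(row);
}

// Keep only rows whose key has not been seen yet, recording new keys as we go.
// "seen" plays the role of destHash in the text above.
function uniqueRows(rows, seen) {
  return rows.filter(function (row) {
    var key = rowKey(row);
    if (Object.prototype.hasOwnProperty.call(seen, key)) {
      return false; // already known
    }
    seen[key] = true; // remember for later comparisons
    return true;
  });
}

var seen = {};
var result = uniqueRows([['ab', 'c'], ['a', 'bc'], ['ab', 'c']], seen);
// With join('') the first two rows would both produce "abc";
// here their keys differ, so both are kept and only the third row is dropped.
```

The same seen object can be pre-filled from the destination rows before filtering the source rows, which mirrors how the hash is grown in the approach described above.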
Filter & Concatenate
JavaScript provides two built-in array methods that we'll use to filter out known rows, and concatenate only unique rows to our destination data.
In its simple form, that would look like this:
// Concatenate source rows to dest rows if they satisfy a uniqueness filter
var mergedData = destData.concat(sourceData.filter(function (row) {
// Return true if given row is unique
}));
You can read that as "create an array named mergedData that consists of the current contents of the array named destData, with filtered rows of the sourceData array concatenated to it."
You'll find in the final function that it's a little more complex due to the other considerations already mentioned.
Update spreadsheet
Once we have our mergedData array, it just needs to be written into the destination Sheet.
Padding rows: The source data contains rows of inconsistent width, which will be a problem when calling setValues(), which expects all rows to be squared off. This will require that we examine and pad rows to avoid this sort of error:
Incorrect range width, was 6 but should be 5 (line ?, file "Code")
Padding rows is done by pushing blank "cells" at the end of the row array until it reaches the intended length.
for (var col=mergedData[row].length; col<mergedWidth; col++)
mergedData[row].push('');
With that taken care of for each row, we're finally ready to write out the result.
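As a standalone illustration of the padding step, here is a small sketch that squares off any array of rows to the width of its widest row. The helper name padRows is hypothetical, and unlike the loop above it returns a new array rather than modifying the rows in place.

```javascript
// Pad every row with '' up to the width of the widest row; returns a new array.
function padRows(rows) {
  var width = rows.reduce(function (max, row) {
    return Math.max(max, row.length);
  }, 0);
  return rows.map(function (row) {
    var padded = row.slice(); // copy, so the input rows are left untouched
    while (padded.length < width) {
      padded.push('');
    }
    return padded;
  });
}

var squared = padRows([['a', 'b', 'c'], ['d'], []]);
// Every row in "squared" now has three cells.
```

Computing the width from the data itself avoids having to know in advance which of the two sheets is wider.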
Final script
function appendUniqueRows() {
var ss = SpreadsheetApp.getActive();
var sourceSheet = ss.getSheetByName('Get Data');
var destSheet = ss.getSheetByName('Final Data');
var sourceData = sourceSheet.getDataRange().getValues();
var destData = destSheet.getDataRange().getValues();
// Check whether destination sheet is empty
if (destData.length === 1 && "" === destData[0].join('')) {
// Empty, so ignore the phantom row
destData = [];
}
// Generate hash for comparisons
var destHash = {};
destData.forEach(function(row) {
destHash[row.join('')] = true; // could be anything
});
// Concatenate source rows to dest rows if they satisfy a uniqueness filter
var mergedData = destData.concat(sourceData.filter(function (row) {
var hashedRow = row.join('');
if (!destHash.hasOwnProperty(hashedRow)) {
// This row is unique
destHash[hashedRow] = true; // Add to hash for future comparisons
return true; // filter -> true
}
return false; // not unique, filter -> false
}));
// Check whether two data sets were the same width
var sourceWidth = (sourceData.length > 0) ? sourceData[0].length : 0;
var destWidth = (destData.length > 0) ? destData[0].length : 0;
if (sourceWidth !== destWidth) {
// Pad out all columns for the new row
var mergedWidth = Math.max(sourceWidth,destWidth);
for (var row=0; row<mergedData.length; row++) {
for (var col=mergedData[row].length; col<mergedWidth; col++)
mergedData[row].push('');
}
}
// Write merged data to destination sheet
destSheet.getRange(1, 1, mergedData.length, mergedData[0].length)
.setValues(mergedData);
}


Split multiple JSON strings into a structured table using Google Apps Script

I am trying to split a data set with an ID and JSON string into a structured table.
The difficult part is that I need it to be dynamic: the JSON string varies often, and I want the headings to be determined by the unique values in the input column at that time. I need the script to be able to create headings if the string changes, without needing to recode the script.
We have about 150 different JSON strings we are hoping to use this script on, without recoding it for each one. Each string has lots of data points.
I have a script working, but it splits them one by one. I need to build something that will do them in bulk in one go, by looping through all outputs in column B, creating a column for each unique field across all the strings, and then populating them.
The script works if I paste the additional info straight in; however, I am having trouble reading it from the sheet:
var inputsheet = SpreadsheetApp.getActive().getSheetByName("Input");
var outputsheet = SpreadsheetApp.getActive().getSheetByName("Current Output");
var additionalinfo = inputsheet.getRange(1,1).getValue()
Logger.log(additionalinfo)
var rows = [],
data;
for (i = 0; i < additionalinfo.length; i++) {
for (j in additionalinfo[i]) {
dataq = additionalinfo[i][j];
Logger.log(dataq);
rows.push([j, dataq]);
}
dataRange = outputsheet.getRange(1, 1, rows.length, 2);
dataRange.setValues(rows);
}
}
Here is a link to the sample data. Note that Samples 1 and 2 have different headings; we need the script to identify this and create headings for both.
https://docs.google.com/spreadsheets/d/1BMiVuAgDbibLw6yUG3IZ9iw4MZTaVVegkw_k3ItQ4mU/edit#gid=0
Try this script, which produces dynamic headers based on the JSON that has been read. It collects all the JSON data, gets its keys, and removes the duplicates.
Script:
function JSON_SPLITTER() {
var spreadsheet = SpreadsheetApp.getActive();
var inputsheet = spreadsheet.getSheetByName("Input");
var outputsheet = spreadsheet.getSheetByName("Current Output");
var additionalinfo = inputsheet.getDataRange().getValues();
var keys = [];
// prepare the additionalInfo data to be parsed for later
var data = additionalinfo.slice(1).map(row => {
// collect all keys in an array
if (JSON.parse(row[1]).additionalInfo) {
keys.push(Object.keys(JSON.parse(row[1]).additionalInfo));
return JSON.parse(row[1]).additionalInfo;
}
else {
keys.push(Object.keys(JSON.parse(row[1])));
return JSON.parse(row[1]);
}
});
// unique values of keys, modified to form header
var headers = [...new Set(keys.flat())]
// Add A1 as the header for the ids
headers.unshift(additionalinfo[0][0]);
// set A1 and keys as headers
var output = [headers]
// build output array
additionalinfo.slice(1).forEach((row, index) => {
var outputRow = [];
headers.forEach(column => {
if(column == 'Contract Oid')
outputRow.push(row[0]);
else
outputRow.push(data[index][column]);
});
output.push(outputRow)
});
outputsheet.getRange(1, 1, output.length, output[0].length).setValues(output);
}
Update:
Modified script for no-additionalInfo key objects.
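The header-building idea can also be tried outside a spreadsheet. The sketch below is a plain JavaScript version that works on an array of [id, jsonText] pairs; the function name buildTable and the sample values are hypothetical, and it assumes (as the script above does) that the payload may or may not be nested under an additionalInfo key.

```javascript
// rows: [[id, jsonText], ...] without the header row.
// idHeader: label for the first column (cell A1 in the script above).
function buildTable(rows, idHeader) {
  var parsed = rows.map(function (row) {
    var obj = JSON.parse(row[1]); // parse each cell only once
    // Assumption: use the nested "additionalInfo" object when it is present.
    return obj.additionalInfo || obj;
  });
  // Collect the union of keys in first-seen order.
  var headers = [];
  parsed.forEach(function (obj) {
    Object.keys(obj).forEach(function (key) {
      if (headers.indexOf(key) === -1) headers.push(key);
    });
  });
  var table = [[idHeader].concat(headers)];
  rows.forEach(function (row, i) {
    table.push([row[0]].concat(headers.map(function (key) {
      var value = parsed[i][key];
      return value === undefined ? '' : value; // blank when a row lacks this key
    })));
  });
  return table;
}

var table = buildTable([
  [1, '{"name":"A","size":3}'],
  [2, '{"additionalInfo":{"name":"B","colour":"red"}}']
], 'Contract Oid');
```

Placing the id by position rather than by header name means the result does not depend on what the first column happens to be called, and filling missing keys with an empty string keeps every row the same width.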

Parsing Data in Google Sheets From an Object

I have thousands of rows of data in a Google Sheets File in a column that looks something like
[{"amountMax":49.99,"amountMin":49.99,"availability":"true","color":"Brown","currency":"USD","dateSeen":["2019-04-11T08:00:00Z"],"isSale":"false","offer":"Online only","sourceURLs":["https://www.walmart.com/ip/SadoTech-Model-CXR-Wireless-Doorbell-1-Remote-Button-2-Plugin-Receivers-Operating-500-feet-Range-50-Chimes-Batteries-Required-Receivers-Beige-Fixed-C/463989633"]}]
I would like to be able to return the max value, the currency, the color attributes. How can I do that in Google Sheets. Ideally would like to do something like being able to retrieve the data attributes how I would normally in javascript like in this link here https://repl.it/#alexhoy/WetSlateblueDribbleware
However this does not seem to work for me when creating a function in script.google.com
For example, here is a slugify function which takes an input (cell) and turns it into a slug/handle without the need for looping. In Google Sheets I can then call =slugify(b2) and turn that value into slug form
/**
* Converts value to slug
* @customfunction
*/
function slugify(value) {
/*
* Convert the values in a range of cells into slugs.
* @customfunction
*/
let slug = '';
slug = value.substring(0, 100).toLowerCase();
slug = slug.replace(/[^\w\s-]/g, '');
slug = slug.replace(/\s+/g, '-');
Logger.log(slug);
return slug;
}
I want to do the same thing without looping to parse the object data above or declaring a range of values and what not.
Any suggestions on how I can do this in a simple way like shown above without the need for declaring active spreadsheet, range values and looping.
The following script will give you an idea about how to approach this task.
It assumes that:
the json data described in your question is in Cell A2.
the max value will be inserted into cell D2
the currency will be inserted into cell E2
the color will be inserted into cell F2
The script uses a temporary array to capture the values and then adds it as a row to a 2D array.
If you have many rows of data, then you will need to create a loop. I suggest that you build the arraydata progressively, and only update the target range at the end of the loop. This will give you the most efficient outcome.
function so6031098604() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getActiveSheet()
var content = JSON.parse(sheet.getRange("A2").getValue());
// temp array to capture the data
var temparray = [];
temparray.push(content[0]["amountMax"]);
temparray.push(content[0]["currency"]);
temparray.push(content[0]["color"]);
// second array to accept the row data
var arraydata =[];
arraydata.push(temparray)
// define the target range
var targetrange = sheet.getRange(2, 4, 1, 3);
// update with the arraydata
targetrange.setValues(arraydata);
}
You want a custom function that will return certain fields from a JSON array.
In the following example, the target cell can be a single cell or an array.
This example does not use an arrayformula. The mechanics of using an arrayformula with a custom function may be something that you can research here Custom SHEETNAME function not working in Arrayformula.
Note: A 30 second quota applies to the execution of a Custom function
/**
* Gets the amountMax, currency and color from the data
*
* @param {cell reference or range} range The range to analyse.
* @return amountMax, currency and color
* @customfunction
*/
function getJsonData(range) {
//so6031098606
// Test whether range is an array.
if (range.map) {
// if yes, then loop through the rows and build the row values
var jsonLine = [];
for (var i = 0; i < range.length; i++) {
var jsonValues=[];
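var parsed = JSON.parse(range[i][0]);
var v = Array.isArray(parsed) ? parsed[0] : parsed; // the sample cell wraps the object in an array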
jsonValues.push(v.amountMax);
jsonValues.push(v.currency);
jsonValues.push(v.color);
// aggregate the row values
jsonLine.push(jsonValues);
} // end i
return jsonLine;
} else {
// if no, then just return a single set of values
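var parsed = JSON.parse(range);
var v = Array.isArray(parsed) ? parsed[0] : parsed; // the sample cell wraps the object in an array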
var jsonValues = [];
jsonValues.push(v.amountMax);
jsonValues.push(v.currency);
jsonValues.push(v.color);
return [jsonValues];
}
}
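The per-cell extraction can be isolated into a small helper that is easy to check outside the spreadsheet. This is only a sketch: the name extractFields is hypothetical, and it assumes each cell holds either a JSON object or, as in the sample data in the question, an array whose first element is the object of interest.

```javascript
// Parse one cell's JSON text and return the requested fields as a single row.
function extractFields(jsonText, fields) {
  var parsed = JSON.parse(jsonText);
  // The sample cell wraps the object in an array, so unwrap the first element.
  var record = Array.isArray(parsed) ? parsed[0] : parsed;
  return fields.map(function (field) {
    var value = record ? record[field] : undefined;
    return value === undefined ? '' : value; // blank when the field is missing
  });
}

var cell = '[{"amountMax":49.99,"amountMin":49.99,"color":"Brown","currency":"USD"}]';
var row = extractFields(cell, ['amountMax', 'currency', 'color']);
```

A custom function could call such a helper once for a single cell, or once per row when it receives a range, and return the collected rows as a 2D array.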

Is it possible to pull different data sets from one column?

I've been trying to write some code that looks down one column with strings based on some simple formulas. I can't seem to get it to recognize the different sets of data and paste them where I want them.
I have tried rewriting the code a few different ways in which it looks at all the data and just offsets the destination row by 1. But it does not recognize that it is pulling different data.
Below is the code that works. What it does is starts from the 1st column 2nd row (where my data starts). The data is a list like;
A
1 Customer1
2 item1
3 item2
4 Item3
5
6 Customer2
7 Item1
The formulas that I have in those cells just concatenates some other cells.
Using a loop, it looks through column A and finds the blank cell. It then "breaks" at whatever number it stops on (the row number of that cell), gets the values for those cells, and transposes them into another sheet in the correct row.
The issue I am having with this code, which has worked the best so far, is that it doesn't read any of the cells as blank (because of the formulas?) and it transposes everything to the same row.
function transpose(){
var data = SpreadsheetApp.getActiveSpreadsheet();
var input =data.getSheetByName("EMAIL INPUT");
var output = data.getSheetByName("EMAIL OUTPUT");
var lr =input.getLastRow();
for (var i=2;i<20;i++){
var cell = input.getRange(i, 1).getValue();
if (cell == ""){
break
}
}
var set = input.getRange(2, 1, i-1).getValues();
output.getRange(2,1,set[0].length,set.length)
.setValues(Object.keys(set[0]).map ( function (columnNumber) {
return set.map( function (row) {
return row[columnNumber];
});
}));
Logger.log(i);
Logger.log(set);
}
What I need the code to do is look through all the data and separate the sets of data by a condition.
Then transpose that information onto another sheet. Each set (or array) of data will go into a different row, with each component filling across the columns (["customer1", "Item1", "Item2"]).
EDIT:
Is it possible to pull different data sets from a single column and turn them into arrays? I believe being able to do that will work if I use "appendRow" to transpose my different arrays to where I need them.
Test for the length of cell. Even if it is a formula, it will evaluate the result based on the value.
if (cell.length !=0){
// the cell is NOT empty, so do this
}
else
{
// the cell IS empty, so do this instead
}
EXTRA
This code takes your objective and completes the transposition of data.
The code is not as efficient as it could be, because it calls getRange and setValues inside the loop.
Ideally the entire Output Range could/should be set in one command, but the (unanswered) challenge to this is knowing in advance the maximum number rows per contiguous range so that blank values can be set for rows that have less than the maximum number of rows.
This would be a worthwhile change to make.
function so5671809203() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var inputsheetname = "EMAIL_INPUT";
var inputsheet = ss.getSheetByName(inputsheetname);
var outputsheetname = "EMAIL_OUTPUT";
var outputsheet = ss.getSheetByName(outputsheetname);
var inputLR =inputsheet.getLastRow();
Logger.log("DEBUG: the last row = "+inputLR);
var inputrange = inputsheet.getRange(1, 1,inputLR+1);
Logger.log("the input range = "+inputrange.getA1Notation());
var values = inputrange.getValues();
var outputval=[];
var outputrow=[];
var counter = 0; // to count number of columns in array
for (var i=0;i<inputLR+1;i++){
Logger.log("DEBUG: Row:"+i+", Value = "+values [i][0]+", Length = "+values [i][0].length);
if (values [i][0].length !=0){
// add this to the output sheet
outputrow.push(values [i][0]);
counter = counter+1;
Logger.log("DEBUG: value = "+values [i][0]+" to be added to array. New Array Value = "+outputrow+", counter = "+counter);
}
else
{
// do nothing with the cell, but add the existing values to the output sheet
Logger.log("DEBUG: Found a space - time to update output");
// push the values onto a clean array
outputval.push(outputrow);
// reset the row array
outputrow = [];
// get the last row of the output sheet
var outputLR =outputsheet.getLastRow();
Logger.log("DEBUG: output last row = "+outputLR);
// define the output range
var outputrange = outputsheet.getRange((+outputLR+1),1,1,counter);
Logger.log("DEBUG: the output range = "+outputrange.getA1Notation());
// update the values with array
outputrange.setValues(outputval);
// reset the row counter
counter = 0;
//reset the output value array
outputval=[];
}
}
}
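The single-write variant mentioned above (setting the whole output range in one command) comes down to grouping first and padding afterwards. The following is a rough sketch in plain JavaScript, assuming the input has the shape of a one-column getValues() result; the function name groupByBlank is hypothetical.

```javascript
// column: [[value], [value], ...] as a one-column getValues() result would look.
// Returns one row per group of non-blank cells, padded to the widest group.
function groupByBlank(column) {
  var groups = [];
  var current = [];
  column.forEach(function (cellRow) {
    var value = cellRow[0];
    if (String(value).length !== 0) {
      current.push(value);
    } else if (current.length > 0) {
      groups.push(current); // a blank cell closes the current group
      current = [];
    }
  });
  if (current.length > 0) groups.push(current); // last group may have no trailing blank
  var width = groups.reduce(function (max, g) {
    return Math.max(max, g.length);
  }, 0);
  return groups.map(function (g) {
    var padded = g.slice();
    while (padded.length < width) padded.push('');
    return padded;
  });
}

var grouped = groupByBlank([
  ['Customer1'], ['item1'], ['item2'], ['Item3'], [''], ['Customer2'], ['Item1']
]);
// grouped has two rows of equal width: one per customer.
```

Because every row ends up the same width, the result could be written with a single setValues call on a range of grouped.length rows by grouped[0].length columns, instead of one write per group.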
Email Input and Output Sheets

Google sheets script (javascript) compare loop ignoring bottom 2 rows in spreadsheet

I wrote a script that compares rows between two sheets and deletes all matching rows on the second sheet (named 'temp'). I set the loop to start at the end of temp and decrement, working toward the top. The script works but it ignores the bottom two rows on 'temp'...how can I fix this? I want to ensure it will delete the bottom two rows on temp when they match the data set from the other sheet.
I have confirmed that the bottom two rows are in fact duplicates and should be caught by the script and deleted.
Script:
function trimTempSheet() {
var ss, s, s1, dt;
var dirname='X DIR'
var fs, f, fls, fl, name;
var ncols=1,i, newRows, rw;
ss=SpreadsheetApp.getActiveSpreadsheet();
s=ss.getSheetByName('Report Results');
name = 'temp';
//Load current sheet to compare
var currentDataSet = s.getRange("A:S").getValues(); //Ignore final columns
var newSheet = ss.getSheetByName(name);
//Load imported data to compare
var newData = newSheet.getRange("A:S").getValues();
var headers = newData.shift();
//Create empty array to store data to be written [to add later]
newRows=[];
//Compare data from newData with current data
for(var i = newData.length-1; i > 0; --i)
{
for(var j in currentDataSet)
{
if(newData[i].join() == currentDataSet[j].join() )
{
newSheet.deleteRow(i);
}
}
}
How about this modification?
From :
for(var i = newData.length-1; i > 0; --i)
To :
for(var i = newData.length-1; i >= 0; --i)
Note :
About "the bottom two rows on 'temp'",
With the above modification, the for loop runs one more iteration, so the element at index 0 is also checked.
In your script, the first element of newData is removed by var headers = newData.shift();, which makes newData one element shorter.
For example, how about also removing this line?
If this modification was not useful for you, I'm sorry. At that time, can you show us your sample spreadsheet?
I do not know the optimal way to force the script to look at and delete the last row first, without skipping it. So I changed the delete line to "newSheet.deleteRow(i+1);"
Will this produce any unintended row deletes based on my iteration loops? I am not an expert in how Google Apps Script handles arrays. I assume the loops will examine rows in newData (the sheet named 'temp') sequentially, from the last row to the first, in which case my solution would be acceptable. But I am not certain of this.
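To make the index arithmetic in this thread concrete, here is a small sketch that only computes which sheet rows would be deleted, without touching a spreadsheet. It assumes both arrays still include their header row (that is, the shift() line has been removed), so array index i corresponds to sheet row i + 1; if the header is removed with shift(), the offset becomes i + 2 instead. The function name rowsToDelete is hypothetical.

```javascript
// newData and currentData include their header rows (index 0 = sheet row 1).
// Returns 1-based sheet row numbers to delete, bottom row first.
function rowsToDelete(newData, currentData) {
  // Index the current rows once, using the same join() comparison as the question.
  var known = {};
  currentData.forEach(function (row) {
    known[row.join()] = true;
  });
  var rows = [];
  for (var i = newData.length - 1; i >= 1; --i) { // stop before the header at index 0
    if (known[newData[i].join()] === true) {
      rows.push(i + 1); // array index i is sheet row i + 1
    }
  }
  return rows;
}

var toDelete = rowsToDelete(
  [['h1', 'h2'], ['a', 1], ['b', 2], ['c', 3]],
  [['h1', 'h2'], ['c', 3], ['a', 1]]
);
// toDelete lists the matching rows from the bottom of the sheet upwards.
```

Each returned number could then be passed to newSheet.deleteRow(...) in the order given; deleting from the bottom up keeps the row numbers of the rows above unchanged. Collecting each row at most once also avoids deleting twice when a row matches more than one row in the other sheet.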

Use google script to put 2 columns into a single multi dimensional array

I'm looking for a way to take 2 columns in a google spreadsheet and merge them into a single array in hopes that I can take these 2 columns and use setValues on a new sheet.
Why?
I'm eventually taking 2 different sheets and basically doing a large scale vlookup and transferring all results and desired columns into a single, new sheet. I can get the full dataRange, loop through each array, grabbing the values I want and pushing them to a new array. But is there an easier way? If I can look through just row1 and get the headers and their index, can I just put all of column A and column D in a multi-dimensional array?
Example
Header1 | H2 | H3
I want H1 and H3 and their rows so I can put them in a new sheet as such
Multi-Dimensional Array:
[ [H1, H3], [dataH1,dataH3] ]
Current Code
var freqArr = new Array(); //Array with sheet data
var myArray = new Array(); //Blank array to house header index
var freqSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('KEY_test_test');
var freqData = freqSheet.getDataRange(); //all data
var freqNumRows = freqData.getNumRows(); //number of rows
var freqNumCol = freqData.getNumColumns(); //number of columns
freqArr = freqSheet.getRange(1, 1, freqNumRows, freqNumCol).getValues();
for (i = 0;i<1;++i){
for (j = 0;j<freqNumCol;++j){
if (freqArr[i][j].toString() == 'Header1' || freqArr[i][j].toString() == 'Header3'){
myArray.push([j]);
}
}
}
Logger.log(myArray);
Where I'm Stuck
What I'm doing right now is looping through the first row to get the header indexes I want (should look like this [ 0, 2 ]) but all that is returning in my log is []. I plan to use this array of indexes to loop through my freqData and grab the indexes of each nested array.
Any advice would be great. I'm just starting to learn google script and I'm teaching myself. Thanks
UPDATE TO CODE:
It turns out that .toString() == 'Header1' will not return a match but after more google fu, I found .toString().match('Header1') == 'Header1' will return what I need. See below for update
var freqArr = new Array(); //Array with sheet data
var myArray = new Array(); //Blank array to house header index
var freqSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('KEY_test_test');
var freqData = freqSheet.getDataRange(); //all data
var freqNumRows = freqData.getNumRows(); //number of rows
var freqNumCol = freqData.getNumColumns(); //number of columns
freqArr = freqSheet.getRange(1, 1, freqNumRows, freqNumCol).getValues();
for (i = 0;i<1;++i){
for (j = 0;j<freqNumCol;++j){
if (freqArr[i][j].toString().match('Header1') == 'Header1' || freqArr[i][j].toString().match('Header3') == 'Header3'){
myArray.push(j);
}
}
}
Logger.log(myArray);
will return [ 0.0 , 2.0 ].
But still, my question remains, is there a faster way to get 2(n) columns that are not side-by-side and put them into an array so that you can use .setValues?
Answer
But still, my question remains, is there a faster way to get 2(n) columns that are not side-by-side and put them into an array so that you can use .setValues?
Yes, there are many ways. One of them is the use of a JavaScript method: Array.prototype.forEach()
Code
function myFunction() {
var sheet = SpreadsheetApp.getActiveSheet();
var data = sheet.getDataRange().getValues();
var array = [];
data.forEach(function(row){
array.push([row[0],row[5]]);
});
sheet.getRange(1,10,array.length,2).setValues(array);
}
Explanation
Get the active sheet
var sheet = SpreadsheetApp.getActiveSheet();
Get all the values on the sheet
var data = sheet.getDataRange().getValues();
Initialize a variable to hold the array
var array = [];
Get the values of the first and sixth columns (A and F) (zero based index)
data.forEach(function(row){
array.push([row[0],row[5]]);
});
Return the values to a range starting on J1 and ending on column K and the required row (one based index)
sheet.getRange(1,10,array.length,2).setValues(array);
Take a look at the getRowsData() function in the Simple Mail Merge tutorial. It gets all the data in a sheet and returns it as objects. You could then access the data as myData[i].header1. It removes spaces and "normalizes" the headers, so a header such as My Header name becomes myData[i].myHeaderName.
You could limit the returned data to only the columns you need if you wish.
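If the columns are to be chosen by header name rather than by fixed position, the lookup and the extraction can be combined. Below is a minimal sketch in plain JavaScript; the name pickColumns is hypothetical, and trimming the header text is an assumption about one possible reason why a direct == comparison on a header might fail (stray whitespace in the cell).

```javascript
// data: 2D array whose first row holds the headers.
// wanted: header names to keep, in the order they should appear.
function pickColumns(data, wanted) {
  var headerRow = data[0].map(function (h) {
    return String(h).trim(); // assumption: ignore stray whitespace around header text
  });
  var indexes = wanted.map(function (name) {
    return headerRow.indexOf(name);
  });
  var missing = wanted.filter(function (name, i) {
    return indexes[i] === -1;
  });
  if (missing.length > 0) {
    throw new Error('Header not found: ' + missing.join(', '));
  }
  // Keep only the wanted columns from every row, header row included.
  return data.map(function (row) {
    return indexes.map(function (i) {
      return row[i];
    });
  });
}

var picked = pickColumns([
  ['Header1', 'H2', 'Header3 '],
  ['a1', 'b1', 'c1'],
  ['a2', 'b2', 'c2']
], ['Header1', 'Header3']);
// picked contains only the first and third columns of every row.
```

The returned array is rectangular (one cell per wanted header in each row), so it has the shape setValues expects for a range of picked.length rows by wanted.length columns.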
