Using Google sheets to tally data sets - javascript

I have tried many formulas but I am still not able to get what I want, so I need help writing an Apps Script for it. The problem is that I have to match two data sets and return the value of the adjacent cell. I want to take the value from the first cell of the first row of one sheet, match it against every cell of a row in another sheet (in the same workbook), and then paste the value that was being matched next to the cell that matches it. My data sets are not equal, so I cannot use VLOOKUP; I want to measure how closely each candidate matches, as a percentage, and treat the highest percentage as the match. Kindly visit this link for an example in Google Sheets: [https://docs.google.com/spreadsheets/d/1u_-64UvpirL2JHpgA--GDa263wVb2idIhIYZlFnX2xQ/edit?usp=sharing]

There are a variety of ways to do this sort of partial matching, depending on the real data and how sophisticated you need the match logic to be.
Let's start with the simplest solution first. Did you know you can use wildcards in VLOOKUP? See Vlookup in Google Sheets using wildcards for partial matches.
So for your example data, add a column C to "Set 1" with the formula:
=VLOOKUP("*" & A2 & "*",'Set 2'!A1:A5,1,FALSE)
Obviously, this method fails if "Baseball bat" was supposed to be the result for "Ball" instead of "Ballroom". VLOOKUP will simply return the first result that matches. This method is also case-insensitive. Finally, this method only works for appending data to set 1 from set 2, not the other way around. Without knowing more about the actual dataset, it's hard to give a solid solution.
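For the percentage-based matching the question asks about, one possible approach is an Apps Script custom function that scores every candidate and keeps the highest score. The sketch below uses Levenshtein edit distance as the similarity measure, which is an assumption (the question does not say how the percentage should be computed). The names levenshtein, similarityPercent and BEST_MATCH are hypothetical, not built-in functions.

```javascript
// Hypothetical helper: Levenshtein edit distance between two strings.
function levenshtein(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(prev[k] + 1, curr[k - 1] + 1, prev[k - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: case-insensitive similarity from 0 to 100.
function similarityPercent(a, b) {
  a = String(a).toLowerCase();
  b = String(b).toLowerCase();
  var longest = Math.max(a.length, b.length);
  if (longest === 0) return 100;
  return 100 * (1 - levenshtein(a, b) / longest);
}

// Custom function sketch, e.g. =BEST_MATCH(A2, 'Set 2'!A1:A5)
// A range argument arrives as a 2D array; a single cell arrives as a scalar.
function BEST_MATCH(key, candidates) {
  var best = '';
  var bestScore = -1;
  [].concat(candidates).forEach(function (row) {
    [].concat(row).forEach(function (cell) {
      var score = similarityPercent(key, cell);
      if (score > bestScore) {
        bestScore = score;
        best = cell;
      }
    });
  });
  return best;
}
```

Unlike the wildcard VLOOKUP, this compares the key against every candidate rather than stopping at the first hit, so a shorter, closer string can win over a longer one that merely contains the key. Whether edit distance is the right measure depends on the real data.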

Related

Google App Script to Append Value from one Cell to String of Numbers in another Cell

I’ve been trying to figure out how to write a script which will take the value from one cell and append it to the end of a string of numbers in another cell of that same row. The newly appended number needs to be separated by a comma from the previously appended value, and the whole string needs to be wrapped between brackets. EX. [2,3,3,4.5,2.5,2.1,1.3,0.4]. The script will need to loop through all of the rows containing data on a named sheet beginning with the third row.
The above image is obviously just an example containing only two rows of data. The actual spreadsheet will contain well over a thousand rows, so the operation must be done programmatically and will run weekly using a timed trigger.
To be as specific as I can, what I need help with is to first know if something like the appending is even possible in Google Apps Script. I've spent hours searching and I can't seem to find a way to append a new value (ex. cell A3) to the current string (ex. cell B3) without overwriting it completely.
In full disclosure; I'm a middle school teacher trying to put something together for my school.
To be as specific as I can, what I need help with is to first know if something like the appending is even possible in Google App Scripts.
Seeing the expected result, it's inserting rather than appending, as the string should be added before the last character (]). Anyway, yes, this is possible by using JavaScript string handling methods.
Use getValue() to get the cell values, both the Current GPA and the GPA History.
One way is to use replace.
Example using pure JavaScript:
var currentGPA = 3.5;
var gpaHistory = '[2,3.1,2.4]';
// Swap the closing bracket for ",<new value>]"
gpaHistory = gpaHistory.replace(']', ',' + currentGPA + ']');
console.info(gpaHistory); // [2,3.1,2.4,3.5]
Once you get the modified gpaHistory, use setValue(gpaHistory) to add this value to the spreadsheet.
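Putting those steps together for the whole sheet could look like the sketch below. It assumes a script bound to the spreadsheet, the current value in column A and the history string in column B (as in the question's A3/B3 example), and data starting at row 3. The sheet name 'GPA' and the function names appendToHistory and updateGpaHistory are hypothetical.

```javascript
// Pure helper: insert `value` before the closing bracket of `history`.
// An empty cell or "[]" starts a new list instead of producing "[,x]".
function appendToHistory(history, value) {
  var text = String(history).trim();
  if (text === '' || text === '[]') {
    return '[' + value + ']';
  }
  return text.replace(/\]$/, ',' + value + ']');
}

// Function to attach to a weekly time-driven trigger.
function updateGpaHistory() {
  var SHEET_NAME = 'GPA'; // assumption: replace with the real sheet name
  var FIRST_ROW = 3;      // data begins on the third row
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(SHEET_NAME);
  var numRows = sheet.getLastRow() - FIRST_ROW + 1;
  if (numRows < 1) return;

  // Read columns A and B in one call instead of one getValue() per cell.
  var values = sheet.getRange(FIRST_ROW, 1, numRows, 2).getValues();
  var histories = values.map(function (row) {
    // Leave the history untouched when there is no current value.
    return [row[0] === '' ? row[1] : appendToHistory(row[1], row[0])];
  });

  // Write back only column B so column A is never overwritten.
  sheet.getRange(FIRST_ROW, 2, numRows, 1).setValues(histories);
}
```

Reading and writing in batches with getValues() and setValues() keeps the number of spreadsheet calls constant, which matters once the sheet grows past a thousand rows.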

How to search for closest tag set match in JavaScript?

I have a set of documents, each annotated with a set of tags, which may contain spaces. The user supplies a set of possibly misspelled tags and I want to find the documents with the highest number of matching tags (optionally weighted).
There are several thousand documents and tags but at most 100 tags per document.
I am looking for a lightweight and performant solution where the search runs fully on the client side using JavaScript, but some preprocessing of the index with node.js is possible.
My idea is to create an inverted index of tags to documents using a multiset, and a fuzzy index that finds the correct spelling of a misspelled tag. Both are created in a preprocessing step in node.js and serialized as JSON files. In the search step, for each item of the query set I want to first consult the fuzzy index to get the most likely correct tag and, if one exists, consult the inverted index and add the result set to a bag (a multiset that keeps counts). After doing this for all input tags, the contents of the bag, sorted in descending order, should provide the best matching documents.
My Questions
This seems like a common problem, is there already an implementation for it that I can reuse? I looked at lunr.js and fuse.js but they seem to have a different focus.
Is this a sensible approach to the problem? Do you see any obvious improvements?
Is it better to keep the fuzzy step separate from the inverted index or is there a way to combine them?
You should be able to achieve what you want using Lunr. Here is a simplified example (and a jsfiddle):
var documents = [{
  id: 1, tags: ["foo", "bar"]
}, {
  id: 2, tags: ["hurp", "durp"]
}]

var idx = lunr(function (builder) {
  builder.ref('id')
  builder.field('tags')

  documents.forEach(function (doc) {
    builder.add(doc)
  })
})

// Fuzzy queries: the number after ~ is the maximum edit distance
console.log(idx.search("fob~1"))
console.log(idx.search("hurd~2"))
This takes advantage of a couple of features in Lunr:
If a document field is an array, Lunr assumes the elements are already tokenised. This allows you to index tags that include spaces as-is, i.e. "foo bar" would be treated as a single tag (if this is what you wanted; it wasn't clear from the question).
Fuzzy search is supported, here using the query string format. The number after the tilde is the maximum edit distance; the Lunr documentation goes into the details.
The results will be sorted by how well each document matches the query. In simple terms, documents that contain more matching tags will rank higher.
Is it better to keep the fuzzy step separate from the inverted index or is there a way to combine them?
As ever, it depends. Lunr maintains two data structures, an inverted index and a graph. The graph is used for doing the wildcard and fuzzy matching. It keeps separate data structures to facilitate storing extra information about a term in the inverted index that is unrelated to matching.
Depending on your use case, it would be possible to combine the two. An interesting approach would be a finite state transducer, so long as the data you want to store is simple, e.g. an integer (think document id). There is an excellent article about this data structure, which is similar to what is used in Lunr: http://blog.burntsushi.net/transducers/
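For comparison, the two-step approach described in the question (a fuzzy lookup feeding an inverted index, with matches tallied in a bag) can be sketched without a library. This is only an illustration under simplifying assumptions: the fuzzy step is a naive linear scan of all known tags using edit distance, not the graph Lunr uses, and the names editDistance, buildInvertedIndex, closestTag and searchByTags are hypothetical.

```javascript
// Levenshtein edit distance between two strings.
function editDistance(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(prev[k] + 1, curr[k - 1] + 1, prev[k - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Preprocessing step: map each tag to the ids of documents carrying it.
// Object.create(null) avoids clashes with keys such as "constructor".
function buildInvertedIndex(documents) {
  var index = Object.create(null);
  documents.forEach(function (doc) {
    doc.tags.forEach(function (tag) {
      (index[tag] = index[tag] || []).push(doc.id);
    });
  });
  return index;
}

// Fuzzy step: the known tag closest to queryTag, or null if none is
// within maxDistance edits.
function closestTag(index, queryTag, maxDistance) {
  var best = null;
  var bestDistance = maxDistance + 1;
  Object.keys(index).forEach(function (tag) {
    var d = editDistance(queryTag, tag);
    if (d < bestDistance) {
      bestDistance = d;
      best = tag;
    }
  });
  return best;
}

// Search step: count matching tags per document and sort descending.
// Document ids come back as strings because they are used as object keys.
function searchByTags(index, queryTags, maxDistance) {
  var counts = Object.create(null);
  queryTags.forEach(function (queryTag) {
    var tag = closestTag(index, queryTag, maxDistance);
    if (tag === null) return;
    index[tag].forEach(function (id) {
      counts[id] = (counts[id] || 0) + 1;
    });
  });
  return Object.keys(counts)
    .map(function (id) { return { id: id, matches: counts[id] }; })
    .sort(function (a, b) { return b.matches - a.matches; });
}
```

Keeping the fuzzy step separate like this makes each part easy to serialize as JSON, at the cost of scanning every tag per query term. With a few thousand tags that may be acceptable, but it is worth measuring before ruling out Lunr.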

How to lowercase field name in pdi (pentaho)?

I'm new to PDI and I need to extract data from CSV files; however, the field names are sometimes in lowercase and sometimes in uppercase.
I know how to change the case of row values, but not of field names.
Is there a step that does this?
I tried ${fieldName}.lower() and lower(${fieldName}) in Select values and in a JavaScript step, but without success.
Thanks in advance.
The quick fix is to right-click the list of columns provided by the CSV file input and copy/paste it back and forth into Excel (or whatever).
If you have many input files (say 150), the step that dynamically changes the column names (and other metadata such as type) is called Metadata Injection; see the Kettle doc. The official doc gives details and examples.
Your specific case is covered in BizCubed. Download the sample near the end of the web page, unzip it, and load the ktr in PDI. You'll need to adapt the Fields step in the MetaDataInjection transformation. It is currently a DataGrid, which you can replace with a JavaScript lowercase (or, better, a String operation) after keeping only the first line of your CSV (read it with header NOT present, include the row number, and filter on rownumber=1).
If you want to change a column name you can use the 'Select values' step.
There is a 'Rename to' option in the 'Select & Alter' tab as well as the 'Meta-data' tab that you can use to change a column name to whatever you want.

How to write a Google Sheets script function that identifies value depending on cell in the same row

How would I go about writing a custom function in Google Sheets that retrieves a value if the text in a cell in the same row equals what's identified in the function?
For example, if I have a bunch of values from a form, I'd like a function that lets me specify the full range to look in, the column that contains the number values, and the text I want the function to look for in the given range.
It's a bit confusing to just describe it. I'll make an example sheet and post it here.
Have you looked into MATCH?
If I understand the question, you could do something like:
MATCH(A1, A2:A100, 1)
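Where A1 is your key, A2:A100 is your range, and 1 is the search type. Note that search type 1 assumes the range is sorted in ascending order; use 0 for an exact match on unsorted data. MATCH returns the relative position of the key within the range, which you can pass to INDEX to fetch the value from another column.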
https://support.google.com/docs/answer/3093378
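If a custom function is still preferred, a minimal sketch might look like the following. It relies on the fact that a range passed to an Apps Script custom function arrives as a 2D array of values. The name VALUE_FOR_TEXT and its argument order are hypothetical, and returning only the first matching row is an assumption, since the question does not say what should happen with several matches.

```javascript
/**
 * Returns the value in valueColumn (1-based, relative to the range) from the
 * first row in which any cell equals text, or "" when no row matches.
 * Hypothetical usage in a cell: =VALUE_FOR_TEXT(A2:D100, 3, "Pending")
 */
function VALUE_FOR_TEXT(range, valueColumn, text) {
  for (var r = 0; r < range.length; r++) {
    var row = range[r];
    for (var c = 0; c < row.length; c++) {
      if (String(row[c]) === String(text)) {
        return row[valueColumn - 1];
      }
    }
  }
  return '';
}
```

The comparison here is exact and case-sensitive; lowercasing both sides before comparing would relax that.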

Empty data cell validation

How can I add validation on a cell that checks that the cell is not empty after a new data record is created? Something like mandatory fields.
I think this is not possible (in the way I think you are requesting). From here:
If data validation is applied to cells containing data, rules won't be applied until the data is modified.
That does not actually say "No, it is not possible" but I think can be inverted as, "No modification, no trigger for validation" - and I'm assuming your cells start off empty (though for blanks I think it does not actually make any difference).
I suggest considering alternatives: conditional formatting of cells that are empty (there is a specific provision for "Cell is empty"), a formula to count the number of blank cells in a range that should be populated (since you mention records, you might prefer a row to be flagged rather than each individual blank cell), or resorting to a script.
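As a rough illustration of the script option, the helper below reports which rows have a blank in any mandatory column. It assumes the data is available as a 2D array of cell values with empty cells represented as empty strings (which is what Apps Script's getValues() returns in Google Sheets); the name findIncompleteRows and the 1-based column convention are hypothetical.

```javascript
// Returns the 1-based positions (within `values`) of rows that have an
// empty cell in any of the required columns (1-based column numbers).
function findIncompleteRows(values, requiredColumns) {
  var incomplete = [];
  values.forEach(function (row, i) {
    var missing = requiredColumns.some(function (col) {
      var cell = row[col - 1];
      return cell === '' || cell === null || cell === undefined;
    });
    if (missing) incomplete.push(i + 1);
  });
  return incomplete;
}
```

In Apps Script, the values could come from sheet.getDataRange().getValues(), and the returned row numbers could then be used to highlight or report the incomplete records.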
