Javascript: Removing Semi-Duplicate Objects within an Array with Conditions

I am trying to remove the "Duplicate" objects within an array while retaining the object that has the lowest value associated with it.
~~Original
var array = [
{
"time": "2021-11-12T20:37:11.112233Z",
"value": 3.2
},
{
"time": "2021-11-12T20:37:56.115222Z",
"value": 3.8
},
{
"time": "2021-11-13T20:37:55.112255Z",
"value": 4.2
},
{
"time": "2021-11-13T20:37:41.112252Z",
"value": 2
},
{
"time": "2021-11-14T20:37:22.112233Z",
"value": 3.2
}
]
~~Expected Output
var array = [
{
"time": "2021-11-12T20:37:11.112233Z",
"value": 3.2
},
{
"time": "2021-11-13T20:37:41.112252Z",
"value": 2
},
{
"time": "2021-11-14T20:37:22.112233Z",
"value": 3.2
}
]
What I have so far:
var result = array.reduce((aa, tt) => {
if (!aa[tt.time]) {
aa[tt.time] = tt;
} else if (Number(aa[tt.time].value) < Number(tt.value)) {
aa[tt.time] = tt;
}
return aa;
}, {});
console.log(result);
I realize the issue with what I am trying to do is that the "time" attribute is not identical to the other time values I am considering as duplicates.
For this use case, though, I do not need millisecond precision; YYYY-MM-DDTHH:MM (to the minute) is fine. I am not sure how to implement a reduction for this case when the times aren't exactly the same. Maybe only the first 16 characters of the string could be compared?
Let me know if any additional information is needed.

So a few issues:
If you want to only check the first 16 characters to detect a duplicate, you should use that substring of tt.time as key for aa instead of the whole string.
Since you want the minimum, your comparison operator is wrong.
The code produces an object, while you want an array, so you still need to extract the values from the object.
Here is your code with those adaptations:
var array = [{"time": "2021-11-12T20:37:11.112233Z","value": 3.2},{"time": "2021-11-12T20:37:56.115222Z","value": 3.8},{"time": "2021-11-13T20:37:55.112255Z","value": 4.2},{"time": "2021-11-13T20:37:41.112252Z","value": 2},{"time": "2021-11-14T20:37:22.112233Z","value": 3.2}];
var result = Object.values(array.reduce((aa, tt) => {
var key = tt.time.slice(0, 16);
if (!aa[key]) {
aa[key] = tt;
} else if (Number(aa[key].value) > Number(tt.value)) {
aa[key] = tt;
}
return aa;
}, {}));
console.log(result);
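The same idea can also be wrapped in a small reusable helper. The following is a minimal sketch, not part of the original answer: the names dedupeByMinute and keyLength are hypothetical, and it assumes every time value is an ISO 8601 string in the same time zone, so that the first 16 characters identify the minute.

```javascript
// Hypothetical helper: keep, for each minute, the entry with the lowest value.
// Assumes every `time` is an ISO 8601 string in the same time zone, so the
// first 16 characters ("YYYY-MM-DDTHH:MM") identify the minute.
function dedupeByMinute(entries, keyLength = 16) {
  const lowest = new Map(); // a Map keeps keys in first-insertion order
  for (const entry of entries) {
    const key = entry.time.slice(0, keyLength);
    const current = lowest.get(key);
    if (current === undefined || Number(entry.value) < Number(current.value)) {
      lowest.set(key, entry);
    }
  }
  return Array.from(lowest.values());
}

const sample = [
  { time: "2021-11-12T20:37:11.112233Z", value: 3.2 },
  { time: "2021-11-12T20:37:56.115222Z", value: 3.8 },
  { time: "2021-11-13T20:37:55.112255Z", value: 4.2 },
  { time: "2021-11-13T20:37:41.112252Z", value: 2 },
  { time: "2021-11-14T20:37:22.112233Z", value: 3.2 }
];
const deduped = dedupeByMinute(sample);
console.log(deduped);
```

Using a Map rather than a plain object avoids any surprises with key ordering and makes the "not seen yet" check explicit.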

Related

I want to loop a specified number of times and display the array if any of the conditions are met

Given the following data, I want to loop a specified number of times (for example, 6 times). On the nth iteration, if an item with number = n exists, its number should be displayed with console.log; if there is no item with number = n, I want to display "no data". In this case, should I use a for statement or map?
In Python, I used Pandas to fill in the data, but how can I implement this in JavaScript?
I would appreciate it if you could tell me how to do this in JavaScript.
item = [{
"id": 1,
"number": 2
},
{
"id": 2,
"number": 3
}
]
In this case↓
no data
2
3
no data
no data
no data

item = [{
"id": 1,
"number": 2
},
{
"id": 2,
"number": 3
}
]
n = 6
const obj = {}
item.forEach((data) => {
obj[data.number] = true
})
for (let i=1;i<=n;i++) {
if (obj[i])
console.log(i);
else
console.log("no data")
}
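A variant of the same lookup idea, using a Set and returning the rows instead of logging them directly, might look like the sketch below. The name numbersOrNoData is hypothetical and not from the original answer.

```javascript
// Hypothetical helper: for each i in 1..n, yield i if some item has
// number === i, otherwise the string "no data".
function numbersOrNoData(items, n) {
  const present = new Set(items.map((data) => data.number));
  return Array.from({ length: n }, (_, index) => {
    const i = index + 1;
    return present.has(i) ? i : "no data";
  });
}

const rows = numbersOrNoData([{ id: 1, number: 2 }, { id: 2, number: 3 }], 6);
rows.forEach((row) => console.log(row));
```

Returning an array keeps the lookup separate from the output, so the same result can be logged, rendered, or tested.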

JS dynamically access & delete nested object property [duplicate]

This is the sample json:
{
"search": {
"facets": {
"author": [
],
"language": [
{
"value": "nep",
"count": 3
},
{
"value": "urd",
"count": 1
}
],
"source": [
{
"value": "West Bengal State Council of Vocational Education & Training",
"count": 175
}
],
"type": [
{
"value": "text",
"count": 175
}
]
}
}
}
There are several ways to delete key search.facets.source:
delete jsonObj.search.facets.source
delete jsonObj['search']['facets']['source']
var jsonKey = 'source';
JSON.parse(angular.toJson(jsonObj), function (key, value) {
if (key != jsonKey)
return value;
});
Options 1 and 2 above are not dynamic, and option 3 works but is not a proper solution, because it would also remove a source key that appears in any other node. Can anybody tell me how to delete a key dynamically at any nesting depth? With options 1 and 2, the sequence of keys cannot be generated dynamically.

Assuming you're starting from this:
let path = 'search.facets.source';
Then the logic is simple: find the search.facets object, then delete obj['source'] on it.
Step one, divide the path into the initial path and trailing property name:
let keys = path.split('.');
let prop = keys.pop();
Find the facets object in your object:
let parent = keys.reduce((obj, key) => obj[key], jsonObj);
Delete the property:
delete parent[prop];
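The steps above can be combined into one function that also copes with a path whose intermediate keys are missing. This is a minimal sketch: the name deleteByPath is hypothetical, and it assumes that no key in the path itself contains a dot.

```javascript
// Hypothetical helper combining the steps above. Returns true if a property
// was deleted, false if the path does not exist in the object.
function deleteByPath(root, path) {
  const keys = path.split('.');
  const prop = keys.pop();
  let parent = root;
  for (const key of keys) {
    if (parent === null || typeof parent !== 'object' || !(key in parent)) {
      return false; // an intermediate key is missing
    }
    parent = parent[key];
  }
  if (parent === null || typeof parent !== 'object' || !(prop in parent)) {
    return false; // the final property is missing
  }
  return delete parent[prop];
}

const data = { search: { facets: { source: [1], type: [2] } } };
console.log(deleteByPath(data, 'search.facets.source'));  // true
console.log(deleteByPath(data, 'search.missing.source')); // false
console.log(JSON.stringify(data));
```

Without the guards, the reduce version throws a TypeError as soon as one of the intermediate keys is undefined.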

I have found out another solution, it is very easy.
var jsonKey = 'search.facets.source';
eval('delete jsonObj.' + jsonKey + ';');

replace multiple values in json/jsObject/string

I have a response from a web service and want to replace some values in the response with my custom values.
One way is to write a tree traverser and then check for the value and replace with my custom value
so the response is some what like this:
[
{
"name": "n1",
"value": "v1",
"children": [
{
"name": "n2",
"value": "v2"
}
]
},
{
"name": "n3",
"value": "v3"
}
]
now my custom map is like this
const map = {
"v1": "v11",
"v2": "v22",
"v3": "v33"
};
All I want is
[
{
"name": "n1",
"value": "v11",
"children": [
{
"name": "n2",
"value": "v22"
}
]
},
{
"name": "n3",
"value": "v33"
}
]
I was thinking if I could stringify my response and then replace values using a custom build regex from my map of values.
Will it be faster as compared to tree traverser?
If yes, how should I do that?
somewhat like this
originalString.replace(regexp, function (replacement))

The tree traversal is faster
Note that some things could be done more efficiently in the regex implementation but I still think there are some more bottlenecks to explain.
Why the regex is slow:
There are probably many more reasons why the regex is slower but I'll explain at least one significant reason:
When you use a regex to find and replace, you create new strings every time and run your matches every time. Regular expressions can be very expensive, and my implementation isn't particularly cheap.
Why is the tree traversal faster:
In the tree traversal, I'm mutating the object directly. This doesn't require creating new string objects or any new objects at all. We're also not performing a full search on the whole string every time as well.
RESULTS
Run the performance test below. The test uses console.time to record how long each implementation takes. You will see that the tree traversal is much faster.
function usingRegex(obj, map) {
return JSON.parse(Object.keys(map).map(oldValue => ({
oldValue,
newValue: map[oldValue]
})).reduce((json, {
oldValue,
newValue
}) => {
return json.replace(
new RegExp(`"value":"(${oldValue})"`, 'g'), // 'g' so every occurrence is replaced
() => `"value":"${newValue}"`
);
}, JSON.stringify(obj)));
}
function usingTree(obj, map) {
function traverse(children) {
for (let item of children) {
if (item && map[item.value] !== undefined) {
// getting a value from a JS object is O(1)!
// only replace values that have an entry in the map
item.value = map[item.value];
}
if (item && item.children) {
traverse(item.children)
}
}
}
traverse(obj);
return obj; // mutates
}
const obj = JSON.parse(`[
{
"name": "n1",
"value": "v1",
"children": [
{
"name": "n2",
"value": "v2"
}
]
},
{
"name": "n3",
"value": "v3"
}
]`);
const map = {
"v1": "v11",
"v2": "v22",
"v3": "v33"
};
// show that each function is working first
console.log('== TEST THE FUNCTIONS ==');
console.log('usingRegex', usingRegex(obj, map));
console.log('usingTree', usingTree(obj, map));
const iterations = 10000; // ten thousand
console.log('== DO 10000 ITERATIONS ==');
console.time('regex implementation');
for (let i = 0; i < iterations; i += 1) {
usingRegex(obj, map);
}
console.timeEnd('regex implementation');
console.time('tree implementation');
for (let i = 0; i < iterations; i += 1) {
usingTree(obj, map);
}
console.timeEnd('tree implementation');
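Because usingTree mutates its input, repeated calls on the same object operate on values that have already been replaced. Where the original response must stay intact, a non-mutating traversal is an option. The sketch below is an illustration only (replaceInTree is a hypothetical name), and it gives up part of the speed advantage because it allocates a new object for every node.

```javascript
// Hypothetical non-mutating variant: returns a new tree and leaves the
// input untouched. Values without an entry in `replacements` are kept as-is.
function replaceInTree(nodes, replacements) {
  return nodes.map((node) => {
    const copy = { ...node }; // shallow copy of this node
    if (Object.prototype.hasOwnProperty.call(replacements, copy.value)) {
      copy.value = replacements[copy.value];
    }
    if (Array.isArray(copy.children)) {
      copy.children = replaceInTree(copy.children, replacements);
    }
    return copy;
  });
}

const original = [
  { name: "n1", value: "v1", children: [{ name: "n2", value: "v2" }] },
  { name: "n3", value: "v3" }
];
const replacements = { v1: "v11", v2: "v22", v3: "v33" };
const updated = replaceInTree(original, replacements);
console.log(JSON.stringify(updated));
console.log(original[0].value); // the input still holds "v1"
```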

Will it be faster as compared to tree traverser?
I don't know. I think it would depend on the size of the input, and the size of the replacement map. You could run some tests at JSPerf.com.
If yes, how should I do that?
It's fairly easy to do with a regex-based string replacement if the values you are replacing don't need any special escaping or whatever. Something like this:
const input = [
{
"name": "n1",
"value": "v1",
"children": [
{
"name": "n2",
"value": "v2"
}
]
},
{
"name": "n3",
"value": "v3"
}
];
const map = {
"v1": "v11",
"v2": "v22",
"v3": "v33"
};
// create a regex that matches any of the map keys, adding ':' and quotes
// to be sure to match whole property values and not property names
const regex = new RegExp(':\\s*"(' + Object.keys(map).join('|') + ')"', 'g');
// NOTE: if you've received this data as JSON then do the replacement
// *before* parsing it, don't parse it then restringify it then reparse it.
const json = JSON.stringify(input);
const result = JSON.parse(
json.replace(regex, function(m, key) { return ': "' + map[key] + '"'; })
);
console.log(result);
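The caveat about special escaping can be handled by escaping each key before the pattern is built. The following is a sketch under assumptions: escapeRegExp and replaceValues are hypothetical helper names, and the keys and replacement values are assumed to contain no characters that JSON itself would escape, such as quotes or backslashes.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace whole string values in a JSON text using a key -> value map.
function replaceValues(json, replacements) {
  const alternatives = Object.keys(replacements).map(escapeRegExp).join('|');
  const pattern = new RegExp(':\\s*"(' + alternatives + ')"', 'g');
  return json.replace(pattern, (match, key) => ':"' + replacements[key] + '"');
}

const replaced = JSON.parse(
  replaceValues(
    '[{"name":"n1","value":"a.b"},{"name":"n2","value":"axb"}]',
    { 'a.b': 'dot' }
  )
);
console.log(replaced);
```

Without the escaping step, the key a.b would be treated as a pattern in which the dot matches any character, so the value axb would be replaced as well.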

The traverser is definitely faster: a string replace has to scan every character of the final string, whereas an iterator can skip items that are not needed.

Is it possible to access a json array element without using index number?

I have the following JSON:
{
"responseObject": {
"name": "ObjectName",
"fields": [
{
"fieldName": "refId",
"value": "2170gga35511"
},
{
"fieldName": "telNum",
"value": "4541885881"
}]}
}
I want to access the "value" of the array element with "fieldName": "telNum" without using index numbers, because I don't know in advance at which position this telNum element will appear.
What I dream of is something like this:
jsonVarName.responseObject.fields['fieldname'='telNum'].value
Is this even possible in JavaScript?

You can do it like this
var k={
"responseObject": {
"name": "ObjectName",
"fields": [
{
"fieldName": "refId",
"value": "2170gga35511"
},
{
"fieldName": "telNum",
"value": "4541885881"
}]
}};
value1=k.responseObject.fields.find(
function(i)
{return (i.fieldName=="telNum")}).value;
console.log(value1);

There is JSONPath that lets you write queries just like XPATH does for XML.
$.store.book[*].author the authors of all books in the store
$..author all authors
$.store.* all things in store, which are some books and a red bicycle.
$.store..price the price of everything in the store.
$..book[2] the third book
$..book[(@.length-1)]
$..book[-1:] the last book in order.
$..book[0,1]
$..book[:2] the first two books
$..book[?(@.isbn)] filter all books with isbn number
$..book[?(@.price<10)] filter all books cheaper than 10
$..* All members of JSON structure.

You will have to loop through and find it.
var json = {
"responseObject": {
"name": "ObjectName",
"fields": [
{
"fieldName": "refId",
"value": "2170gga35511"
},
{
"fieldName": "telNum",
"value": "4541885881"
}]
}
};
function getValueForFieldName(fieldName){
// the fields array lives under responseObject
var fields = json.responseObject.fields;
for(var i=0;i<fields.length;i++){
if(fields[i].fieldName == fieldName){
return fields[i].value;
}
}
return false;
}
console.log(getValueForFieldName("telNum"));

It might be a better option to modify the array into object with fieldName as keys once to avoid using .find over and over again.
fields = Object.assign({}, ...fields.map(field => {
const newField = {};
newField[field.fieldName] = field.value;
return newField;
}));
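On runtimes that support Object.fromEntries (ES2019), the same lookup object can be built more compactly. This is a sketch rather than the original answer's code; the variable names response and fieldsByName are hypothetical.

```javascript
const response = {
  responseObject: {
    name: "ObjectName",
    fields: [
      { fieldName: "refId", value: "2170gga35511" },
      { fieldName: "telNum", value: "4541885881" }
    ]
  }
};

// Build the lookup once: each fieldName becomes a key, each value its value.
const fieldsByName = Object.fromEntries(
  response.responseObject.fields.map((field) => [field.fieldName, field.value])
);

console.log(fieldsByName.telNum);
```

After this one-time conversion, each lookup is a plain property access instead of a scan of the array.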

It's not possible. Native JavaScript has nothing like XPath for XML to query JSON. You have to loop or use Array.prototype.find(), as stated in the comments.
Note that find() is experimental and supported only in Chrome 45+, Safari 7.1+, and Firefox 25+, with no IE support.
Example can be found here

Clean and easy way to just loop through array.
var json = {
"responseObject": {
"name": "ObjectName",
"fields": [
{
"fieldName": "refId",
"value": "2170gga35511"
},
{
"fieldName": "telNum",
"value": "4541885881"
}]
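}
};
var telNum;
$(json.responseObject.fields).each(function (i, field) {
if (field.fieldName === "telNum") {
telNum = field.value; // a return value from the callback would be lost
return false; // returning false breaks out of each
}
});
console.log(telNum);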

mapreduce with sort on inner document mongodb

I have a quick question on map-reduce with mongodb. I have this following document structure
{
"_id": "ffc74819-c844-4d61-8657-b6ab09617271",
"value": {
"mid_tag": {
"0": {
"0": "Prakash Javadekar",
"1": "Shastri Bhawan",
"2": "Prime Minister's Office (PMO)",
"3": "Narendra Modi"
},
"1": {
"0": "explosion",
"1": "GAIL",
"2": "Andhra Pradesh",
"3": "N Chandrababu Naidu"
},
"2": {
"0": "Prime Minister",
"1": "Narendra Modi",
"2": "Bharatiya Janata Party (BJP)",
"3": "Government"
}
},
"total": 3
}
}
when I am doing my map reduce code on this collection of documents I want to specify total as the sort field in this command
db.ana_mid_big.mapReduce(map, reduce,
{
out: "analysis_result",
sort: {"value.total": -1}
}
);
But this does not seem to work. How can I specify a key which is nested for sorting? Please help.
----------------------- EDIT ---------------------------------
As per the comments, I am posting my whole problem here. I started with a collection of a little more than 3.5M documents (this is just an old snapshot of the live one, which has already passed 5.5M), which look like this
{
"_id": ObjectId("53b394d6f9c747e33d19234d"),
"autoUid": "ffc74819-c844-4d61-8657-b6ab09617271",
"createDate": ISODate("2014-07-02T05:12:54.171Z"),
"account_details": {
"tag_cloud": {
"0": "FIFA World Cup 2014",
"1": "Brazil",
"2": "Football",
"3": "Argentina",
"4": "Belgium"
}
}
}
So, there can be many documents with the same autoUid but with different (or partially same or even same) tag_cloud.
I have written the following map-reduce to generate an intermediate collection that looks like the one at the start of the question. So that is a collection in which all the tag_clouds belonging to one person are gathered in a single document. The MR code I used to achieve this looks like the following
var map = function(){
final_val = {
tag_cloud: this.account_details.tag_cloud,
total: 1
};
emit(this.autoUid, final_val)
}
var reduce = function(key, values){
var fv = {
mid_tags: [],
total: 0
}
try{
for (i in values){
fv.mid_tags.push(values[i].tag_cloud);
fv.total = fv.total + 1;
}
}catch(e){
fv.mid_tags.push(values)
fv.total = fv.total + 1;
}
return fv;
}
db.my_orig_collection.mapReduce(map, reduce,
{
out: "analysis_mid",
sort: {createDate: -1}
}
);
Here comes problem number 1: when somebody has more than one record, the result follows the reduce function. But when somebody has only one record, instead of being named "mid_tag" the field retains the name "tag_cloud". I understand that there is some problem with the reduce code, but I cannot find what it is.
Now I want to reach to a final result which looks like
{"_id": "ffc74819-c844-4d61-8657-b6ab09617271",
"value": {
"tags": {
"Prakash Javadekar": 1,
"Shastri Bhawan": 1,
"Prime Minister's Office (PMO)": 1,
"Narendra Modi": 2,
"explosion": 1,
"GAIL": 1,
"Andhra Pradesh": 1,
"N Chandrababu Naidu": 1,
"Prime Minister": 1,
"Bharatiya Janata Party (BJP)": 1,
"Government": 1
}
}
}
Which is finally one document for each person representing the tag density they have used. The MR code I am trying to use (not tested yet) looks like this---
var map = function(){
var val = {};
if ("mid_tags" in this.value){
for (i in this.value.mid_tags){
for (j in this.value.mid_tags[i]){
k = this.value.mid_tags[i][j].trim();
if (!(k in val)){
val[k] = 1;
}else{
val[k] = val[k] + 1;
}
}
}
var final_val = {
tag: val,
total: this.value.total
}
emit(this._id, final_val);
}else if("tag_cloud" in this.value){
for (i in this.value.tag_cloud){
k = this.value.tag_cloud[i].trim();
if (!(k in val)){
val[k] = 1;
}else{
val[k] = val[k] + 1;
}
}
var final_val = {
tag: val,
total: this.value.total
}
emit(this._id, final_val);
}
}
var reduce = function(key, values){
return values;
}
db.analysis_mid.mapReduce(map, reduce,
{
out: "analysis_result"
}
);
This last piece of code is not tested yet. That is all I want to do. Please help

Your PHP background appears to be showing. The data structures as you present them are not written as arrays in typical JSON notation, yet the calls to "push" in your mapReduce code show that, at least in your "interim document", the values are actually arrays. You have notated both the same way, so it seems reasonable to presume the originals are arrays too.
Actual arrays are your best option for storage here, especially considering your desired outcome. So even if they are not arrays now, your original documents should look like this, as they would be represented in the shell:
{
"_id": ObjectId("53b394d6f9c747e33d19234d"),
"autoUid": "ffc74819-c844-4d61-8657-b6ab09617271",
"createDate": ISODate("2014-07-02T05:12:54.171Z"),
"account_details": {
"tag_cloud": [
"FIFA World Cup 2014",
"Brazil",
"Football",
"Argentina",
"Belgium"
]
}
}
With documents like that or if you change them to be like that, then your right tool for doing this is the aggregation framework. That works in native code and does not require JavaScript interpretation, hence it is much faster.
An aggregation statement to get to your final result is like this:
db.collection.aggregate([
// Unwind the array to "de-normalize"
{ "$unwind": "$account_details.tag_cloud" },
// Group by "autoUid" and "tag", summing totals
{ "$group": {
"_id": {
"autoUid": "$autoUid",
"tag": "$account_details.tag_cloud"
},
"total": { "$sum": 1 }
}},
// Sort the results to largest count per user
{ "$sort": { "_id.autoUid": 1, "total": -1 } },
// Group to a single user with an array of "tags" if you must
{ "$group": {
"_id": "$_id.autoUid",
"tags": {
"$push": {
"tag": "$_id.tag",
"total": "$total"
}
}
}}
])
Slightly different output, but much simpler to process and much faster:
{
"_id": "ffc74819-c844-4d61-8657-b6ab09617271",
"tags": [
{ "tag": "Narendra Modi", "total": 2 },
{ "tag": "Prakash Javadekar", "total": 1 },
{ "tag": "Shastri Bhawan", "total": 1 },
{ "tag": "Prime Minister's Office (PMO)", "total": 1 },
{ "tag": "explosion", "total": 1 },
{ "tag": "GAIL", "total": 1 },
{ "tag": "Andhra Pradesh", "total": 1 },
{ "tag": "N Chandrababu Naidu", "total": 1 },
{ "tag": "Prime Minister", "total": 1 },
{ "tag": "Bharatiya Janata Party (BJP)", "total": 1 },
{ "tag": "Government", "total": 1 }
]
}
Also sorted by "tag relevance score" for the user for good measure, but you can look at dropping that or even both of the last stages as is appropriate to your actual case.
Still, by far the best option. Get to learn how to use the aggregation framework. If your "output" will still be "big" ( over 16MB ) then try to look at moving to MongoDB 2.6 or greater. Aggregate statements can produce a "cursor" which can be iterated rather than pull all results at once. Also there is the $out operator which can create a collection just like mapReduce does.
If your data is actually in the "hash" like format of sub-documents how you indicate in your notation of this ( which follows a PHP "dump" convention for arrays ), then you need to use mapReduce as the aggregation framework cannot traverse "hash-keys" the way these are represented. Not the best structure, and you should change it if this is the case.
Still, there are several corrections to your approach, and this does in fact become a single-step operation to the final result. Again though, the final output will contain an "array" of "tags", since it really is not good practice to use your "data" as "key" names:
db.collection.mapReduce(
function() {
var tag_cloud = this.account_details.tag_cloud;
var obj = {};
for ( var k in tag_cloud ) {
obj[tag_cloud[k]] = 1;
}
emit( this.autoUid, obj );
},
function(key,values) {
var reduced = {};
// Combine keys and totals
values.forEach(function(value) {
for ( var k in value ) {
if (!reduced.hasOwnProperty(k))
reduced[k] = 0;
reduced[k] += value[k];
}
});
return reduced;
},
{
"out": { "inline": 1 },
"finalize": function(key,value) {
var output = [];
// Mapped to array for output
for ( var k in value ) {
output.push({
"tag": k,
"total": value[k]
});
}
// Even sorted just the same: largest total first
return output.sort(function(a,b) {
return ( a.total > b.total ) ? -1 : ( a.total < b.total ) ? 1 : 0;
});
}
}
)
Or if it actually is an "array" of "tags" in your original document but your end output will be too big and you cannot move up to a recent release, then the initial array processing is just a little different:
db.collection.mapReduce(
function() {
var tag_cloud = this.account_details.tag_cloud;
var obj = {};
tag_cloud.forEach(function(tag) {
obj[tag] = 1;
});
emit( this.autoUid, obj );
},
function(key,values) {
var reduced = {};
// Combine keys and totals
values.forEach(function(value) {
for ( var k in value ) {
if (!reduced.hasOwnProperty(k))
reduced[k] = 0;
reduced[k] += value[k];
}
});
return reduced;
},
{
"out": { "replace": "newcollection" },
"finalize": function(key,value) {
var output = [];
// Mapped to array for output
for ( var k in value ) {
output.push({
"tag": k,
"total": value[k]
});
}
// Even sorted just the same: largest total first
return output.sort(function(a,b) {
return ( a.total > b.total ) ? -1 : ( a.total < b.total ) ? 1 : 0;
});
}
}
)
Everything essentially follows the same principles to get to the end result:
De-normalize to a "user" and "tag" combination with "user" as the grouping key
Combine the results per user with a total on "tag" values.
In the mapReduce approach here, apart from being cleaner than what you seemed to be trying, the other main point to consider here is that the reducer needs to "output" exactly the same sort of "input" that comes from the mapper. The reason is actually well documented, as the "reducer" can in fact get called several times, basically "reducing again" output that has already been through reduce processing.
This is generally how mapReduce deals with "large inputs", where there are lots of values for a given "key" and the "reducer" only processes so many of them at one time. For example a reducer may actually only take 30 or so documents emitted with the same key, reduce two sets of those 30 down to 2 documents and then finally reduce to a single output for a single key.
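The re-reduce requirement can be checked in plain JavaScript, without MongoDB. In the sketch below, reduceTags is a hypothetical stand-in that mirrors the reducer shown above, and the emitted values are made-up sample data shaped like the mapper's output. Reducing everything at once and reducing in batches followed by a second reduce should give the same result.

```javascript
// Plain JavaScript, no MongoDB needed. `reduceTags` mirrors the reducer above:
// its output has the same shape as each of its inputs (tag -> count).
function reduceTags(key, values) {
  const reduced = {};
  values.forEach((value) => {
    for (const k in value) {
      if (!Object.prototype.hasOwnProperty.call(reduced, k)) reduced[k] = 0;
      reduced[k] += value[k];
    }
  });
  return reduced;
}

// Three emitted values for one key, shaped like the mapper's output.
const emitted = [
  { "Narendra Modi": 1, "GAIL": 1 },
  { "Narendra Modi": 1 },
  { "Government": 1 }
];

// Reducing everything at once...
const allAtOnce = reduceTags("user", emitted);
// ...must match reducing in batches and then reducing the partial results.
const inBatches = reduceTags("user", [
  reduceTags("user", emitted.slice(0, 2)),
  reduceTags("user", emitted.slice(2))
]);

console.log(allAtOnce);
console.log(inBatches);
```

A reducer that counts its inputs instead of summing their totals, or that returns a differently shaped object, fails this check. It also explains why keys with a single emitted value keep the mapper's shape: the reducer is never called for them.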
The end result here is the same as the other output shown above, with the mapReduce difference that everything is under a "value" key as that is just how it works.
So a couple of ways to do it depending on your data. Do try to stick with the aggregation framework where possible as it is much faster and modern versions can consume and output just as much data as you can throw at mapReduce.
