I apologize for asking so many Observable questions lately, but I'm still having a really tough time grasping how to chain everything together.
I have a user who uses promise-based storage to store the names of feeds they do not want to see. On the Social Feeds widget, they see the latest article from each feed that they have not filtered out.
I'd like to take a union of the hard-coded list of feeds and the feeds they want to hide. To work with the API I've been given, I need to make multiple calls to the service to retrieve each feed individually.
After I make that union, I'm looking to combine, sequentially, the observables that the utility getFeed method produces.
Here's what I'm looking to do, with some pseudocode.
/**
* This gets the top items from all available social media sources.
* @param limit {number} The number of items to get per source.
* @returns {Observable<SocialItem[]>} Returns a stream of SocialItem arrays.
*/
public getTopStories(limit: number = 1): Observable<SocialItem[]> {
// Merge the list of available feeds with the ones the user wants to hide.
const feedsToGet = this.storage.get('hiddenFeeds')
.then(hiddenFeeds => _.union(FeedList, hiddenFeeds));
// Let's use our function that retrieves the feeds and maps them into an Observable<SocialItem[]>.
// We need to slice the list because only 'limit' articles should come back from each feed,
// and the API cannot accommodate sending anything other than 25 items at a time.
// We need to do mergeMap in order to return a single array of SocialItem, instead of a 2D array.
const feeds$ = feedsToGet.map(feed => this.getFeed(feed).map(res => res ? res.slice(0, limit) : []).mergeMap(val => val));
// Let's combine them and return
return Observable.combineLatest(feeds$);
}
Edit: Again, sorry for sparse code before.
The only issue with your example is that you are doing your manipulation in the wrong time frame. combineLatest needs an Observable array, not a Future of an Observable array, a hint that you need to combineLatest in the promise handler. The other half is the last step to coerce your Promise<Observable<SocialItem[]>> to Observable<SocialItem[]>, which is just another mergeMap away. All in all:
public getTopStories(limit: number = 1): Observable<SocialItem[]> {
// Merge the list of available feeds with the ones the user wants to hide.
const feeds_future = this.storage.get('hiddenFeeds')
.then(hiddenFeeds => Observable.combineLatest(_.map(
_.union(FeedList, hiddenFeeds),
feed => this.getFeed(feed).mergeMap(res => res ? res.slice(0, limit) : [])
))); // Promise<Observable<SocialItem[]>>
return Observable.fromPromise(feeds_future) // Observable<Observable<SocialItem[]>>
.mergeMap(v => v); // finally, Observable<SocialItem[]>
}
P.S. the projection function of mergeMap means you can map your values to Observables in the same step as they are merged, rather than mapping them and merging them separately.
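For example (a minimal sketch, assuming a names$ stream of feed names and the getFeed method from your question), these two pipelines are equivalent:
// mapping and merging in two separate steps...
names$.map(name => this.getFeed(name)).mergeMap(feed$ => feed$);
// ...collapses into one step with mergeMap's projection function:
names$.mergeMap(name => this.getFeed(name));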
I had an issue filtering an observable array. In my case it's resources$ (an observable which contains all 'resources' as JSON), and I have another observable called usedResources$. What I want to achieve is simply to get unusedResources$ from those two variables (resources - usedResources = unusedResources). Is there any RxJS way to achieve this?
If you have multiple streams and you want to combine each item from them somehow, that usually means either combineLatest or zip, depending on your desired combination strategy.
combineLatest | documentation
If you want to compute a result from the most recent item of each stream, regardless of how fast or slow they emit relative to each other, you would use combineLatest: either the static Observable.combineLatest, or the prototype-based stream$.combineLatest, which has the same effect but includes the stream you call it on instead of being a static factory. I personally use the static form more often, for clarity.
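To illustrate the two forms (a quick sketch, where a$ and b$ stand in for any two streams; both lines produce the same combined stream):
const fromStatic$ = Rx.Observable.combineLatest(a$, b$, (a, b) => a + b);
const fromPrototype$ = a$.combineLatest(b$, (a, b) => a + b);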
This is probably what you want.
const unusedResources$ = Observable.combineLatest(
resources$,
usedResources$,
(resources, usedResources) => ({
something: resources.something - usedResources.something
})
);
const { Observable } = Rx;
const resources$ = Observable.interval(5000).map(i => ({
something: (i + 1) * 1000
}));
const usedResources$ = Observable.interval(1000).map(i => ({
something: (i + 1) * 10
}));
const unusedResources$ = Observable.combineLatest(
resources$,
usedResources$,
(resources, usedResources) => ({
something: resources.something - usedResources.something
})
);
unusedResources$.subscribe(
unusedResources => console.log(unusedResources)
);
<script src="https://unpkg.com/rxjs@5.4.0/bundles/Rx.min.js"></script>
zip | documentation
If you instead want to combine the items 1:1, i.e. waiting for every stream to emit an item for a given index, you can use zip. However, under the hood it uses an unbounded buffer, so if your streams don't emit at around the same rate you can balloon your memory usage or even run out entirely. For the most part, zip should only be used for streams that have a finite, predictable count and interval. For example, if you make N ajax calls and want to combine their results 1:1.
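Here's a minimal sketch of zip with two finite streams (safe here, since the internal buffers cannot grow without bound):
const letters$ = Rx.Observable.of('a', 'b', 'c');
const numbers$ = Rx.Observable.of(1, 2, 3);
Rx.Observable.zip(
  letters$,
  numbers$,
  (letter, number) => letter + number
).subscribe(pair => console.log(pair)); // logs "a1", "b2", "c3"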
[{"creationDate":"2011-03-13T00:17:25.000Z","fileName":"IMG_0001.JPG"},
{"creationDate":"2009-10-09T21:09:20.000Z","fileName":"IMG_0002.JPG"}]
[{"creationDate":"2012-10-08T21:29:49.800Z","fileName":"IMG_0004.JPG",
{"creationDate":"2010-08-08T18:52:11.900Z","fileName":"IMG_0003.JPG"}]
I use an HTTP GET method to receive data. Unfortunately, while I do receive this data in chunks, it is not sorted by creationDate descending.
I need to sort these objects by creationDate; my expected result would be:
[{"creationDate":"2012-10-08T21:29:49.800Z","fileName":"IMG_0004.JPG"},
{"creationDate":"2011-03-13T00:17:25.000Z","fileName":"IMG_0001.JPG"}]
[{"creationDate":"2010-08-08T18:52:11.900Z","fileName":"IMG_0003.JPG"},
{"creationDate":"2009-10-09T21:09:20.000Z","fileName":"IMG_0002.JPG"}]
Here's what I tried:
dataInChunks.map(data => {
return data.sort((a,b)=> {
return new Date(b.creationDate).getTime() - new Date(a.creationDate).getTime();
});
})
.subscribe(data => {
console.log(data);
})
This works, but only on one chunk at a time, which results in giving me only the very top results. I need some way to join these chunks together, sort them, and then break the whole collection back into chunks of two.
Are there any RxJS operators I can use for this?
If you know the call definitely completes (which it should), then you can just use toArray, which, as the name suggests, collects everything into a single array that you can then sort. The point of toArray is that it won't produce a stream of data, but will wait until the source completes and emit all values at once:
var allData = dataInChunks.toArray().map(arr => arr.sort(/* sorting logic */));
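Wired up with the sort logic from the question, that might look like this (a sketch that assumes dataInChunks emits arrays, so each chunk is flattened into individual items first):
dataInChunks
    .flatMap(chunk => chunk) // spread each chunk into individual items
    .toArray()               // wait for completion, collect everything into one array
    .map(all => all.sort((a, b) => new Date(b.creationDate).getTime() - new Date(a.creationDate).getTime()))
    .subscribe(sorted => console.log(sorted));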
However, if you are required to show the data in the browser as it arrives (if the toArray() approach makes the UI feel unresponsive), then you will have to re-sort the increasing dataset as it arrives:
var allData =[];
dataInChunks
.bufferWithCount(4)
.subscribe(vals => {
allData = allData.concat(vals);
allData.sort(/* sort logic*/);
})
This is slightly hacky as it relies on a variable outside the stream, but you get the idea. It uses bufferWithCount to buffer the incoming items, which lets you limit the number of re-sorts you do.
TBH, I would just go with the toArray approach, which raises the question of why it's an observable in the first place! Good luck.
I am currently parsing a list of JS objects that are upserted to the DB one by one, roughly like this with Node.js:
return promise.map(list, item =>
    parseItem(item)
        .then(upsertSingleItemToDB)
).then(/* all finished! */);
The problem is that when the list grows very big (~3000 items), parsing all the items in parallel is too memory heavy. It was really easy to add a concurrency limit with the promise library and not run out of memory that way (when/guard).
But I'd like to optimize the DB upserts as well, since mongodb offers a bulkWrite function. Since parsing and bulk writing all the items at once is not possible, I would need to split the original object list into smaller sets that are parsed with promises in parallel, and then the result array of each set would be passed to the promisified bulkWrite. And this would be repeated for the remaining sets of list items.
I'm having a hard time wrapping my head around how I can structure the smaller sets of promises so that I only do one set of parseSomeItems-BulkUpsertThem at a time (something like Promise.all([set1Bulk, set2Bulk]), where set1Bulk is another array of parallel parser promises?). Any pseudocode help would be appreciated (I'm using when, if that makes a difference).
It can look something like this, if using mongoose and the underlying nodejs-mongodb-driver:
const saveParsedItems = items => ItemCollection.collection.bulkWrite( // accessing underlying driver
items.map(item => ({
updateOne: {
filter: {id: item.id}, // or any compound key that makes your items unique for upsertion
upsert: true,
update: {$set: item} // should be a key:value formatted object
}
}))
);
const parseAndSaveItems = (items, offset = 0, limit = 3000) => { // the algorithm for retrieving items in batches can be anything you want, basically
  const itemSet = items.slice(offset, offset + limit);
return Promise.all(
itemSet.map(parseItem) // parsing all your items first
)
.then(saveParsedItems)
.then(() => {
const newOffset = offset + limit;
if (items.length > newOffset) {
      return parseAndSaveItems(items, newOffset, limit);
}
return true;
});
};
return parseAndSaveItems(yourItems);
The first answer looks complete. However, here are some other thoughts that came to mind.
As a hack-around, you could call a timeout function in the callback of your write operation before the next write operation begins. This can give your CPU and memory a break between calls. Even if you add one millisecond between calls, that only adds 3 seconds in total for 3000 write objects.
Or you can segment your array of insert objects and send each segment to its own bulk writer, as in the sketch below.
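Here's a rough sketch combining both ideas. It assumes the parseItem and saveParsedItems helpers from the first answer; chunkSize and delayMs are arbitrary knobs you'd tune for your workload:
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

const chunk = (arr, size) => {
    const chunks = [];
    for (let i = 0; i < arr.length; i += size) {
        chunks.push(arr.slice(i, i + size));
    }
    return chunks;
};

// Process one segment at a time: parse in parallel, bulk write, then pause briefly.
const saveInSegments = (items, chunkSize = 500, delayMs = 1) =>
    chunk(items, chunkSize).reduce(
        (previous, segment) =>
            previous
                .then(() => Promise.all(segment.map(parseItem)))
                .then(saveParsedItems) // each segment gets its own bulk writer
                .then(() => delay(delayMs)), // breathing room between writes
        Promise.resolve()
    );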
Preface
Notice: This question is about complexity. I use a complex design pattern here, which you don't need to understand in order to understand the question. I could have simplified it more, but I chose to keep it relatively untouched for the sake of preventing mistakes. The code is written in TypeScript, which is a superset of JavaScript.
The code
Consider the following class:
export class ConcreteFilter implements Filter {
    interpret() {
        // rows is a very large array
        return (rows: ReportRow[], filterColumn: string) => {
            return rows.filter(row => {
                // I've hidden the implementation for simplicity,
                // but it usually returns either an empty array or a very short one.
            }).map(row => <string>row[filterColumn]);
        };
    }
}
It receives an array of report rows, then filters the array by some logic that I've hidden. Finally, it does not return the whole row, but only the one string column named by filterColumn.
Now, take a look at the following function:
function interpretAnd (filters: Filter[]) {
return (rows: ReportRow[], filterColumn: string) => {
var runFilter = filters[0].interpret();
var intersectionResults = runFilter(rows, filterColumn);
for (var i=1; i<filters.length; i++) {
runFilter = filters[i].interpret();
var results = runFilter(rows, filterColumn);
intersectionResults = _.intersection(intersectionResults, results);
}
return intersectionResults;
}
}
It receives an array of filters, and returns a distinct array of all the "filterColumn" values that the filters returned.
In the for loop, I get the results (a string array) from every filter, and then perform an intersection operation.
The problem
The report row array is large, so every runFilter operation is expensive (while the filter array, on the other hand, is pretty short). I want to iterate over the report row array as few times as possible. Additionally, the runFilter operation is very likely to return zero results or very few.
Explanation
Let's say that I have 3 filters and 1 billion report rows. The internal iteration, i.e. the iteration in ConcreteFilter, will happen 3 billion times, even if the first execution of runFilter returned 0 results, so I have 2 billion redundant iterations.
So I could, for example, check whether intersectionResults is empty at the beginning of every iteration, and if so, break the loop. But I'm sure that there are better solutions mathematically.
Also, if the first runFilter execution returned, say, 15 results, I would expect the next execution to receive an array of only 15 report rows; meaning, I want the intersection operation to influence the input of the next call to runFilter.
I can modify the report row array after each iteration, but I don't see how to do it in an efficient way that won't be even more expensive than it is now.
A good solution would be to remove the map operation and then pass the already-filtered array to each operation instead of the entire array, but I'm not allowed to do that because I must not change the result format of the Filter interface.
My question
I'd like to get the best solution you could think of as well as an explanation.
Thanks a lot in advance to everyone who spends their time trying to help me.
Not sure how effective this will be, but here's one possible approach you can take. If you preprocess the rows by the filter column, you'll have a way to retrieve the matched rows. If you typically have more than 2 filters, this approach may be more beneficial; however, it will be more memory intensive. You could branch the approach depending on the number of filters. There may be some TS constructs that are more useful; I'm not very familiar with it. There are some comments in the code below:
var map = {};
// Loop over every row, keeping a map of rows per filter-column value.
allRows.forEach(row => {
    const v = row[filterColumn];
    const items = map[v] = map[v] || [];
    items.push(row);
});
let rows = allRows;
filters.forEach(f => {
    // Run the filter and keep the unique set of matched strings.
    // (unique() is assumed to de-duplicate an array.)
    const matches = unique(f.execute(rows, filterColumn));
    // For each of the matched strings, look up the corresponding rows and concat them for the next filter.
    rows = [].concat(...matches.map(m => map[m] || []));
});
// Loop over the rows that made it all the way through, extract the value, then unique() the collection.
return unique(rows.map(row => row[filterColumn]));
Thinking about it some more, you could use a similar approach but just do it on a per-filter basis:
let rows = allRows;
filters.forEach(f => {
const matches = f.execute(rows, filterColumn);
let map = {};
matches.forEach(m => {
map[m] = true;
});
rows = rows.filter(row => !!map[row[filterColumn]]);
});
return unique(rows.map(row => row[filterColumn])); // same unique() helper as above
I'm trying to get into reactive programming. I use array functions like map, filter, and reduce all the time and love that I can do array manipulation without creating state.
As an exercise, I'm trying to create a filterable list with RxJS without introducing state variables. I would know how to accomplish this with plain JavaScript or AngularJS/ReactJS, but I want to do it with nothing but RxJS:
var list = [
'John',
'Marie',
'Max',
'Eduard',
'Collin'
];
Rx.Observable.fromEvent(document.querySelector('#filter'), 'keyup')
.map(function(e) { return e.target.value; });
// i need to get the search value in here somehow:
Rx.Observable.from(list).filter(function() {});
Now how do I get the search value into my filter function on the observable that I created from my list?
Thanks a lot for your help!
You'll need to wrap the from(list), as the list observable will need to restart every time the filter changes. Since that could happen a lot, you'll probably also want to prevent filtering while the filter text is too short, or while another keystroke arrives within a small time frame.
//This is a cold observable we'll go ahead and make this here
var reactiveList = Rx.Observable.from(list);
//This will actually perform our filtering
function filterList(filterValue) {
return reactiveList.filter(function(e) {
return /*do filtering with filterValue*/;
}).toArray();
}
var source = Rx.Observable.fromEvent(document.querySelector('#filter'), 'keyup')
.map(function(e) { return e.target.value;})
//The next two operators are primarily to stop us from filtering before
//the user is done typing or if the input is too small
.filter(function(value) { return value.length > 2; })
.debounce(750 /*ms*/)
//Cancel inflight operations if a new item comes in.
//Then flatten everything into one sequence
.flatMapLatest(filterList);
//Nothing will happen until you've subscribed
source.subscribe(function() {/*Do something with that list*/});
This is all adapted from one of the standard examples for RxJS here
You can create a new stream that takes the list of people and the keyup stream, merges them, and scans the result to filter the list.
const keyup$ = Rx.Observable.fromEvent(_input, 'keyup')
.map(ev => ev.target.value)
.debounce(500);
const people$ = Rx.Observable.of(people)
.merge(keyup$)
.scan((list, value) => people.filter(item => item.includes(value)));
This way you will have:
-L------------------ people list
------k-----k--k---- keyups stream
-L----k-----k--k---- merged stream
Then you can scan it. As the docs say:
Rx.Observable.prototype.scan(accumulator, [seed])
Applies an accumulator function over an observable sequence and returns each
intermediate result.
That means you will be able to filter the list, storing the new list in the accumulator.
Once you subscribe, the data will be the new list.
people$.subscribe(data => console.log(data) ); //this will print your filtered list on console
Hope it helps/was clear enough
You can look at how I did it here:
https://github.com/erykpiast/autocompleted-select/
It's an end-to-end solution that grabs user interactions and renders the filtered list to the DOM.
You could take a look at WebRx's List-Projections as well.
Live-Demo
Disclosure: I am the author of the Framework.