Partitioning set such that cartesian product obeys constraint - javascript

I was reading this question, which describes the following problem statement:
You are given two ints: N and K. Lun the dog is interested in strings that satisfy the following conditions:
The string has exactly N characters, each of which is either 'A' or 'B'.
The string s has exactly K pairs (i, j) (0 <= i < j <= N-1) such that s[i] = 'A' and s[j] = 'B'.
If there exists a string that satisfies the conditions, find and return any such string. Otherwise, return an empty string
It occurs to me that this problem is equivalent to:
Determine whether there are any 2-partitions of 0...N-1 for which the cartesian product contains exactly K tuples (i, j) with i < j
Where the tuple elements represent assignments of the string index to the characters A and B.
This yields the very naive (but correct) implementation:
Determine all 2-partitions of the set 0...N-1
For each such partitioning, produce the cartesian product of the subsets
For each cartesian product, count the number of tuples (i, j) for which i < j
Choose any 2-partition for which this count is K
Here is an implementation in JS:
const test = ([l, r]) =>
cart(l, r).reduce((p, [li, ri]) => p + (li < ri ? 1 : 0), 0) === k
const indices = _.range(0, n)
const results = partitions(indices).filter(test)
You can test out the results in the context of the original problem here. Some example outputs for n = 13, k = 29:
"aababbbbbbbbb", "babaaabbbbbbb", "baabababbbbbb", "abbaababbbbbb", ...
The complexity for just the first step here is the number of ways to partition a set: this is the rather daunting Stirling number of the second kind S(n, k) for k = 2:
S(n, 2) = 2^(n-1) - 1
For e.g. n=13 this works out to 4095, which is not great.
Obviously if we only need a single partitioning that satisfies the requirement (which is what the original question asks for), and compute everything lazily, we will generally not go into the worst case. However in general, the approach here still seems quite wasteful, in that most of the partitions we compute never satisfy the property of having k tuples in the cartesian product for which i < j.
My question is whether there is some further abstraction or isomorphism that can be recognized to make this more efficient. E.g. is it possible to construct a subset of 2-partitions in such a way that the condition on the cartesian product is satisfied by construction?

(This is a method to algorithmically construct all solutions; you're probably looking for a more mathematical approach.)
In this answer to the linked question I give a method for finding the lexicographically smallest solution. This tells you what the smallest number of B's is with which you can construct a solution. If you turn the method on its head and start with a string of all B's and add A's from the left, you can find the highest number of B's with which you can construct a solution.
To construct all solutions for a specific number of B's in this range, you can again use a recursive method, but instead of only adding a B to the end and recursing once with N-1, you'd add B, then BA, then BAA... and recurse with all cases that will yield valid solutions. Consider again the example of N=13 and K=29, for which the minimum number of B's is 3 and the maximum is 10; you can construct all solutions for e.g. 4 B's like this:
N=13 (number of digits)
K=29 (number of pairs)
B= 4 (number of B's)
(13,29,4) =
(12,20,3) + "B"
(11,21,3) + "BA"
(10,22,3) + "BAA"
At this point you know that you've reached the end of the cases that will yield solutions, because (9/2)^2 < 23. So at each level you recurse with:
N = N - length of added string
K = K - number of A's still to be added
B = B - 1
When you reach the recursion level where B is either 1 or N - 1, you can construct the string without further recursions.
Practically, what you're doing is that you start with the B's as much to the right as possible, and then one by one move them to the left while compensating this by moving other B's to the right, until you've reached the position where the B's are as much to the left as possible. See the output of this code snippet:
function ABstring(N, K, B, str) {
  if ((N - B) * B < K) return;
  str = str || "";
  if (B <= 1 || B >= N - 1) {
    for (var i = N - 1; i >= 0; i--)
      str = (B == 1 && i == K || B == N - 1 && N - 1 - i != K || B == N ? "B" : "A") + str;
    document.write(str + "<br>");
  } else {
    var prefix = "B";
    --B;
    while (--N) {
      if (K - (N - B) >= 0 && B <= N)
        ABstring(N, K - (N - B), B, prefix + str);
      prefix += "A";
    }
  }
}
ABstring(13, 29, 4);
If you run this code for all values of B from 3 to 10, you get all 194 solutions for (N,K) = (13,29). Instead of calculating the minimum and maximum number of B's first, you can just run this algorithm for all values of B from 0 to N (and stop as soon as you no longer get solutions).
This is the pattern for (N,K,B) = (16,24,4):

Let P be the function that, for a given AB string s, returns the number of good pairs (i, j) with i < j, s[i] = 'A' and s[j] = 'B'.
First consider strings of length N where the number of B's is fixed, say b, so each string contains (N-b) A's. Call this set of strings S_b. The minimum of P on S_b is 0, attained when all B's are on the left side (call this string O). The maximum of P on S_b is b*(N-b), attained when all B's are on the right side. This gives a simple check for the non-existence of an s in S_b with the required property.
Consider the operation of swapping neighbouring characters BA -> AB. That operation increases P by 1. Using only that operation, starting from string O, it is possible to construct every string with b B's. So if b*(N-b) >= K, then there is an s in S_b with the required property.
The rightmost B in O can move to the end of the string, N-b places. Since it is not possible to swap two B's, the B to the left of the rightmost B can move at most as far as the rightmost B, and so on. The numbers of moves the B's can make (m_i) satisfy 0 <= m_1 <= m_2 <= ... <= m_b <= N-b.
With that, finding all AB strings s of length N with b B's where P(s)=K is equivalent to finding all partitions of the integer K into at most b parts where each part is <= N-b. To find all strings, it is necessary to check every b where b*(N-b) >= K.
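The partition argument above can be sketched in code. This is only an illustration of the idea, not the answerer's own implementation: the names abStrings and countPairs are made up here, and the enumeration simply walks every non-decreasing sequence m_1 <= ... <= m_b that sums to K, placing the i-th B (counting from 0) at index i + m_i.

```javascript
// Count pairs (i, j) with i < j, s[i] === 'A' and s[j] === 'B'.
function countPairs(s) {
  let seenA = 0;
  let pairs = 0;
  for (const ch of s) {
    if (ch === "A") seenA++;
    else pairs += seenA; // every earlier A forms a pair with this B
  }
  return pairs;
}

// All strings of length N with exactly K such pairs (hypothetical helper).
function abStrings(N, K) {
  const results = [];
  for (let b = 0; b <= N; b++) {
    const maxMove = N - b;
    if (b * maxMove < K) continue; // no string with b B's can reach K
    const moves = []; // m_1 <= m_2 <= ... <= m_b
    const place = (i, min, remaining) => {
      if (i === b) {
        if (remaining !== 0) return;
        const chars = new Array(N).fill("A");
        // B number idx starts at index idx in O and moves m places right
        moves.forEach((m, idx) => { chars[idx + m] = "B"; });
        results.push(chars.join(""));
        return;
      }
      const partsLeft = b - i;
      // all later parts are >= m, so m * partsLeft must not exceed remaining
      for (let m = min; m <= maxMove && m * partsLeft <= remaining; m++) {
        moves.push(m);
        place(i + 1, m, remaining - m);
        moves.pop();
      }
    };
    place(0, 0, K);
  }
  return results;
}

console.log(abStrings(3, 2)); // the two strings of length 3 with 2 pairs
```

Every generated string satisfies the condition by construction, so nothing has to be filtered out afterwards.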

Related

Trying to optimize my code to either remove nested loop or make it more efficient

A friend of mine takes a sequence of numbers from 1 to n (where n > 0)
Within that sequence, he chooses two numbers, a and b
He says that the product of a and b should be equal to the sum of all numbers in the sequence, excluding a and b
Given a number n, could you tell me the numbers he excluded from the sequence?
I have found a solution to this Kata from Code Wars, but it times out (after 12 seconds) in the editor when I run it. Any ideas as to how I should further optimize the nested for loop, or remove it?
function removeNb(n) {
var nArray = [];
var sum = 0;
var answersArray = [];
for (let i = 1; i <= n; i++) {
nArray.push(n - (n - i));
sum += i;
}
var length = nArray.length;
for (let i = Math.round(n / 2); i < length; i++) {
for (let y = Math.round(n / 2); y < length; y++) {
if (i != y) {
if (i * y === sum - i - y) {
answersArray.push([i, y]);
break;
}
}
}
}
return answersArray;
}
console.log(removeNb(102));
I think there is no reason for calculating the sum after you fill the array, you can do that while filling it.
function removeNb(n) {
let nArray = [];
let sum = 0;
for(let i = 1; i <= n; i++) {
nArray.push(i);
sum += i;
}
}
And since there could be only two numbers a and b as the inputs for the formula a * b = sum - a - b, there could be only one possible value for each of them. So, there's no need to continue the loop when you find them.
if(i*y === sum - i - y) {
answersArray.push([i,y]);
break;
}
I recommend looking at the problem in another way.
You are trying to find two numbers a and b using this formula a * b = sum - a - b.
Why not reduce the formula like this:
a * b + a = sum - b
a ( b + 1 ) = sum - b
a = (sum - b) / ( b + 1 )
Then you only need one for loop that produces the value of b, check if (sum - b) is divisible by ( b + 1 ) and if the division produces a number that is less than n.
for(let i = 1; i <= n; i++) {
  let eq1 = sum - i;
  let eq2 = i + 1;
  if (eq1 % eq2 === 0) {
    let a = eq1 / eq2;
    if (a < n && a != i) {
      // i plays the role of b in the formula above
      return [[a, i], [i, a]];
    }
  }
}
You can solve this in linear time with two pointers method (page 77 in the book).
In order to gain intuition towards a solution, let's start thinking about this part of your code:
for(let i = Math.round(n/2); i < length; i++) {
for(let y = Math.round(n/2); y < length; y++) {
...
You already figured out this is the part of your code that is slow. You are trying every combination of i and y, but what if you didn't have to try every single combination?
Let's take a small example to illustrate why you don't have to try every combination.
Suppose n == 10 so we have 1 2 3 4 5 6 7 8 9 10 where sum = 55.
Suppose the first combination we tried was 1*10.
Does it make sense to try 1*9 next? Of course not, since we know that 1*10 < 55-10-1 we know we have to increase our product, not decrease it.
So let's try 2*10. Well, 20 < 55-10-2 so we still have to increase.
3*10==30 < 55-3-10==42
4*10==40 < 55-4-10==41
But then 5*10==50 > 55-5-10==40. Now we know we have to decrease our product. We could either decrease 5 or we could decrease 10, but we already know that there is no solution if we decrease 5 (since we tried that in the previous step). So the only choice is to decrease 10.
5*9==45 > 55-5-9==41. Same thing again: we have to decrease 9.
5*8==40 < 55-5-8==42. And now we have to increase again...
You can think about the above example as having 2 pointers which are initialized to the beginning and end of the sequence. At every step we either
move the left pointer towards right
or move the right pointer towards left
In the beginning the difference between pointers is n-1. At every step the difference between pointers decreases by one. We can stop when the pointers cross each other (and say that no solution can be obtained if one was not found so far). So clearly we can not do more than n computations before arriving at a solution. This is what it means to say that the solution is linear with respect to n; no matter how large n grows, we never do more than n computations. Contrast this to your original solution, where we actually end up doing n^2 computations as n grows large.
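The walk-through above can be written down as a short sketch. The function name removeNbTwoPointers is an assumption made for illustration. It relies on the fact that a*b + a + b grows whenever either pointer value grows, which is what makes it safe to discard one end at each step.

```javascript
// Two-pointer search for pairs (a, b) with a * b === sum - a - b.
function removeNbTwoPointers(n) {
  const sum = n * (n + 1) / 2; // sum of 1..n
  const results = [];
  let lo = 1;
  let hi = n;
  while (lo < hi) {
    const value = lo * hi + lo + hi; // must equal sum for a solution
    if (value === sum) {
      results.push([lo, hi], [hi, lo]);
      lo++; // any further solution needs a larger lo and a smaller hi
      hi--;
    } else if (value < sum) {
      lo++; // product too small: increase the small number
    } else {
      hi--; // product too large: decrease the large number
    }
  }
  return results;
}

console.log(removeNbTwoPointers(10)); // [[6, 7], [7, 6]]
```

For n = 10 the pointers follow exactly the sequence of products listed in the example before meeting at 6 and 7.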
Hassan is correct, here is a full solution:
function removeNb (n) {
var a = 1;
var d = 1;
// Calculate the sum of the numbers 1-n without anything removed
var S = 0.5 * n * (2*a + (d *(n-1)));
// For each possible value of b, calculate a if it exists.
var results = [];
for (let numB = a; numB <= n; numB++) {
let eq1 = S - numB;
let eq2 = numB + 1;
if (eq1 % eq2 === 0) {
let numA = eq1 / eq2;
if (numA < n && numA != numB) {
results.push([numA, numB]);
results.push([numB, numA]);
}
}
}
return results;
}
In case it's of interest, CY Aries pointed this out:
ab + a + b = n(n + 1)/2
add 1 to both sides
ab + a + b + 1 = (n^2 + n + 2) / 2
(a + 1)(b + 1) = (n^2 + n + 2) / 2
so we're looking for factors of (n^2 + n + 2) / 2 and have some indication about the least size of the factor. This doesn't necessarily imply a great improvement in complexity for the actual search but still it's kind of cool.
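A minimal sketch of the factor idea, assuming a plain trial-division search is acceptable (the name removeNbByFactors is made up here). It looks for divisors d of (n^2 + n + 2) / 2 and maps each factor pair back to a = d - 1 and b = target / d - 1.

```javascript
// Find pairs via (a + 1)(b + 1) = (n^2 + n + 2) / 2.
function removeNbByFactors(n) {
  const target = (n * n + n + 2) / 2; // always an integer, since n^2 + n is even
  const results = [];
  // d * d < target keeps a < b and skips the a === b case
  for (let d = 2; d * d < target; d++) {
    if (target % d !== 0) continue;
    const a = d - 1;
    const b = target / d - 1;
    if (b <= n) results.push([a, b], [b, a]); // both numbers must lie in 1..n
  }
  return results;
}

console.log(removeNbByFactors(26)); // [[15, 21], [21, 15]]
```

As the answer says, this is still roughly linear in n, because the loop runs up to the square root of a number that grows like n^2 / 2.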
This is part comment, part answer.
In engineering terms, the original function posted is using "brute force" to solve the problem, iterating over every possible combination (or more than needed). The number of iterations is large when n is large: if you tried every possible pair it would be
n * (n-1) = a bazillion
Less is More
So lets look at things that can be optimized, first some minor things, I'm a little confused about the first for loop and nArray:
// OP's code
for(let i = 1; i <= n; i++) {
nArray.push(n - (n - i));
sum += i;
}
??? You don't really use nArray for anything? Length is just n .. am I so sleep deprived I'm missing something? And while you can sum a consecutive sequence of integers 1-n by using a for loop, there is a direct and easy way that avoids a loop:
sum = ( n + 1 ) * n * 0.5 ;
THE LOOPS
// OP's loops, not optimized
for(let i = Math.round(n/2); i < length; i++) {
for(let y = Math.round(n/2); y < length; y++) {
if(i != y) {
if(i*y === sum - i - y) {
Optimization Considerations:
I see you're on the right track in a way, cutting the starting i, y values in half since the factors . But you're iterating both of them in the same direction : UP. And also, the lower numbers look like they can go a little below half of n (perhaps not because the sequence start at 1, I haven't confirmed that, but it seems the case).
Plus we want to avoid division every time we start an instantiation of the loop (i.e set the variable once, and also we're going to change it). And finally, with the IF statements, i and y will never be equal to each other the way we're going to create the loops, so that's a conditional that can vanish.
But the more important thing is the direction of traversing the loops. The smaller factor low is probably going to be close to the lowest loop value (about half of n) and the larger factor hi is probably going to be near the value of n. If we had some solid math theory that said something like "hi will never be less than 0.75n" then we could make a couple of mods to take advantage of that knowledge.
The way the loops are shown below, they break and iterate before the hi and low loops meet.
Moreover, it doesn't matter which loop picks the lower or higher number, so we can use this to shorten the inner loop as number pairs are tested, making the loop smaller each time. We don't want to waste time checking the same pair of numbers more than once! The lower factor's loop will start a little below half of n and go up, and the higher factor's loop will start at n and go down.
// Code Fragment, more optimized:
let nHi = n;
let low = Math.trunc( n * 0.49 );
let sum = ( n + 1 ) * n * 0.5 ;
// While Loop for the outside (incrementing) loop
while( low < nHi ) {
// FOR loop for the inside decrementing loop
for(let hi = nHi; hi > low; hi--) {
// If we're higher than the sum, we exit, decrement.
if( hi * low + hi + low > sum ) {
continue;
}
// If we're equal, then we're DONE and we write to array.
else if( hi * low + hi + low === sum) {
answersArray.push([hi, low]);
low = nHi; // Note this is if we want to end once finding one pair
break; // If you want to find ALL pairs for large numbers then replace these low = nHi; with low++;
}
// And if not, we increment the low counter and restart the hi loop from the top.
else {
low++;
break;
}
} // close for
} // close while
Tutorial:
So we set the few variables. Note that low is set slightly less than half of n, as larger numbers look like they could be a few points less. Also, we don't round, we truncate, which is essentially "always rounding down", and is slightly better for performance (though it doesn't matter in this instance with just the single assignment).
The while loop starts at the lowest value and increments, potentially all the way up to n-1. The hi FOR loop starts at n (copied to nHi), and then decrements until the factors are found OR it intercepts at low + 1.
The conditionals:
First IF: If we're higher than the sum, we exit, decrement, and continue at a lower value for the hi factor.
ELSE IF: If we are EQUAL, then we're done, and break for lunch. We set low = nHi so that when we break out of the FOR loop, we will also exit the WHILE loop.
ELSE: If we get here it's because we're less than the sum, so we need to increment the while loop and reset the hi FOR loop to start again from n (nHi).

algorithm to determine if a number is made of sum of multiply of two other number

Let's say we are given 2k+2+3p=n as the test. How can we find out whether the test holds for a given number n, when k>=0, p>=0, n>=0?
example1 : n=24 should result true since k=5 & p=4 => 2(5)+2+3(4)=24
example2 : n=11 should result true since k=0 & p=3 => 2(0)+2+3(3)=11
example3 : n=15 should result true since k=5 & p=1 => 2(5)+2+3(1)=15
I wonder if there is a mathematical solution to this. I solved it like below:
//let say 2k+2+3p=n
var accepted = false;
var betterNumber = n - 2;
//assume p=0
var kReminder = (betterNumber) % 2 == 0;
//assume k=0
var pReminder = (betterNumber) % 3 == 0;
if (kReminder || pReminder) {
  accepted = true;
} else {
  var biggerChunk = Math.max(2, 3); //max of 2k or 3p, here i try to find the bigger chunk of the
  var smallerChunk = Math.min(2, 3);
  if ((betterNumber % biggerChunk) % smallerChunk == 0) {
    accepted = true;
  } else {
    accepted = false;
  }
}
still there are edge cases that i didn't see. so i wonder if it has a better solution or not.
Update
the test above is just an example. the solution should be efficient enough for big numbers or any combination of number like 1000000k+37383993+37326328393p=747437446239902
By inspection, 2 is the smallest valid even number and 5 is the smallest valid odd number:
2 is valid (k=0, p=0)
5 is valid (k=0, p=1)
All even numbers >= 2 and all odd numbers >= 5 are valid.
Even numbers: k=n/2-1, p=0
odd numbers: k=(n-3)/2-1, p=1
What we're doing here is incrementing k to add 2s to the smallest valid even and odd numbers to get all larger even and odd numbers.
All values of n >= 2 are valid except for 3.
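As a small sketch of the rule above (the function name decompose is an assumption for illustration), the witness values of k and p can be returned directly from the parity of n:

```javascript
// Returns { k, p } with 2*k + 2 + 3*p === n, or null when no such k, p >= 0 exist.
function decompose(n) {
  if (!Number.isInteger(n) || n < 2 || n === 3) return null;
  return n % 2 === 0
    ? { k: n / 2 - 1, p: 0 }        // even: built up from 2
    : { k: (n - 3) / 2 - 1, p: 1 }; // odd: built up from 5
}

console.log(decompose(24)); // { k: 11, p: 0 }
console.log(decompose(11)); // { k: 3, p: 1 }
console.log(decompose(3));  // null
```

The pairs differ from the ones in the question (k=5, p=4 for n=24), which is fine: the question only asks whether some valid pair exists.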
Dave already gave a constructive and efficient answer but I'd like to share some math behind it.
For some time I'll ignore the + 2 part as it is of less significance and concentrate on a generic form of this question: given two positive integers a and b check whether number X can be represented as k*a + m*b where k and m are non-negative integers. The Extended Euclidean algorithm essentially guarantees that:
1. If number X is not divisible by GCD(a,b), it can't be represented as k*a + m*b with integer k and m.
2. If number X is divisible by GCD(a,b) and is greater than or equal to a*b, it can be represented as k*a + m*b with non-negative integer k and m. This follows from the fact that d = GCD(a,b) can be represented in such a form (let's call it d = k0*a + m0*b). If X = Y*d then X = (Y*k0)*a + (Y*m0)*b. If one of those two coefficients is negative you can trade one for the other, adding and subtracting a*b as many times as required, as in X = (Y*k0 + b)*a + (Y*m0 - a)*b. And since X >= a*b you can always get both coefficients to be non-negative in such a way. (Note: this is obviously not the most efficient way to find a suitable pair of those coefficients, but since you only ask whether such coefficients exist it should be sufficient.)
3. So the only gray area is numbers X divisible by GCD(a,b) that lie in the (0, a*b) range. I'm not aware of any general rule about this area but you can check it explicitly.
So you can just do pre-calculations described in #3 and then you can answer this question pretty much immediately with simple comparison + possibly checking against pre-calculated array of booleans for the (0, a*b) range.
If your actual question is about the k*a + m*b + c form where a, b and c are fixed, it is easily converted to the k*a + m*b question by just subtracting c from X.
Update (Big values of a and b)
If your a and b are big so you can't cache the (0, a*b) range beforehand, the only idea I have is to do the check for values in that range on demand by a reasonably efficient algorithm. The code goes like this:
function egcd(a0, b0) {
let a = a0;
let b = b0;
let ca = [1, 0];
let cb = [0, 1];
while ((a !== b) && (b !== 0)) {
let r = a % b;
let q = (a - r) / b;
let cr = [ca[0] - q * cb[0], ca[1] - q * cb[1]];
a = b;
ca = cb;
b = r;
cb = cr;
}
return {
gcd: a,
coef: ca
};
}
function check(a, b, x) {
let eg = egcd(a, b);
let gcd = eg.gcd;
let c0 = eg.coef;
if (x % gcd !== 0)
return false;
if (x >= a * b)
return true;
let c1a = c0[0] * x / gcd;
let c1b = c0[1] * x / gcd;
if (c1a < 0) {
let fixMul = -Math.floor(c1a / (b / gcd));
let c1bFixed = c1b - fixMul * (a / gcd);
return c1bFixed >= 0;
}
else { //c1b < 0
let fixMul = -Math.floor(c1b / (a / gcd));
let c1aFixed = c1a - fixMul * (b / gcd);
return c1aFixed >= 0;
}
}
The idea behind this code is based on the logic described in the step #2 above:
Calculate GCD and Bézout coefficients using the Extended Euclidean algorithm (if a and b are fixed, this can be cached, but even if not this is fairly fast anyway).
Check for conditions #1 (definitely no) and #2 (definitely yes) from the above
For a value in the (0, a*b) range, fix some coefficients by just multiplying the Bézout coefficients by X/gcd.
Find which of the two is negative and find the minimum multiplier to fix it by trading one coefficient for another.
Apply this multiplier to the other (initially positive) coefficient and check if it remains non-negative.
This algorithm works because all the possible solutions for X = k*a + m*b can be obtained from some base solution (k0, m0) as (k0 + n*b/gcd, m0 - n*a/gcd) for some integer n. So to find out if there is a solution with both k >= 0 and m >= 0, all you need is to find the solution with the minimum non-negative k and check m for it.
Complexity of this algorithm is dominated by the Extended Euclidean algorithm which is logarithmic. If it can be cached, everything else is just constant time.
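For coefficients as large as the ones in the question's update, intermediate products such as the Bézout coefficient times X can exceed 2^53, beyond which JavaScript Number arithmetic is no longer exact. The following is a sketch of the same idea using BigInt, under the assumption that a and b are positive and x is non-negative. The names egcdBig and representable are made up for this illustration, and the code picks the smallest non-negative k directly instead of fixing up whichever coefficient is negative.

```javascript
// Extended Euclid on BigInt: returns [g, x, y] with a*x + b*y === g.
function egcdBig(a, b) {
  let [r0, r1] = [a, b];
  let [x0, x1] = [1n, 0n];
  let [y0, y1] = [0n, 1n];
  while (r1 !== 0n) {
    const q = r0 / r1; // BigInt division truncates
    [r0, r1] = [r1, r0 - q * r1];
    [x0, x1] = [x1, x0 - q * x1];
    [y0, y1] = [y1, y0 - q * y1];
  }
  return [r0, x0, y0];
}

// Non-negative remainder, since % keeps the sign of the dividend.
const mod = (v, m) => ((v % m) + m) % m;

// Is x = k*a + m*b for some integers k >= 0 and m >= 0?
function representable(a, b, x) {
  const [g, ka] = egcdBig(a, b);
  if (x % g !== 0n) return false;
  // All solutions have k = k0 + t * (b / g); take the smallest k >= 0.
  const k = mod(ka * (x / g), b / g);
  // m = (x - k*a) / b is then as large as possible; it must not be negative.
  return k * a <= x;
}

// The original test 2k + 2 + 3p = n becomes representable(2n, 3n, BigInt(n) - 2n).
console.log(representable(2n, 3n, 24n - 2n)); // true
console.log(representable(2n, 3n, 3n - 2n));  // false
// Shape of the call for the large example from the update:
console.log(representable(1000000n, 37326328393n, 747437446239902n - 37383993n));
```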
Theorem: it is possible to represent number 2 and any number >= 4 using this formula.
Answer: the easiest test is to check if the number equals 2 or is greater or equals 4.
Proof: n=2k+2+3p where k>=0, p>=0, n>=0 is the same as n=2m+3p where m>0, p>=0 and m=k+1. Using p=0 one can represent any even number, e.g. with m=10 one can represent n=20. The odd number to the left of this even number can be represented using m'=m-2, p=1, e.g. 19=2*8+3. The odd number to the right can be represented with m'=m-1, p=1, e.g. 21=2*9+3. This rule holds for m greater or equal 3, that is starting from n=5. It is easy to see that for p=0 two additional values are also possible, n=2, n=4.

Algorithm to find the power of 2

I have found a small algorithm to determine if a number is power of 2, but not an explanation for how it works, what really happens?
var potence = n => n && !(n & (n - 1));
for(var i = 2; i <= 16; ++i) {
if(potence(i)) console.log(i + " is potence of 2");
}
I'll explain how it works for non-negative n. The first condition in n && !(n & (n - 1)) simply checks that n is not zero. If n is not zero, then it has some least significant 1-bit at some position p. Now, if you subtract 1 from n, all bits before position p (the less significant ones, to its right) will change to 1, and the bit at p will flip to 0.
Something like this:
n:   1010100010100111110010101000000
n-1: 1010100010100111110010100111111
                             ^ position p
Now, if you & these two bit-patterns, all the stuff after the position p (the more significant bits, to its left) remains unchanged, and everything before (and including p) is zeroed out:
after &: 1010100010100111110010100000000
                                 ^ position p
If the result after taking & happens to be zero, then it means that there was nothing after position p, thus the number must have been 2^p, which looked like this:
n:       0000000000000000000000001000000
n - 1:   0000000000000000000000000111111
n&(n-1): 0000000000000000000000000000000
                                 ^ position p
thus n is a power of 2. If the result of & is not zero (as in the first example), then it means that there was some junk in the more significant bits after the p-th position, and therefore n is not a power of 2.
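To watch the bit patterns from this explanation in action, here is a small sketch for non-negative integers. The helper names isPowerOfTwo and bits are assumptions for illustration, and the 8-digit padding is only there to line the output up.

```javascript
// Same test as in the question, written to always return a boolean.
const isPowerOfTwo = n => n > 0 && (n & (n - 1)) === 0;

// Binary representation padded to 8 digits (for small non-negative n).
const bits = n => n.toString(2).padStart(8, "0");

for (const n of [64, 84]) {
  console.log("n:       " + bits(n));
  console.log("n - 1:   " + bits(n - 1));
  console.log("n&(n-1): " + bits(n & (n - 1)));
  console.log(n + (isPowerOfTwo(n) ? " is" : " is not") + " a power of 2");
}
```

For 64 the last pattern is all zeros. For 84 the bits above the lowest 1-bit survive the &, so the result is non-zero.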
I'm too lazy to play this through for the two's-complement representation of negative numbers.
If a number is a power of 2, then its binary representation must be 10...0. Subtracting 1 turns the leading 1 into 0 and the trailing zeros into 1s, so n & (n-1) is 0. Otherwise, it is not a power of 2.
Kinka's answer is essentially correct, but perhaps needs a bit more detail on the "otherwise" case. If the number isn't a power of two, then it must have the form n=(2^a + 2^(b) + y), where a>b and y <2^b. Subtracting 1 from that still leaves a number that is at least 2^a (and less than 2^(a+1)), so bit a is set in both n and n-1; (n & (n-1)) is therefore at least 2^a, and non-zero.

JavaScript - Is there a way to find out whether the given characters are contained in a string without looping?

I have an array of combinations from 5 characters (order within a combination plays no role):
AB, ABDE, CDE, C, BE ...
On its basis I need to validate the input from user. The entered combination of characters should be contained in one of the combinations of the array.
If user enters "ADE" or "CE" the result should be yes, if e.g. "BCE" - no.
In the trivial case, when the entered combination simply matches one in the array, I can use .inArray. If the entered combination consists of neighbors, I can use .indexOf. How should I handle the case above?
One of the solutions would be to extend the initial array by including all possible "child" combinations. Is there an alternative?
The first thing I could think of is grep'ping the array with a regex match.
var haystack = ["BCED","DBCE","CEB","ECBA","CB","BDCA"];
var needle = "CBE";
var re = new RegExp("(?=.*" + needle.split('').join(")(?=.*") + ").{" + needle.length+"}");
console.log(re);
console.log($.grep(haystack, function(str){
return str.match(re,"g");
}));
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
I'll expand on the comment I've made to the question above: If you have a small number of fixed set elements, you can represent the sets as binary masks. So say you have the original sets as strings:
var sset = ["AB", "ABDE", "CDE", "C", "BE"];
Create a dictionary of possible elements and bits. The bits are powers of two, which can be created by bit-shifting: 1 << n is bit n:
var dict = {
  A: (1 << 0),
  B: (1 << 1),
  C: (1 << 2),
  D: (1 << 3),
  E: (1 << 4),
};
That dictionary can then be used to create the bitmask:
function bitmask(s, d) {
let res = 0;
for (let i = 0; i < s.length; i++) {
res |= d[s[i]]
}
return res;
}
Create a companion array to the sets that contains the masks:
var mset = sset.map(function(x) { return bitmask(x, dict); });
If you want to check an input, convert it to a mask first and then run the checks. A set s contains all bits of an input x if (s & x) == x:
var s = "ADE";
var m = bitmask(s, dict);
for (let i = 0; i < mset.length; i++) {
console.log(sset[i], s, (mset[i] & m) == m);
}
You can use this strategy for several conditions:
• (a & b) == b — all elements of b are contained in a;
• (a & b) == 0 — a and b have no common elements;
• (a & b) != 0 — at least one element of b is in a;
• a == b — the sets a and b are identical.
In set parlance a & b is the intersection, a | b is the union and a ^ b is the symmetric difference of a and b.
As far as I know, jQuery is a library written in Javascript, so all bit-wise operators should be available.

Max length of collatz sequence - optimisation

I'm trying to solve this MaxCollatzLength kata but I'm struggling to optimise it to run fast enough for really large numbers.
In this kata we will take a look at the length of collatz sequences.
And how they evolve. Write a function that takes a positive integer n
and returns the number between 1 and n that has the maximum Collatz
sequence length and the maximum length. The output has to take the
form of an array [number, maxLength]. For example the Collatz sequence
of 4 is [4,2,1], 3 is [3,10,5,16,8,4,2,1], 2 is [2,1], 1 is [ 1 ], so
MaxCollatzLength(4) should return [3,8]. If n is not a positive
integer, the function has to return [].
As you can see, numbers in Collatz sequences may exceed n. The last
tests use random big numbers so you may consider some optimisation in
your code:
You may get very unlucky and get only hard numbers: try submitting 2-3
times if it times out; if it still does, probably you need to optimize
your code more;
Optimisation 1: when calculating the length of a
sequence, if n is odd, what 3n+1 will be ?
Optimisation 2: when looping through 1 to n, take i such that i < n/2, what
will be the length of the sequence for 2i ?
A recursive solution quickly blows the stack, so I'm using a while loop. I think I've understood and applied the first optimisation. I also spotted that for n that is a power of 2, the max length will be (log2 of n) + 1 (that only shaves off a very small amount of time for an arbitrarily large number). Finally I have memoised the collatz lengths computed so far to avoid recalculations.
I don't understand what is meant by the second optimisation, however. I've tried to notice a pattern with a few random samples and loops and I've plotted the max collatz lengths for n < 50000. I noticed it seems to roughly follow a curve but I don't know how to proceed - is this a red herring?
I'm ideally looking for hints in the right direction so I can work towards the solution myself.
function collatz(n) {
let result = [];
while (n !== 1) {
result.push(n);
if (n % 2 === 0) n /= 2;
else {
n = n * 3 + 1;
result.push(n);
n = n / 2;
}
}
result.push(1);
return result;
}
function collatzLength(n) {
if (n <= 1) return 1;
if (!collatzLength.precomputed.hasOwnProperty(n)) {
// powers of 2 are logarithm2 + 1 long
if ((n & (n - 1)) === 0) {
collatzLength.precomputed[n] = Math.log2(n) + 1;
} else {
collatzLength.precomputed[n] = collatz(n).length;
}
}
return collatzLength.precomputed[n];
}
collatzLength.precomputed = {};
function MaxCollatzLength(n) {
  // The kata asks for [] whenever n is not a positive integer
  if (!Number.isInteger(n) || n <= 0) return [];
  let maxLen = 0;
  let numeralWithMaxLen = Infinity;
  while (n !== 0) {
    let lengthOfN = collatzLength(n);
    if (lengthOfN > maxLen) {
      maxLen = lengthOfN;
      numeralWithMaxLen = n;
    }
    n--;
  }
  return [numeralWithMaxLen, maxLen];
}
Memoization is the key to good performance here. You memoize the end results of the function that calculates the Collatz sequence. This will help you on repeated calls to maxCollatzLength, but not when you determine the length of the sequence for the first time.
Also, as @j_random_hacker mentioned, there is no need to actually create the sequence as a list; it is enough to store its length. An integer result is light-weight enough to be memoized easily.
You can make use of precalculated results already when you determine the length of a Collatz sequence. Instead of following the sequence all the way down, follow it until you hit a number for which the length is known.
The other optimizations you make are micro-optimizations. I'm not sure that calculating the log for powers of two really buys you anything. It rather burdens you with an extra test.
The memoized implementation below even forgoes the check for 1 by putting 1 in the dictionary of precalculated values initially.
var precomp = {1: 1};
function collatz(n) {
var orig = n;
var len = 0;
while (!(n in precomp)) {
n = (n % 2) ? 3*n + 1 : n / 2;
len++;
}
return (precomp[orig] = len + precomp[n]);
}
function maxCollatz(n) {
var res = [1, 1];
for (var k = 2; k <= n; k++) {
var c = collatz(k);
if (c > res[1]) {
res[0] = k;
res[1] = c;
}
}
return res;
}
I haven't used node.js, but the JavaScript in my Firefox. It gives reasonable performance. I first had collatz as a recursive function, which made the implementation only slightly faster than yours.
The second optimization mentioned in the question means that if you know C(n), you also know that C(2*n) == C(n) + 1. You could use that knowledge to precalculate the values for all even n in a bottom-up approach.
It would be nice if the lengths of the Collatz sequences could be calculated from the bottom up, a bit like the sieve of Eratosthenes. You have to know where you come from instead of where you go to, but it is hard to know when to stop, because for finding the longest sequence for n < N, you will have to calculate many sequences out of bounds with n > N. As is, the memoization is a good way to avoid repetition in an otherwise straightforward iterative approach.
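One possible way to combine the C(2*n) == C(n) + 1 observation with memoization is sketched below. It is an illustration under assumptions, not the kata's reference solution: the function name maxCollatzBottomUp is made up, lengths are kept in a typed array indexed by k (so n has to fit in memory), and intermediate values are assumed to stay within the exact integer range of Number.

```javascript
// Returns [number, maxLength] for 1..n, or [] if n is not a positive integer.
function maxCollatzBottomUp(n) {
  if (!Number.isInteger(n) || n <= 0) return [];
  const len = new Uint32Array(n + 1); // len[k] = Collatz sequence length of k
  len[1] = 1;
  let best = [1, 1];
  for (let k = 2; k <= n; k++) {
    if (k % 2 === 0) {
      len[k] = len[k / 2] + 1; // optimisation 2: C(2i) = C(i) + 1
    } else {
      // Follow the sequence only until it drops below k,
      // where the length is already known.
      let m = k;
      let steps = 0;
      while (m >= k) {
        m = (m % 2) ? 3 * m + 1 : m / 2;
        steps++;
      }
      len[k] = steps + len[m];
    }
    if (len[k] > best[1]) best = [k, len[k]]; // strict > keeps the smallest k
  }
  return best;
}

console.log(maxCollatzBottomUp(4));  // [3, 8]
console.log(maxCollatzBottomUp(10)); // [9, 20]
```

Because every number below k is already filled in when k is processed, no dictionary lookups are needed for out-of-range values: the walk simply continues until it falls back into the known range.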
In this task you are required to write a Python function,
maxLength, that returns two integers:
• First returned value: for each integer k, 1 ≤ k ≤ m, the
length of Collatz sequence for each k is computed and the
largest of these numbers is returned.
• Second returned value is the integer k, 1 ≤ k ≤ m, whose
Collatz sequence has the largest length. In case there are
several such numbers, return the first one (the smallest).
For example, maxLength(10) returns numbers
20 and 9
Which means that among the numbers 1, 2, 3,…, 10, nine has the
longest Collatz sequence, and its length is equal to 20.
In your program you may define other (auxiliary) functions with
arbitrary names, however, the solution function of this task
should be named maxLength(m).
