I got a lot of great feedback after posting about excellent
numbers on Hacker
News. An anonymous user,
going by the nickname *Someone*, noticed a way to re-write the equation
for excellent numbers.

Following this method to its conclusion yields explicit solutions for
d-digit excellent numbers in terms of the factors of 10^{d} - 1.

Thus, if you want to find all 42 digit excellent numbers, you need to
first factorize 999,999,999,999,999,999,999,999,999,999,999,999,999,999.
Then, you can go through those factors looking for excellent numbers
instead of the 10^{d/2}-digit numbers.

In this post, I am going to go over Somone's insight.

Please consult my previous post for notation and definition.

A d-digit excellent number n ≡ a10^{(d/2)} + b satisfies:

b^{2} - a^{2} = a10^{(d/2)} + b

Set K ≡ 10^{(d/2)} and rearrange:

b^{2} - b = a^{2} + aK

*Someone*'s insight is to re-write the left hand side as:

(b - 0.5)^{2} - 0.5^{2}

and the right hand side as:

(a + K/2)^{2} - (K/2)^{2},

and multiply both sides by 4 to get:

(2b - 1)^{2} - 1 = (2a + K)^{2} - K^{2}.

Moving terms involving a and b to the left hand side, we get:

(2a + K)^{2} - (2b - 1)^{2} = K^{2} - 1.

Now, this is an equation of the form x^{2} - y^{2} = c. That means we can rewrite it as:

(2a + K - (2b - 1))(2a + K + (2b - 1)) = K^{2} - 1.

Rearranging variables one more time, we get:

(2(a - b) + K + 1)(2(a + b) + K - 1) = K^{2} - 1.

Note that K^{2} is just 10^{d}. Consider any two integer factors of 10^{d}-1, denoted p and q. Then, we have:

(2(a - b) + K + 1)(2(a + b) + K - 1) = pq.

This can hold in two ways. Suppose:

2(a - b) + K + 1 = p

2(a + b) + K - 1 = q

hold. Leaving the unknowns on the left hand side yields:

a - b = (p - (K + 1))/2

a + b = (q - (K - 1))/2.

We can solve this system using simple substitution:

a + a - (p - (K + 1))/2 = (q - (K - 1))/2

2a = (q - (K - 1))/2 + (p - (K + 1))/2

a = (p + q - 2K) / 4

Substituting back to get b, we have:

b = (p + q - 2K)/4 - (p - (K + 1))/2

b = (p + q - 2K - 2p + 2K + 2)/4

b = (q - p + 2)/4

Since both a and b must both be positive, we need to only consider
factors (p, q) of 10^{d} - 1 such that q > p - 2.

Also, both a and b must lie in the interval (K/10, K), i.e.
(10^{(d/2)-1}, 10^{d/2}).

Those conditions impose additional conditions on which factors of 10^{d}- 1 can give rise to excellent numbers: (p + q) > 0.6K, (p + q) < 3K, (q - p) > 0.4K - 2, and (q - p) < 4K - 2.

Looking at this linear constraint system, I feel like there is a neat algebraic structure behind the arithmetic, but it is not apparent to me right now. But then, integer programming is NP-hard, so it may not help much.

The important thing is that this solution transforms the search for excellent numbers into a constrained integer factorization problem.

A 38-digit number requires a little less than 127 bits. Previously, we were finding excellent numbers by searching among 19 digit numbers which meant we could store each part of the number in a 64-bit unsigned integer, and use a few 128 bit arithmetic operations.

In this case, because the integer factors of 10^{38} - 1 are involved,
we'd need 128-bit unsigned integers to hold those if we needed to
consider all of them, but the (p + q) < 3K constraint implied by the
bounds on `a` means we don't need that if we are careful about how we
iterate over the prime factors.

Luckily, there is a project dedicated to factorizing numbers of the
form (10^{d} - 1)/9. Apparently, these
numbers are called
repunits. I guess we need
factors of repnine numbers, i.e. a repunit multiplied by nine.

For example, 111,111 has the prime factors 3, 7, 11, 13, and 37.
Therefore, 999,999 has the prime factors 3^{3}, 7, 11, 13, and 37. Of
course, we need all integer factors of 999,999. That means going through
all the different ways you can multiply these factors, but that is still
a much smaller problem space than what one deals with when searching
through 10^{d/2}. So, *provided you know the factors of 10 ^{d} - 1*, you
can find all d-digit excellent numbers much more quickly.

It feels like cheating, because, let's be honest, it is. Now, in a business context, where money is on the line, my first recommendation would be to go ahead and use the factors someone else found. In this case, it takes some of the fun out of the challenge.

I am hoping that the connection between d-digit excellent numbers and
the factors on 10^{d} - 1 will yield some more insight into the structure
of excellent numbers. Otherwise, it is good to know that the problem is
equivalent to integer factorization.

This was a useful exercise in that it really drove home to me the fact
that one really may not need GMP for integers that can fit in 256 or
fewer bits. With a bit of up-front analysis, one can save a lot of time,
especially using gcc's `__int128`

. I learned that Visual C++ has only
clumsy support for 128-bit integers, and gcc seems to do a better job
than clang at optimization.

Not all problems involving large numbers are this straightforward, so just these observations alone were worth it.

I really appreciate *Someone*'s insight. This shows the value of sharing
your experience with a problem with others.

At one point, brian d foy did notice patterns involving the prime factors of the number of digits, but the explicit solution above did not occur to either of us.

While all that is involved here is high school math, manipulating the equation this way never occurred to me. I really thought I could gain something from trying to express the problem in terms of Vandermonde matrices … Goes to show the value of sticking longer with the simpler approaches.

It was a fun experience. As I mentioned before, I am not one to be normally interested in quests to find another prime, another digit of π, another factorization, but, for some reason, this one really was different.

It is proof that working on puzzles can be fun.

You can comment on this post on r/programming.

PS: Matthew Arcus takes this idea to the limit. It looks like he first mentioned this idea back in March 2015.