The Search for the Best Base Part 1 – Understanding the Problem

A “Basic” Question

The way we write numbers, the positional system, is a great system. I’m sure you know, but each digit represents a different multiple of a different power of ten – “634” means 6 × ten^2 + 3 × ten^1 + 4 × ten^0. This system allows to write any number you want in a reasonably small (logarithmic) number of digits.

But this raises a significant question: why are we using powers of ten, instead of any other number? For example, if we had used powers of nine instead of ten, the same number above would be written “774” – even though the set of digits, and the specific digit pattern we use to write numbers, is different, the system works the same regardless of what number we base the system on (called the “base” or “radix”).

The reason we use ten is probably because most people have ten fingers; therefore in our system most people can count from “0” to “10” using their fingers. This isn’t that great of a reason – as you will see, the base of our number system affects much more than that.

The Conventions of this Series

Before we look into number bases, I need to explain a few details about how specifically I will be writing numbers in this series, so that no one is confused.

Writing Numbers

Base N normally uses digits from 0 to N-1. For digits above 9, I will be using uppercase letters: A=ten, B=eleven, C=twelve, and so on. This strategy only works for bases up to thirty-six, but that will be enough for what we’re doing. For example, base thirteen will use the digits 0-9, A, B and C. If a number gets long, I will use non-breaking spaces (U+00A0, ’ ‘) to separate the digits (though ordinary spaces are also fine).

Positional notation also has a way to write fractions using negative powers of the base. This is normally called “decimals,” but because that’s just the name of our system I will call this “digitals” instead. I will use the period for the digital point.

However, some fractions have an infinitely repeating pattern when in digital notation. For example, in base ten:

To represent this, I will put a second digital point before the start of the repeating sequence, and only write it once. Any digits between the two digital points do not repeat. There’s no need to have a symbol to represent the end, because there can never be any digits after infinite repetition. With this convention, the fractions above would look like:

Nice and compact! With this notation, every rational number can be written with a finite number of symbols.

Specifying the Base

To specify the base that a given number is in, I will write the base’s as a single digit, then a number sign (’#’), then the number. For example,

If the base of a number is clear from context, or if the number is a single digit (which has the same value in every base), I may omit this indicator.

If one of these prefixes is put in square brackets (e.g. ‘[A#]’ for base ten), it can be used without a number to apply to an entire row, column, heading section, etc.

Names of Bases

As you will see later, the factorization of a base is very important to how it works. Therefore, to name bases in a neutral and useful way, I will be naming them using a multiplication-based system inspired by this system by jan Misali:

Value [A#] Name Prefix
2 Binary bi-
3 Ternary tri-
4 Quaternary quadr(a)-
5 Quinary pent(a)-
6 Sezimal sez(i)-
7 Septenary hept(a)-
8 Octal oct(a)-
9 Nonary enn(ea)-
10 Decimal dec(a)-
12 Dozenal doz(a)-
14 Fortal fort(a)-
15 Twinnal twin(a)-
16 Tessal tess(a)-
+1 un-
-1 mal-

To avoid confusion with existing base names, “-decimal” changes into “-gesimal” when a multiplying prefix is added before it.

The system works as follows:

As for how I got these names:

So, Why does the Base Matter?

Before we start comparing bases, we need to figure out what changing the base even does! As it turns out, our choice of base affects a lot.

Number Length

If you first look at a number in many different bases, one of the first things you will notice is that the smaller the base, the more digits are needed to write the same number. For example, here is one million in several different bases:

However, smaller bases need less distinct digits, so it’s possible to write them using smaller digits, without the need for complex, hard-to-distinguish digits (unlike the higher bases, which already use confusable digits like 0O1I2Z5S). This video about binary offers a great system to deal with its length. In its suggested system, 0 is represented with a short vertical line, which I will type as ‘.’, and 1 is represented with a long vertical line, which I will type as ‘l’. Because I’m using the period for 0, I will have to use the comma for the digital point. In this system, a million is ‘llll .l.. ..l. .l.. ….’, compared to ‘1 000 000’ in decimal. You can see how this system eliminates the visually large number problem.

So, are smaller or larger bases better? In order to figure that out, we’d need to figure out how big the digits are in their ideal font. Imagine you already had a digit system, and you wanted to extend its range. The simplest way to do this would be to add another symbol (say a . or a l) beside it, with the l representing a new range of digits. This new system would have twice as many digits as the original, and would add a constant amount of space. If adding a constant amount of space multiplies the base range by a constant amount, the ideal size of the digits must be proportional to the logarithm of the base (the base of the logarithm doesn’t matter).

The number of digits in the number n in base b is ⌊log(n, b) + 1⌋, where log(n, b) is the base-b logarithm of n. Therefore, the visual size of number n in base b is proportional to log(b) ⌊log(n, b) + 1⌋. For very small numbers, this gives an advantage to small bases on average. To see why, think of the binary again: each pair of binary digits is a digit in base 2^2=4, each trio of binary digits is a digit in base 2^3=8, etc. These notations should be the same, however in the number lllll, quaternary, octal and tessal will each need to add unnecessary extra zero digits at the start to make the number a whole number of pairs/trios/quartets/etc., since you can’t have a fraction of a digit! In this notation, binary is always at least as compact as any higher power of 2.

There’s another advantage you get for small numbers in small bases: the first digit of a nonzero number is never zero. In binary, the first digit of a nonzero number is always one. Some systems, like the floating-point notation computers use to store fractional numbers, take advantage of this by omitting the first digit and creating a special digit combination for zero. This is not a good idea for normal notation, but in certain cases this gives another advantages for small numbers in small bases.

For large numbers, ⌊log(n, b) + 1⌋ ≈ log(n, b), so the visual size of n in base b is approximately log(b) log(n, b) = log(b) [log(n) / log(b)] = log(n); in other words, it doesn’t depend on the base.

In essence, the relation between base and number length is:

Counting

In every base, the algorithm for counting is the same simple algorithm:

This means that in smaller bases, you need less state per digit, which makes it possible to count with simpler devices. That would make binary counting the simplest type of counting.

However, if you combine multiple of these “digit devices,” you will get a counter that counts up to the product of their digit. This still works if the digits are different, so for example, you can create a ten-counter by combining a two-counter and a five-counter. The only counters that can’t be split into smaller parts are prime numbers. Therefore, the simplest bases to count in aren’t necessarily the smallest bases, they’re the bases with the fewest and smallest prime factors. Binary is still the simplest, but bases like dozenal are simpler than bases like septenary.

Addition & Subtraction

When adding or subtracting numbers, we normally memorize the sum or difference of each pair of digits (the addition/subtraction table), then add or subtract multi-digit numbers digit by digit. This same method works in every base, except that the addition/subtraction table is different in every base.

There are two effects the base has on this process:

So, it seems like we’re in a tradeoff: small bases require less rote memorization but increase the number of digits, while large bases do the opposite. However, remember what I said earlier: every base contains its powers within it. In really small bases like binary, you can add as if you’re in a larger power-of-2 base by simply adding multiple digits at once! It gets better – since each of these multi-digit addition tables is just the sums of all pairs of digits up to a number, each of these powers’ addition tables contains the previous one – no need to relearn sums you’ve already learned.

For addition and subtraction, smaller bases are better than larger bases. They allow you to pick where you are on the tradeoff between memorization and number of digits, and they allow you to learn addition and subtraction gradually.

Simple Operations Affected by Base-Digit Relationships

For the operations so far, smaller bases are better. However, some other operations are more complex, since they are different for different types of numbers. In order to talk about these more complex operations, we will need to give some names to the types of number in a certain base. For the number n in the base b:

Multiplication

Like addition, multiplication requires you to memorize a multiplication table, which is different in each base. Multi-digit numbers can be multiplied using the same method in every base – the only difference is the products of individual digits.

Like addition, the size of the base b multiplication table is b², but unlike addition, not all rows are equal – each row of the table will have a different pattern depending on the type of digit:

In general, smaller bases will have more of these simpler digits. The smallest base, binary, has only the trivial 0 and 1, while ternary, quaternary and sezimal have only one nontrivial totative. However, that doesn’t mean that smaller bases are always better than larger ones – some bases like sezimal and dozenal have more factors than the nearby bases, and in prime bases every digit besides 0 and 1 is a nontrivial totative.

Also, the last step of the multiplication algorithm is addition, and the number of digits you will need to add is equal to the product of the number of digits in the two multiplicands, so the tradeoff between memorization and effort is still there, but is more tilted in the direction of larger bases.

However, as in addition, smaller bases have the advantage that you can multiply as if you’re in a larger base by working with multiple digits at a time, and it has the same advantage of graduality. However, doing this removes the advantage of more simpler digits.

Division

Long division is better in smaller bases than in larger bases, even for long-term effort, because at each step, you have to determine which multiple of the divisor to subtract, which requires (b - 1) comparisons in base b. This makes the effort approximately proportional to the base itself, which outpaces the logarithmic effect of longer numbers.

However, there are two special cases in which long division is not necessary:

So in order for division (the most difficult of the four basic operations) to be as simple as possible, we once again want the base to be as small as possible, but we also want it to have as many factors and regulars as possible. The number of regulars can be maximized by having as many distinct prime factors as possible, but adding more non-distinct prime factors will reduce the exponents of b^e for some of these regulars. Therefore, we have a dilemma – we want small bases, but we also want to have many prime factors, which are only possible in large bases.

Here is a table of the smallest number (n) with a certain number of factors (f), in decimal. However, the smallest number with 5 factors (A#16) is larger than the smallest number with 6 factors (A#12), so I’ve excluded factor counts like 5 that are beaten by a larger count:

f n
2 2
3 4
4 6
6 12
8 24
9 36
10 48
12 60
16 120

The same relationships also determine what the digital notation of fractions with denominator d will look like:

Divisibility and Remainders

Another important operation is determining the remainder a number n of division by a number d (denoted “n % d”). Note that this also includes divisibility, since n % d = 0 if and only if n is divisible by d.

The complexity of this operation depends on the relationship between d and the base b:

Adding unique prime factors creates more regulars, which is good for divisibility testing (though doing so increases the size, which increases the amount of memorization for all existing tests). Adding non-unique prime factors simplifies tests for factors divisible by powers of that prime, but has diminishing returns. For example, here’s a table of the number of digits you need to test for each power of 2 (top) in bases with a certain power of 2 as their factor (left):

2 4 8 16 32 64
1
2 1 2 3 4 5 6
4 1 1 2 2 3 3
8 1 1 1 2 2 2
16 1 1 1 1 2 2
32 1 1 1 1 1 2
64 1 1 1 1 1 1

For example, in order to test for divisibility by 32 in a base whose largest power-of-2 factor is 4, you will need to look at the last 3 digits. ‘∞’ means you need to check every digit of the number.

Going from odd to even is a massive improvement that is practically necessary for a base to good. Going from even to divisible by 4 is still a big improvement, approximately halving the number of exponents needed. Going from divisible by 4 to 8 is a much smaller improvement, and 8 to 16 or higher does almost nothing.

However, don’t forget that every factor you add increases the base’s size, which increases the number of combinations you’ll have to memorize for all the other tests! If p^n is a regular, and the base has m factors of p (i.e. it is divisible by p^m but not p^(m+1)), then the number of combinations you will have to memorize in order to test p^n is b^⌈n/m⌉ / p^n. On average, each new power of p multiplies the number of combinations by b^(1/m) / p (if p is prime). Here are some examples of these combination multipliers for certain primes and bases [A#]:

Base 2 3 5
2 1.00 N/A N/A
3 N/A 1.00 N/A
5 N/A N/A 1.00
6 3.00 2.00 N/A
10 5.00 N/A 2.00
12 1.73 4.00 N/A
18 9.00 1.41 N/A
20 2.24 N/A 4.00
24 1.44 8.00 N/A
30 15.00 10.00 6.00
60 3.87 20.00 12.00
120 2.47 40.00 24.00
360 3.56 6.32 72.00

As the table shows, what matters most in reducing the number of combinations you have to memorize is the ratio of prime factors. Adding a prime factor simplifies the tests for that prime, but complicates the rest, and all bases are perfect at testing themselves and their powers. This multiplier is useful enough that I’ll give it a name – the “regular complexity”. Not only does it determine roughly how many combinations you need to memorize in order to do divisibility/remainder testing, it also roughly determines the size of the number you’ll need to multiply by in order to divide by powers of this number (when you divide by multiplying and shifting the decimal point left).

Rules for Nontrivial Totatives

Here are all the rules I know for nontrivial totatives:

Remember that each base contains its powers, via multiple digits, so you can use any of the above rules for b^e ± a by using sets of e digits at a time. For example, in decimal 13 is a factor of 10^3 + 1, so you can use an alternating sum of sets of 3 digits (e.g. 123456788 → 123 - 456 + 788). This isn’t that useful in decimal, but other bases, particularly smaller ones, can make use of this.

Another rule for factors of b^e + 1 is the Reverse Split, Promote, Discard test. In order to do this test, you must memorize all multiples of d up to b^e. Then, follow the following procedure:

Here is an example (dozenal, n = C#23AA93854, d = 5, e = 2):

Dozenal multiples of 5 to 100:
00 05 0A 13 18 21 26 2B 34 39 47
50 55 5A 63 68 71 76 7B 84 89 97
A0 A5 AA B3 B8

              23AA93854
Split       : 23 AA93854
Promote (-2): 21 AA93854
Mirror  (-2): 21 A8 93854
Discard     : A893854
Split       : A8 93854
Promote (+2): AA 93854
Mirror  (+2): AA 95 854
Discard     : 95854
Split       : 95 854
Promote (+2): 97 854
Mirror  (+2): 97 87 4
Discard     : 874
At this point, in order for the Mirror step to work properly,
we need an even number of digits.
Split       : 08 74
Promote (+2): 0A 74
Mirror  (+2): 0A 76
Discard     : 76
76 is a multiple of 5, so so was the entire number.

This is a slightly modified version of a test invented by members of the Dozenal Society of America (modification by me). Unlike the original, it can be used to find remainders. Information about the original can be found at the following link: Dozenal Divisibility Rules

This test can also be used for factors of b^e - 1, except that in the third step you must do the opposite of whatever you did, instead of the same thing (i.e. if you added 3 to the split digits, subtract 3 from the next set).

Modular Power Sequences

There is also one rule which works for absolutely any divisor, the Modular Power Sequence rule. To use it, you must find the modular power sequence of d in base b. Start with 1, and get each new number by multiplying the previous by b, and taking the remainder mod d (if d is small, this is doable mentally). For nontrivial totatives and semitotatives, this sequence will eventually repeat (it actually follows the same repetition pattern as the digital of 1/d). Some examples of MPSes:

Appendix: Why the Remainder Rules Work

Conclusion

The two simplest rules to work with are the rules for b-1 and b+1, and these numbers will also have the simplest reciprocals, which makes it seem like it is important to find bases with small, important numbers as factors of their neighbours. The benefit of this should not be overstated, because the smaller (more useful) a (prime) factor is, the less benefit it will get from this. For example, take a look at the situations of the first 6 primes:

So, the more useful a number is, the less benefit you will get from having it as a neighbour. Neighbours also don’t change the fundamental difference between a regular (remainder requires last few digits and is instant if combinations memorized) and a totative (remainder requires the entire number, and needs calculation). Since there are only two neighbours, which neighbours a base has doesn’t matter that much, especially if those neighbours are prime.

What does matter is how many factors, and specifically prime factors, a base has, since unique prime factors add more regulars to a base, and non-unique prime factors simplify some of them. A base’s size also matters, since smaller bases will have less memorization for equivalent rules that require it.

Variants of Bases

Before we finish, we should look at variations on the positional notation system. There are many ways to vary the formula, but I only know of two of them that have a chance of improving the already very great formula.

Mixed-Base Notation

The first modification is one that has been used by many human civilizations to enable the use of bases twenty and sixty.

Instead of having each digit’s value be a fixed number (the base), we allow ourselves to alternate multipliers. For example, in the mixed base 6-by-A, the number “1 23 45” means “1×(6×A×6×A) + 2×(A×6×A) + 3×(6×A) + 4×A + 5”. You can also treat this as if it were base 6×A=A#60, but each digit were made of two characters, forming a number in base ten, or whatever the rightmost digit has a base of (“1 23 45” = 1×A#60² + A#23×A#60² + A#45).

I won’t go over everything – this article is long enough already – but in summary, the advantages and disadvantages of this system are:

There are two things you can do which makes mixed bases better to use:

Bases A#60 and A#120, by coincidence, both have mixed versions that do well on both of these criteria (6-by-A#10 and A#10-by-A#12 respectively). Most other mixed bases do not do as well as them on this.

In my notation, I will indicate mixed bases by using multiple digits, in the order that the sub-bases are in. For example, 6A# represents base 6-by-ten (100-1 = 59), while CA# represents base twelve-by-ten (100-1 = B9).

Balanced Notation

Another modification we can make to our system is to change the set of digits available. Normally, digits in base b go from 0 to b-1, but there are other ways to do it that still work. For example, you can create an injective base by going from 1 to b. In this system, every combination of digits leads to a distinct number.

However, the most interesting and useful way to change this is to allow negative digits, giving the system digits from -⌊b/2⌋ to ⌊b/2⌋.

In my notation, I will indicate negative digits using lowercase letters, where z=-1, y=-2, x=-3, and so on. I’ve put them in this order so that alphabetical order is numerical order – and, for example, it’s easy to see that 1z is greater than 1y. Because balanced bases don’t change the value of any digit string, only the set of allowed digit strings, they don’t need a separate indicator from their unbalanced counterpart, so “A#” can mean both regular and balanced decimal without any confusion.

As an example of this notation, here’s what counting in balanced nonary looks like: 0, 1, 2, 3, 4, 1w, 1x, 1y, 1z, 10, 11, 12, 13, 14, 2w, 2x, 2y, 2z, 20, …

A summary of the advantages and disadvantages of this notation:

In addition, even balanced bases are a bit more complicated than odd bases. You will have to use b+1 digits, from -b/2 to +b/2. This means that, for some numbers, there will be multiple ways to write the same number. For example, both A#5 and A#1t represent the same number, and both are valid numbers in balanced decimal. Only numbers with the “halfway digits” -b/2 and +b/2 will have this problem.

Most of these can be removed by the following rule, which is necessary in order for rounding by truncating to work properly: If a halfway digit has a nonzero digit to its right, it should have the opposite sign of the leftmost such digit. For example, A#5t should be preferred over A#1tt. For halfway digits without such digits, either form can be used, and which form is used depends on how you round numbers at the halfway point like A#0.5.

Summary

In short, there are two properties that determine whether a base are good or bad are:

Here is a recap of how each operation is affected by this dilemma:

Operation How to optimize
Counting Use small prime factors
Addition/Subtraction Reduce base size
Multiplication Reduce base size, maximize factors
Division Reduce base size, maximize regulars
Remainder Reduce base size, maximize regulars

Part 2 – Base Evaluation Criteria
Send me feedback about this post