Okay, so this is a result of my efforts to complete “Partying with the Primes: Part II” (see this blog for an explanation, or just scroll down a few days). Because I knew that getting R to output some sort of number spiral would be quite an arduous task, I decided to start with a few more elementary visualizations of the primes. My first attempt led to today’s science blog.
Question: is there any sort of pattern to the spacing of prime numbers? That is, is there any sort of predictive sequence that demonstrates that the primes are “evenly spaced” (or not) amongst the other numbers?
I’d done a little bit of research on this topic prior to today, due to my 2009 NaNo (haha, we keep coming back to that, don’t we?), but it had been a while, so I did a little bit more reading and came up with a few good sources to check out: here, here, and here.
Specifically, Zagier’s comment stood out to me: “there are two facts about the distribution of prime numbers of which I hope to convince you so overwhelmingly that they will be permanently engraved in your hearts. The first is that, despite their simple definition and role as the building blocks of the natural numbers, the prime numbers grow like weeds among the natural numbers, seeming to obey no other law than that of chance, and nobody can predict where the next one will sprout. The second fact is even more astonishing, for it states just the opposite: that the prime numbers exhibit stunning regularity, that there are laws governing their behavior, and that they obey these laws with almost military precision.”
So what’s a good way to visualize this stuff? My first attempt involved coding all prime numbers as “1” and all non-prime numbers as “0,” then plotting the results with 0 and 1 on the y-axis and the actual numbers on the x-axis (1 through whatever the highest number I chose was…I think it was 1,000). That was a horrible mess of jagged lines and insanity, so I scrapped it and tried to think of a better way of looking at it.
In the end, I decided the best way to examine the instances of prime numbers amongst the non-primes was to plot the numbers by the numbers themselves. That is, for a given sequence of numbers (say, 1 through 10, just to make the explanation simpler) I would repeat each number as many times as the number itself, create a new vector containing all of these repeats, and then plot the result.
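For the curious, that first 0/1 attempt can be roughed out as below. This is only a sketch, not the code I actually ran: the helper name `is_prime` and the cutoff of 1,000 are stand-ins picked for illustration, and the primality check is plain trial division.

```r
# Hypothetical helper (not the original code): trial division up to sqrt(k).
is_prime <- function(k) {
  if (k < 2) return(FALSE)
  if (k < 4) return(TRUE)          # 2 and 3 are prime
  all(k %% 2:floor(sqrt(k)) != 0)  # no divisor found means prime
}

n <- 1000  # assumed upper limit
# 1 where the number is prime, 0 otherwise
indicator <- as.integer(vapply(1:n, is_prime, logical(1)))

# Numbers on the x-axis, the 0/1 coding on the y-axis
plot(1:n, indicator, type = "l",
     xlab = "number", ylab = "prime (1) or not (0)")
```

With a line plot the trace jumps between 0 and 1 at every prime, which is where the jagged mess comes from.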
Defunct code for better understanding:
This function says that for any number j in a given set of numbers (again, let’s say 1:10), output that number j times. So if I had the number 7, this function would give me a vector [7 7 7 7 7 7 7]’, or 7 repeated seven times. And if I ran it for all numbers 1 through 10, I’d get the vector
[1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 10]’.
Of course, I couldn’t get this function to work, but after screwing around a little bit more I finally figured out how to do the same thing for larger sets of numbers, including sets containing just primes.
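One way to get that repeat-by-itself vector in R is the built-in `rep()` with a `times` vector. This is a minimal sketch of the idea rather than the function I ended up with, and the wrapper name `repeat_by_value` is made up for this example.

```r
# Hypothetical wrapper: repeat each value in x as many times as the value itself,
# so 7 becomes 7 7 7 7 7 7 7.
repeat_by_value <- function(x) rep(x, times = x)

v <- repeat_by_value(1:10)
print(v)
length(v)  # 55, i.e. 1 + 2 + ... + 10
```

Because `rep()` is vectorized, the same call works on any vector of positive whole numbers, including a vector that contains only primes.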
But what would plotting vectors like this reveal about any prevalence patterns for the primes? Well, let’s look at the plot for all numbers, shall we?
This plot is for all numbers from 1 to 1,000.
It’s pretty! Nice and smooth. So this can be said to be the plot for numbers that follow a uniform or consistent pattern (every number in the list sits exactly one apart from the next; such is the nature of just listing the numbers 1 through 1,000).
Okay, that’s cool. So how about we look at a case where instances occur more “randomly”? In this case, I took a list of the numbers 1 through 1,000 and then went through and haphazardly deleted single numbers or large chunks of numbers so that I was left with a list that appeared to have numbers omitted at random.
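I did the deleting by hand, but an automated stand-in might look like the sketch below. The seed, the number of single deletions, and the chunk ranges are all arbitrary choices for illustration, and plotting each value against its position in the vector is an assumption about how to draw it.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
nums <- 1:1000

# Drop 150 scattered single numbers (the count is an arbitrary choice)...
kept <- nums[-sample(length(nums), 150)]
# ...then a few contiguous chunks (these ranges are arbitrary too).
kept <- kept[!(kept %in% c(200:260, 540:555, 800:870))]

# Repeat each surviving number by itself and plot value against position.
v_all  <- rep(nums, times = nums)
v_kept <- rep(kept, times = kept)
plot(v_all, type = "l", xlab = "position in vector", ylab = "number")
lines(v_kept, col = "red")  # the gappy list, for comparison
```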
Much choppier, eh? This can be said, then, to be a plot pattern for numbers that have an inconsistent or random pattern of deletion.
So what would a plot of the primes—say, all the primes below 5,000—look like?
So this plot looks a lot more like the plot for numbers 1:1,000 and a lot less like the plot involving random deletion. Interesting…I’d like to see what goes on with much larger primes, but unfortunately I can’t do that due to how huge the resulting vectors would be (each vector has as many entries as the numbers in it add up to). R + large datasets = trouble.
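Here is a rough sketch of the primes-below-5,000 plot, along with a check on how big the vector gets. It is an illustration rather than my exact code: the helper `primes_up_to` is a made-up name, and it uses a basic Sieve of Eratosthenes.

```r
# Hypothetical helper: Sieve of Eratosthenes, all primes up to n (assumes n >= 4).
primes_up_to <- function(n) {
  is_prime <- rep(TRUE, n)
  is_prime[1] <- FALSE
  for (i in 2:floor(sqrt(n))) {
    # Cross off multiples of each prime, starting from its square
    if (is_prime[i]) is_prime[seq(i * i, n, by = i)] <- FALSE
  }
  which(is_prime)
}

primes <- primes_up_to(4999)  # all primes below 5,000
v <- rep(primes, times = primes)
plot(v, type = "l", xlab = "position in vector", ylab = "prime")

# The vector has exactly as many entries as the primes add up to.
length(v) == sum(primes)  # TRUE
```

Since the vector length is the sum of the primes, it grows far faster than the count of primes does, which is why pushing the upper limit much higher gets painful.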