Calculating Prime Numbers

One of the areas of active research in mathematical computing this the calculation of prime numbers.

A few articles about that here and here

In any case, my interest in this topic scratches the surface at best I know enough about it to be aware of the Sieve of Eratosthenes
It’s a fairly old algorithm of finding primes. So I decided as a piece of Python practice to code my version by just looking at the pseudocode.

Rrom geeks for geeks: (https://www.geeksforgeeks.org/sieve-of-eratosthenes/)

Following is the algorithm to find all the prime numbers less than or equal to a given integer n by Eratosthenes’ method:

-Create a list of consecutive integers from 2 to n: (2, 3, 4, …, n).
-Initially, let p equal 2, the first prime number.
-Starting from p, count up in increments of p and mark each of these numbers greater than p itself in the list. These numbers will be 2p, 3p, 4p, etc.; note that some of them may have already been marked.
-Find the first number greater than p in the list that is not marked. If there was no such number, stop. Otherwise, let p now equal this number (which is the next prime), and repeat from step 3.

Here was my initial pass

def sieve(n):
	i = {}
	for x in range(2,n+1):
		i[x] = True
	#print(i)
	p = 2
 
	while p < n+1:
		temp = p
		while temp < n+1:
			temp = temp + p
			if temp < n+1:
				i[temp] = False
		p = p + 1
	out = []
	for j in i:
		if i[j]:
			out.append(j)
 
	print(out)
	print(len(out))

I tested it with the prime numbers up 100 first.

sieve(100)

The results were:

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
25

This was correct, but I felt unsatisfied with it. I took a few hours off and then realized what i didn’t like about it. It was using the algorithm but it was brute force at the same time. False was getting repeatedly set.

So my first correction was in the for loop and I updated it to

#sieve_firstcheck
			if temp < n+1 and i[temp] is True:
				i[temp] = False

I removed the list output so I could test a much larger number and it’s timing.

sieve_(10000)
sieve_firstcheck(10000)

and then

sieve(1000000)
sieve_firstcheck(1000000)

The results between the first and the updated check were interesting. The first test had the original code with faster results. I assumed this was due to the second check causing an additional instruction cycle.

1229
0.0295870304107666

1229
0.033675193786621094

The second test actually showed a speed improvement with the massive quantity of checks

78498
6.891968011856079

78498
6.706780910491943

I still felt that this wasn’t the improvement I was looking for. Then I realized I could put the check on a higher scope and came up with the following code

def sieve_check(n):
	i = {}
	for x in range(2,n+1):
		i[x] = True
	#print(i)
	p = 2
 
	while p < n+1:
		temp = p
		if i[temp] != False:
			while temp < n+1:
				temp = temp + p
				if temp < n+1 :
					i[temp] = False
		else:
			pass
		p = p + 1
	out = []
	for j in i:
		if i[j]:
			out.append(j)
 
	#print(out)
	print(len(out))

The results for this were much improved.

1229
0.013341188430786133

78498
1.760051965713501

For the first test it was 1/3 of the original timing and then for larger number it was actually 1/4 of the original timings. Clearly this was what I was looking for.