Why do 30% of all positive integers start with 1?

Thomas - what data - what set of data are you and others referring to? Can you be specific please - I assume you mean some or all of the tests done in the field of statistics (not my major).

I shouldn't comment on what Amazon suggests to you - my maths degree was in Pure maths and half that was number theory - I did no stats till my masters.

I studied number theory, that's where the proof originated! I remember talking to some stats honour students - they'd never heard of the law at all!

We proved the law without referring to any real-world data - as I said, I studied Pure not Applied mathematics. I am surprised you keep trying to link and limit this back to the real world. BTW the work done before Hill showed it was scale invariant, if that makes you happier.

Hill's law is derived from a base 10 number system; if we were using octal numbering, the numbers would still converge.

Read Hill's proof - then you'd be in a more informed position.

0 Replies

Likewise bookmarking for learning

0 Replies

g__day wrote:

Thomas - what data - what set of data are you and others referring to?

Take your computer program for instance. _You_ decide to generate random integers in some particular way. Given your decision, Benford's law correctly predicts the distribution of the numbers thus generated. But nothing forbids me from generating random integers in some other way -- as a string of random digits, say -- and Benford's law will predict the outcome incorrectly. Pure number theory has no way of knowing whether your way of generating random integers is worthier than mine. The outcome entirely depends on your, or my, arbitrary decision about how to randomize and how to sample.
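A minimal Python sketch of the point about generation schemes, under assumptions that are not in the thread: scheme A draws six-digit integers uniformly (a string of random digits with a nonzero first digit), scheme B draws integers whose logarithm is uniform, and the names `leading_digit`, `first_digit_shares`, `uniform_ints` and `log_uniform_ints` are illustrative only.

```python
import random
from collections import Counter

def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(n)[0])

def first_digit_shares(samples):
    """Map each leading digit 1..9 to its share of the samples."""
    counts = Counter(leading_digit(n) for n in samples)
    total = len(samples)
    return {d: counts[d] / total for d in range(1, 10)}

rng = random.Random(0)  # fixed seed so the run is repeatable
N = 100_000             # sample size (an arbitrary choice)

# Scheme A (assumption): six-digit integers drawn uniformly.
uniform_ints = [rng.randint(100_000, 999_999) for _ in range(N)]

# Scheme B (assumption): integers whose logarithm is uniform,
# i.e. spread evenly across six orders of magnitude.
log_uniform_ints = [int(10 ** rng.uniform(0, 6)) for _ in range(N)]

shares_a = first_digit_shares(uniform_ints)
shares_b = first_digit_shares(log_uniform_ints)

for d in range(1, 10):
    print(f"{d}: uniform {shares_a[d]:.3f}   log-uniform {shares_b[d]:.3f}")
```

Under these assumptions scheme A gives each leading digit roughly one ninth of the samples, while scheme B puts roughly 30% on the digit 1, so the outcome does depend on how the integers are generated.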

g__day wrote:

Read Hill's proof - then you'd be in a more informed position.

Okay, let's talk about procedure for a moment. On the internet, nobody knows you're a dog, but nobody knows you're a mathematician either. This means that whether you're a mathematician or not, you don't get to call yourself a mathematician and just use that authority to pronounce "The theorem is proven, case closed". You don't get to hand out reading assignments like the above one either. If you are unwilling to defend your point without bossing people around, then you have a problem. Discussions don't work that way around here.

Just my opinion, of course.

0 Replies

If I were to randomly pick a maximum for a subset of numbers, it stands to reason that a one will be eligible as the starting digit in every drawing from that particular subset. A two will be eligible less often, because it cannot be eligible absent the one. Likewise, a three will be even less likely, because it cannot be eligible absent the one's and the two's eligibility. And so on down the line until you get to the nine, which is, of course, only eligible in approximately 1/9th of the drawings. Since a one is available in every drawing, it may come up anywhere from about 1 in 2 picks to 1 in 9 picks, depending on the randomly chosen subset. That certainly gives the one a huge advantage when you consider that the 9 can only be eligible a maximum of 1 in 9 picks. Of course, every other number's frequency of eligibility will reside between that of the 1 and the 9. Since the number one begins every subset AND is first to reappear, it stands to reason that it will always be the most frequently picked, regardless of how large the subset is, up to infinity. I couldn't write a formula to prove this to save my life (2 hours of trying), but if the above gibberish makes any sense at all, this should close the case.
ONE can be eligible anywhere from 11.11% (as in 1 to 999) to 55.78% (as in 1 to 199) of the time.
TWO can be eligible anywhere from 11.11% (as in 1 to 999) to 37.12% (as in 1 to 299) of the time.
THREE can be eligible anywhere from 11.11% (as in 1 to 999) to 27.82% (as in 1 to 399) of the time.
And so on. The above samples are rough estimates, because the more digits you have, the more accurate it becomes. (It will never move a full percentage point from this example.)
This will occur with random numbers regardless of how big, because each time you add a digit, the lower numbers renew their advantage. Infinity? Maybe not. But any random drawing of real numbers will have to abide by this rule (eventually ;-) ).
Now, will one of you smart guys please show me the formula?
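The percentages above can be checked with a short Python sketch. It assumes "eligible" means the share of the integers 1..N whose first digit is d; the helper names `count_leading` and `share_leading` are made up for illustration and are not from the thread.

```python
def count_leading(d: int, n_max: int) -> int:
    """Count the integers in 1..n_max whose first decimal digit is d."""
    count = 0
    power = 1
    while d * power <= n_max:
        lo = d * power                        # smallest number in this block
        hi = min((d + 1) * power - 1, n_max)  # largest one not above n_max
        count += hi - lo + 1
        power *= 10
    return count

def share_leading(d: int, n_max: int) -> float:
    """Share of 1..n_max that starts with digit d."""
    return count_leading(d, n_max) / n_max

for d, cutoff in [(1, 199), (2, 299), (3, 399)]:
    low = share_leading(d, 999)
    high = share_leading(d, cutoff)
    print(f"digit {d}: {low:.2%} at N=999, {high:.2%} at N={cutoff}")
```

This reproduces 55.78%, 37.12% and 27.82% at the cutoffs 199, 299 and 399, and 11.11% at 999. It also suggests that 11.11% is a true lower bound only for the digit 1: for example, `share_leading(2, 199)` is 11/199, about 5.5%.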

0 Replies

g__day,

This axiom is about sets.

Thomas,

Can you explain what you mean about "randomly generating another way"?

;-)

0 Replies

Seems to me, if a "random number generator" is programmed to generate data sets that approximate "real-world" data, then it takes this law into account when it generates that data. That would be the point of the generator, no? And so what it generated should conform to the law.

Now, if I use the "random" number generator in MS Excel to test this, I don't find that any leading digit is overrepresented, at least not as extremely as the law suggests. This doesn't tell me the law is wrong, but that the numbers generated by Excel should not be used to simulate real data sets. (In fact, there appears to be no common distribution whatsoever, on glancing over the results.)

0 Replies

flawed post deleted.

0 Replies

patiodog wrote:

Seems to me, if a "random number generator" is programmed to generate data sets that approximate "real-world" data, then it takes this law into account when it generates that data. That would be the point of the generator, no? And so what it generated should conform to the law.

I'm not sure what you are saying, but a truly random generation of sets will follow this law. Bill explained it.

I'm thinking someone should post the train station riddle here.

0 Replies

Craven de Kere wrote:

I'm not sure what you are saying, but a truly random generation of sets will follow this law. Bill explained it.

Could you please define "Truly random"?

0 Replies

Thomas wrote:

Here is one example:

Step 1: Create all integers between 0 and 9
-> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Natural numbers generated so far: 10; distribution of first digits: even.

Step 2: Create all integers between 10 and 99
-> {10, 11, 12, .... 97, 98, 99}
Natural numbers generated so far: 100; distribution of first digits: even.

Step 3: Create all integers between 100 and 999
-> {100, 101, 102, ..., 997, 998, 999}
Natural numbers generated so far: 1000; distribution of first digits: even.

and so forth. In the n-th step, you generate all natural numbers with n digits. After every step, the number of numbers generated so far is 10^n, and the distribution of first digits is even. Let n run from 1 to infinity, and you have produced all natural numbers there are.

Sure, this outcome depends on my idiosyncratic choice of generating integers and of when to calculate the distribution of first digits. But so does g__day's. As a matter of number theory, there is no good reason why one way of generating and sampling is better than the other. That doesn't make Benford's law wrong, but it makes it a law about generating and sampling integer values, not about any intrinsic features of natural numbers.
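The stepwise construction quoted above can be replayed in a few lines of Python. This is only a sketch: it leaves out 0 (which has no leading digit between 1 and 9), and the function name `first_digit_counts` is hypothetical.

```python
from collections import Counter

def first_digit_counts(n_digits: int) -> Counter:
    """Count leading digits over all positive integers with at most n_digits digits."""
    return Counter(int(str(k)[0]) for k in range(1, 10 ** n_digits))

for n in range(1, 5):
    counts = first_digit_counts(n)
    # After step n every digit 1..9 has been the leading digit equally often.
    print(n, sorted(counts.items()))
```

When the tally is taken exactly at 9, 99, 999, ... every leading digit has the same count, which is the "even" distribution described in the quoted steps. Tallying at any other cutoff breaks the tie.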

Sorry Thomas, I was being facetious and thought the emoticon illustrated that. I understand that very well and was commenting on the fact that it's not really random that way.

True randomness is not limited. What g__day isn't considering is that any 'random' calculation a computer is capable of is a randomness within limits (that can also be random but also defined with limits).

This axiom is about sets. And it's true when the set limitations are calculated in a random manner (as in "real life"), not when the integer itself is truly random (and no mechanism exists that can simulate truly random numbers).

0 Replies

Thomas wrote:

Could you please define "Truly random"?

Without influencing limitation. It's a paradox that doesn't exist in our universe.

So any "randomness" is "random within the limitations delineated".

We're saying the same thing but in different ways.

0 Replies

Ah, so you've responded to it before I retracted it. Of course my sequence is anything but random, so it doesn't even illustrate the point I was making.

0 Replies

We're getting closer to being on the same page, and I think I might have even misunderstood you earlier.

My position: This axiom is about sets. That's the only way it ever makes sense because it relies on sequence.

You can randomly define any number, and from that point down the set is created (the first limitation delineated is 0).

When they are randomly defined they will follow this axiom. Even if there is truly random generation of the ceiling.

Randomly defined numbers will not fit this axiom (in that they themselves will not have the 1 be more prevalent, because sequence in a set is needed to render it true) without the limitation inherent to the universe that randomness doesn't exist, so all "randomness" we illustrate is within an already defined set (defined by CPU and RAM for a computer, e.g.).

But even in truly random theory, sets will follow this rule as long as it's based on a random ceiling and with preserved sequence.

So this is why integers in real life as we experience it (which is always limited to sets) follow this rule.

They are always in sets because humans are incapable of experiencing and comprehending infinity.

0 Replies

Thomas

I also used Bruce Ellis of Bell Labs' string-of-bits-from-noise function (it samples whether the sum of keystrokes occurring in 10 seconds across a university department is even or odd to determine whether a bit in a number is zero or one, one bit at a time, to generate large numbers). I generated a table of big numbers that way and analysed the table; it concurs with this theorem, so there goes your POV :)

Further, holding a view that this LAW is unproven means you doubt what the University of Georgia publishes about its maths professors. You should read the answer at its source if you want to see a sterling proof. Do you contend that the proof Hill came up with is incomplete or inaccurate, or is there no rational reason for your views?

Also, your method is actually the first process that was attempted and then shown to be false! Why, when it looks so good? Well, as you change the group size you get different answers for each size as you let N approach infinity. So if you generalise your formula to be limited from 1 -> a^N(n + 1) for all values of a, this formula should converge to your answer if your answer is uniquely true.

But for different values of a, these types of summation converge to all different results, showing by contradiction that you can't test it this way.
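A small Python sketch of how the answer depends on the group size. It assumes cutoffs of the form a * 10^k, which is one reading of the argument rather than the formula written above, and the helper name `count_leading_one` is hypothetical.

```python
def count_leading_one(n_max: int) -> int:
    """Count the integers in 1..n_max whose first decimal digit is 1."""
    count, power = 0, 1
    while power <= n_max:
        # Numbers power .. 2*power - 1 start with 1; clip the block at n_max.
        count += min(2 * power - 1, n_max) - power + 1
        power *= 10
    return count

for a in (1, 2, 5):
    shares = [count_leading_one(a * 10 ** k) / (a * 10 ** k) for k in (3, 6, 9)]
    print(f"a={a}:", [round(s, 4) for s in shares])
```

Along these cutoffs the share of numbers starting with 1 settles near 1/9 for a = 1, 5/9 for a = 2 and 2/9 for a = 5, so a plain count over 1..N has no single limit.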

Craven

It's only about sets in that it was originally posted as a number theory axiom to describe the entire infinite set of positive integers. It was never, ever taught to apply to finite subsets by Pure mathematicians, although that's where applied mathematicians quickly took it to make use of it.

When I spoke of generators, I was speaking of the term generators of a set - like the way our 4 DNA components G A T C generate our entire DNA strings. From number theory you can apply a theorem to the generators of a set to show how it affects the set as a whole.

Counting photons received per period on a collector to set bits in a number is a pretty good random number generator. And I made no statement as to what my assumptions about the randomness of random number generators are - so it's incorrect to assume I am unaware of their limitations or distributions.

* * *

You need to forget randomness in this discussion and consider the entire set of positive integers to understand the original theorem. My small programming example has had the unfortunate side effect of making folk think of statistical analysis rather than applying number theory correctly to an infinite set of numbers.

OCCOM BILL

You'd be better off arguing that most random number generators that don't have a real-world noise function take advantage of an ill-conditioned function to generate big numbers with a specified distribution, then mod them by primes to get remainders - it's the underlying distribution function and the effects of modding by primes that you should examine to investigate bias.

The original number theory analysis looked at all numbers from group P1 = {x: in 1 -> N} as N heads to infinity and counted digit occurrence in P(i) for digits 1 -> 9. Then this was checked against the group P2 = {x: in 1 -> 2N} as N heads to infinity, and then P3 = {x: in 1 -> 9N} as N heads to infinity, etc., to show that you had convergence for digit(i) in each group - but the answers were all different for each group!

Then you consider P = {x: 1-> round(log(aN))} where a = any constant as N heads to infinity. This converges to the same numbers for all values of a: digit(i) occurs with frequency log((i+1)/i), with the logarithm taken to base 10.

No distribution functions there!
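For reference, the limiting frequencies log((i+1)/i) can be tabulated with a short Python sketch. It takes the logarithm to base 10, and the dictionary name `benford` is chosen here for illustration.

```python
import math

# Frequency of leading digit i under the log((i+1)/i) rule, base 10.
benford = {i: math.log10((i + 1) / i) for i in range(1, 10)}

for i, p in benford.items():
    print(f"{i}: {p:.1%}")

# The nine terms telescope to log10(10) = 1, so they form a full distribution.
print("total:", round(sum(benford.values()), 12))
```

The digit 1 comes out at about 30.1%, which is the figure in the thread title, and the digit 9 at about 4.6%.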

0 Replies

g__day,

It needs sets (even if the ceiling is randomly designated) because it relies on sequentialism.

Any random number generator you envision will have inherent factors that will serve as a ceiling.

0 Replies

Craven

I'd call it a group rather than a set, but an infinite one that must be the group of all positive integers.

Forget random number generators once and for all please! You don't use random number generators to solve Fermat's last theorem (you extend the Taniyama-Shimura conjecture) any more than you do for this problem!

0 Replies

Hi, I REALLY must read this thread.

0 Replies

g__day wrote:

Forget random number generators once and for all! You don't use random number generators to solve Fermat's last theorem (you extend the Taniyama-Shimura conjecture) any more than you do for this problem!

g__day, it's not sounding like one for you..

My mention of the inherent ceiling or limitation to any "random number generator" was a response to you when you yourself mentioned what you think is a "good random number generator".

g__day wrote:

Counting photons received per period on a collector to set bits in a number is a pretty good random number generator.

I'm perfectly willing to "forget" it but don't think you should consider it out of the ordinary for your comments about them to be commented on.

0 Replies

Hmm... I'm starting to understand. I am curious though, do you only define the chance as 30% for starting with a one, as the average in random sets, or the average between every given set (or group); or do you apply the 30% to the average of all numbers, regardless of grouping? It seems I've heard both statements, and I am still a bit confused if it is the latter case. So, simply, does this only apply to the average of randomly selected numbers, or all numbers in general...

I think in trying to word that I figured out the answer...

0 Replies

Craven - apologies

I wish I could slap myself in the mouth for ever mentioning random number generators!

It's an interesting aside, I agree, but it distracts from the heart of the matter, I now realise.

* * *

On the subject of random number generators - ones that use a mathematical function deliver pseudo-random numbers. Ones that use a noise function (e.g. counting photons) pick up the distribution function of the underlying noise pattern. Combinations of the two still bear these traits.

Saying any function is a good random number generator is really saying it generates very pseudo-random numbers compared to other pseudo-random number generators.

Let's explore this in a new thread?

SCoates

It's the precise result for the infinite set of all positive numbers. It is observing an infinity. It can also be shown to apply to many subsets - but that's the applied phenomenon and use that Thomas and others are legitimately discussing.

0 Replies
