Given any text, we can rank-order its words and compute the probability of each word. The word that occurs most often, "the" in most English texts, is assigned rank $r = 1$; the second most frequent word has rank $r = 2$, and so on. The probability $P$ of a word is the number of occurrences of that word divided by the total number of words in the text.
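As a concrete illustration, here is a minimal Python sketch of this ranking procedure. The function name zipf_table and the tokenizer (lowercasing, keeping only runs of letters and apostrophes) are illustrative assumptions, not something specified above:

```python
from collections import Counter
import re

def zipf_table(text):
    """Return (rank, word, probability) triples, most frequent word first."""
    words = re.findall(r"[a-z']+", text.lower())  # crude tokenizer (an assumption)
    counts = Counter(words)
    total = sum(counts.values())
    return [(rank, word, n / total)
            for rank, (word, n) in enumerate(counts.most_common(), start=1)]

# Toy check: "the" is the most common word here, so it gets rank r = 1.
sample = "the quick brown fox jumps over the lazy dog and the cat"
for rank, word, p in zipf_table(sample)[:3]:
    print(rank, word, round(p, 3))  # 1 the 0.25, then the rank-2 and rank-3 words
```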
Zipf's empirical law of word frequencies is
$$P \propto \frac{1}{r}$$
That is, a plot of $\log P$ versus $\log r$ yields points falling (nearly) along a straight line of slope $-1$. As a first approximation, this law appears reasonably universal, holding for sufficiently long texts in languages besides English.
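A quick numerical check of this claim is an ordinary least-squares fit of $\log P$ against $\log r$. The sketch below assumes the probabilities are passed in rank order, for instance the third column produced by the hypothetical zipf_table above:

```python
import numpy as np

def zipf_slope(probabilities):
    """Least-squares slope of log P versus log r; Zipf's law predicts about -1.

    `probabilities` must be sorted from most to least frequent.
    """
    ranks = np.arange(1, len(probabilities) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(probabilities), 1)
    return slope
```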
A better approximation is
$$P = \frac{F}{(r + V)^{1/D}}$$
where $V$ and $D$ are constants and $F$ is fixed by normalization. This law can be understood in terms of the scaling properties of a lexicographic tree, with $D$ playing the role of a dimension.
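To estimate the parameters from data, one option is a nonlinear least-squares fit. The sketch below uses scipy.optimize.curve_fit; the starting point p0 and the positivity bounds are assumptions that may need tuning for a particular text:

```python
import numpy as np
from scipy.optimize import curve_fit

def zipf_mandelbrot(r, F, V, D):
    """P(r) = F / (r + V)**(1/D), the refined law above."""
    return F / (r + V) ** (1.0 / D)

def fit_zipf_mandelbrot(probabilities):
    """Estimate (F, V, D) from rank-ordered word probabilities."""
    ranks = np.arange(1, len(probabilities) + 1, dtype=float)
    probs = np.asarray(probabilities, dtype=float)
    # Positivity bounds keep (r + V)**(1/D) well defined during the search;
    # the starting point p0 is a guess, not a value taken from the text.
    params, _cov = curve_fit(zipf_mandelbrot, ranks, probs,
                             p0=(0.1, 1.0, 1.0),
                             bounds=([0.0, 0.0, 0.01],
                                     [np.inf, np.inf, np.inf]))
    return params  # array [F, V, D]
```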