[Facts] Re: Popularity Analysis Question
in reply to a message by ChanaRose
I once computed the entropy of the US names from the SSA data, and it increases from 8.2 bits in 1880 to 11.5 bits in 2015. The increase wasn't monotonic: at the end of the 1930s the entropy temporarily decreased. Of course, the values aren't exact (because the SSA lists are cut off at 5 babies per gender and year), but the trend is clear: names become more evenly distributed and more informative over time. While the total number of name types seems to have peaked in 2008, the entropy continues to rise after that year.
Replies
That's really cool!
What do you mean by entropy? Is that a measure of how many different names are used, or of how different they are from each other somehow?
After googling some more it seems it's a measure of the number of ways to arrange a system, so that should be similar to the number of different names. I think. Is that right?
Intuitively, entropy measures how close a distribution is to uniform: it is maximal for a perfectly uniform distribution and goes to zero for a strongly peaked one. It also grows with the number of available names, but this dependence is rather weak. If one wants to get rid of it completely, one can normalise the entropy; the resulting measure is known as the Shannon Equitability Index, with a range from 0 (all in one peak) to 1 (perfectly uniform distribution).
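
In case a concrete calculation helps, here is a minimal Python sketch of both measures. The function names and the toy counts are made up for illustration (they are not SSA figures), and I'm assuming the entropy is taken in bits (log base 2) over the share of babies given each name.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    # Names with a zero count contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def equitability(counts):
    """Entropy divided by its maximum, log2(number of name types in use)."""
    k = sum(1 for c in counts if c > 0)
    if k <= 1:
        return 0.0  # a single name: everything sits in one peak
    return shannon_entropy(counts) / math.log2(k)

# Hypothetical toy data: four names, 100 babies each time.
even = [25, 25, 25, 25]   # perfectly uniform
peaked = [97, 1, 1, 1]    # almost everyone gets the same name

print(shannon_entropy(even), equitability(even))
print(shannon_entropy(peaked), equitability(peaked))
```

With four equally popular names the entropy is 2 bits and the equitability is 1. In the peaked case the entropy drops to about 0.24 bits, even though the number of name types is the same.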