Some Derivations and Criticisms
Simpson Index

$\lambda = \sum_i p_i^2$, with $p_i = n_i / N$ and $n_i$ the abundance of species $i$

derivation: the probability to choose a specific species $i$ twice is $p_i^2$, and thus the probability that any two chosen specimens both belong to the same species is the sum $\sum_i p_i^2$ over all species $i$.
criticism: none; it is easy to understand and, as a true
probability, bounded between 0 and 1. Of course we shouldn't
interpret the "choosing" of a specimen as catching one (that has
definitely a different probability); rather we have to say
that the sample is representative of the population, which is
still questionable but difficult to avoid.
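As a sketch of the formula above (assuming abundances come as a plain list of counts; the function name is mine):

```python
# Simpson index: the probability that two randomly chosen specimens
# belong to the same species, lambda = sum of p_i^2
def simpson(abundances):
    n = sum(abundances)  # total number of specimens N
    return sum((a / n) ** 2 for a in abundances)
```

For a single species the index is exactly 1, and for $S$ equally abundant species it is $1/S$.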
Rarefaction

$E(S_n) = \sum_i \left( 1 - \binom{N - N_i}{n} \Big/ \binom{N}{n} \right)$

where $N_i$ is the
number of specimens for species $i$ in a
population, $N$ is the total
number of specimens and $n$ the size of
the sample (in specimens)

derivation:
the expectation value for the number of species to choose from an
abundance vector is the sum over the probabilities of all species $i$
in the population, which in turn is one minus the probability to
miss the species $i$. The probability to miss a species $i$ is equal to
the number of possible combinations in the sample without that
species, $\binom{N - N_i}{n}$, divided by the total number of possible
combinations, $\binom{N}{n}$.
criticism: we have to assume both that the abundance
distribution of our measured sample is representative of the
distribution in the population and that choosing a specimen from the
population is a Laplace experiment (all elementary probabilities are
equal). Both assumptions don't hold, and so applications of a proper
rarefaction require that we know in advance how close we are to
saturation, and thus are circular. What is especially dangerous is
that every rarefaction curve shows some sort of convergence, like in
this study. And ceterum censeo, I don't see how Monte Carlo
methods like jackknifing make sense if we have an analytical
formula.
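That analytical formula can be sketched directly (again a plain list of counts is assumed, with sample size $n \le N$; conveniently, `math.comb` returns 0 when the sample is larger than the specimens remaining after a species is removed, which is exactly what the "miss" probability needs):

```python
from math import comb

# expected number of species in a random sample of n specimens:
# E(S_n) = sum over species of 1 - C(N - N_i, n) / C(N, n)
def rarefaction(abundances, n):
    N = sum(abundances)  # total number of specimens; requires n <= N
    return sum(1 - comb(N - Ni, n) / comb(N, n) for Ni in abundances)
```

At $n = N$ every species is certain to appear, so the curve ends at the observed species count.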
Entropy

$H = -\sum_i p_i \ln p_i$, again with $p_i = n_i / N$ and $n_i$ the abundance of species $i$
derivation: the basic thought experiment in
classical thermodynamics starts with a set of distinguishable
species kept separately in different volumes $V_i$, which are allowed to
mix adiabatically (no heat is exchanged) by opening some valve. As
additionally the total volume is constant, the difference between
internal energy of final and initial state is zero for the whole
system:

$\mathrm{d}U = T\,\mathrm{d}S - p\,\mathrm{d}V = 0$,

with $U$, $T$, $S$, $p$, $V$
being internal energy, temperature, entropy, pressure, and volume in
turn. Using the ideal gas law $pV = nRT$, with
$n$ and $R$ being number of moles and the ideal gas constant, this means

$T\,\mathrm{d}S = p\,\mathrm{d}V = nRT\,\frac{\mathrm{d}V}{V}$,

and thus, integrating from $V_i$ to the total volume $V$, the difference in entropy for species $i$
is

$\Delta S_i = n_i R \ln\frac{V}{V_i} = -n_i R \ln x_i$

with the mole (= volume) fraction $x_i = V_i / V$, or as entropy change per mole

$\Delta S_i / n_i = -R \ln x_i$.

To get the total entropy of mixing we simply sum up over all
species, $\Delta S = -R \sum_i n_i \ln x_i$, or per mole of mixture

$\Delta S_{mix} = -R \sum_i x_i \ln x_i$,

which differs from the more metaphorical use above only by the ideal
gas constant $R$, which is set
to 1.
criticism: the derivation above should make clear that the
entropy of an ideal mixture has nothing to do with the actual
process of mixing: we get the same result if we just expand the
single (ideal gas) species separately from their volumes $V_i$ into the
total volume $V$.
What is actually causing the calculated increase of entropy is
simply the increase of phase space available to every species. With
interacting species, like polar molecules, this formula doesn't make
much sense, and if we deal with a chemical reaction, the entropy of
an ideal mixture usually has a negligible contribution. And flies do
react with each other... (The same arguments hold if we choose
Boltzmann's microstates for a derivation, or if we rename the term
entropy to the thereby defined "information content".)
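For completeness, the "metaphorical" form with $R$ set to 1 can be sketched as (function name mine; raw counts assumed):

```python
from math import log

# Shannon entropy H = -sum p_i ln p_i, with p_i = n_i / N and R = 1
def shannon_entropy(abundances):
    n = sum(abundances)
    return -sum((a / n) * log(a / n) for a in abundances if a > 0)
```

Species with zero abundance are skipped, following the usual convention $0 \ln 0 = 0$.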
Evenness

$J = H / H_{max} = \left( -\sum_i p_i \ln p_i \right) \Big/ \left( -\sum_i q_i \ln q_i \right)$, where $p_i$
is again
the probability to choose species $i$ in the
population and $q_i$ is the
probability to choose species $i$ if all
species were equally abundant (the uniform distribution)
derivation: to find $H_{max}$
we can't simply set the first derivative to zero, but we have to
take into account as a variational constraint that the $p_i$
must sum up
to 1. Thus we build the function

$F = -\sum_i p_i \ln p_i + \lambda \left( \sum_i p_i - 1 \right)$

with the Lagrangian multiplier $\lambda$
and set its total
derivative to zero:

$\mathrm{d}F = \sum_i \frac{\partial F}{\partial p_i}\,\mathrm{d}p_i = 0$.

Now we can vary all $p_i$
independently of each other and thus set all coefficients (partial
derivatives) separately to zero:

$\frac{\partial F}{\partial p_i} = -\ln p_i - 1 + \lambda = 0$.

Solving for $p_i$ we get

$p_i = e^{\lambda - 1}$,

which already shows that in the state of maximum entropy all $p_i$
must be the
same size, say $p$. More
precisely $\sum_i p_i = S\,p = 1$ and thus finally $p = 1/S$,
where $S$ is the number
of species (not specimens). The entropy for the uniform distribution
is then:

$H_{max} = -\sum_{i=1}^{S} \frac{1}{S} \ln \frac{1}{S} = \ln S$.
criticism: the slightly "unbiological" derivation above
shouldn't conceal the important point that evenness is just the
entropy of an ideal mixture, but with an upper bound of 1, thus
making it into something very close to a probability. Before delving
into all these technical details we should strongly consider
sticking to the Simpson index, which carries almost the same
information and is so much easier to understand.
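A minimal sketch of $J = H / \ln S$ (function name mine; a single species is treated as trivially even, which is a convention, not part of the formula above):

```python
from math import log

# evenness J = H / H_max with H_max = ln S, S = number of species
def evenness(abundances):
    species = [a for a in abundances if a > 0]
    S = len(species)
    if S <= 1:
        return 1.0  # convention: one species is trivially 'even'
    n = sum(species)
    H = -sum((a / n) * log(a / n) for a in species)
    return H / log(S)
```

For a uniform abundance vector the result is exactly 1; any skew pushes it below 1.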
Chao1

$S_{Chao1} = S_{obs} + \frac{F_1^2}{2 F_2}$

where $F_1$ is the
number of singletons in a sample (the number of species caught only
once) and $F_2$ the number
of doubletons

derivation: Chao, A. (1984) Non-parametric estimation of the
number of classes in a population. Scandinavian Journal of
Statistics, 11, 265–270.
criticism: not yet
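A sketch of the estimator nonetheless (the fallback for $F_2 = 0$ is Chao's bias-corrected variant, not part of the formula above):

```python
# Chao1 estimate of total species richness (Chao 1984)
def chao1(abundances):
    s_obs = sum(1 for a in abundances if a > 0)   # observed species
    f1 = sum(1 for a in abundances if a == 1)     # singletons
    f2 = sum(1 for a in abundances if a == 2)     # doubletons
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2          # bias-corrected fallback
    return s_obs + f1 ** 2 / (2 * f2)
```

With no singletons the estimate collapses to the observed richness.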
Are there Species?
In the '70s and '80s of the last century we had a broad
discussion (mostly in the mass media) about the biological species
concept, which left as its only trace today that this topic is
now "mega-out". But there has been no solution, and I don't think
it's a good idea to ignore this point. To me it sometimes appears
as if taxonomy and the species concept are like a black hole buried
deep inside of biology, which - once it becomes clear that there
is no species concept - could lead to the collapse of many fields
of modern biology.
In the simplest (i.e. mathematical) terminology a species is
an equivalence class, which is created by an equivalence relation
R. Such a relation needs just 3 properties: aRa is true
(reflexivity), aRb entails bRa (symmetry), and aRb together with
bRc entails aRc (transitivity). (A simple example for a transitive
relation is "smaller than" or "<": a < b and b < c
entails a < c.) These three properties are sufficient to ensure
that any set on which an equivalence relation is defined can be
partitioned into disjoint subsets (or classes): in our case the
species.
The problem in taxonomy is: we don't have such an equivalence
relation. What we use instead is the similarity relation - and
this is not transitive.
As an example let's talk about the most famous species
definition: 2 specimens from a population belong to the same
species if they can produce fertile progeny (- we skip the sex
discussion). Let's imagine the population is somehow ordered (due
to its gene pool or whatsoever) in a plane, with the more similar
specimens situated in the middle and the less similar ones farther
outside on the margins. Of course the specimens from the middle
will all belong to the same species. Furthermore let's assume that
specimens from the left margin belong to the same species as the
middle ones, as well as the specimens from the right margin. But
this does not entail that the specimens from left and right
margin are still similar enough to produce fertile offspring.
More precisely this means that if specimen a belongs to the same
species as H (the holotype) and H belongs to the same species as
b, this does NOT entail that a and b belong to the same species!
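The point can be made concrete in a toy sketch (the one-dimensional "gene pool" coordinates and the similarity threshold are of course entirely hypothetical):

```python
# a threshold similarity relation is reflexive and symmetric,
# but NOT transitive -- so it cannot partition a set into classes
def same_species(a, b, threshold=1.0):
    return abs(a - b) <= threshold  # 'similar enough for fertile offspring'

a, H, b = 0.0, 0.9, 1.8  # left margin, holotype, right margin

assert same_species(a, H)      # a belongs to the species of H
assert same_species(H, b)      # so does b
assert not same_species(a, b)  # yet a and b do not belong together
```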
And we are not talking about exceptions to a rule (the typical
excuse of biologists); we are talking about a concept which is
self-contradictory - unless we find a proper equivalence relation.