The Calculations
W.N. (Willem) Ellis - Werkgroep Vlinderfaunistiek (the Dutch Working Group Lepidoptera Faunistics)
The database = Noctua
The flight curves, abundances and related figures are calculated from the data in our database ‘Noctua’, a faunistic database covering the butterflies and moths of the Netherlands. We also gratefully use the faunistic moth records submitted to the website www.Waarneming.nl, which are made available to us. Doubtful or still unvalidated records are flagged as such and play no role in any calculation. Much care is taken to prevent the insertion of duplicate observations into the database.
At the time of writing (September 2010) Noctua contains slightly fewer than 2.6 million observations. About one hundred thousand records are added each year, which enables us to monitor in detail the rise and decline of the species. The relevant calculations are described below.
Number of specimens
The number of specimens per species per observation is essential for estimating the species’ abundance. When an observation gives only a non-numerical estimate (‘many’, ‘some’), this is automatically translated into a best-fitting numerical estimate. When no numerical information is given at all, 1 specimen is assumed.
A common problem with counts is that a few exceptionally high numbers dominate the overall picture. The usual practice therefore is to use the log value instead: x’ = log(x+1).
Because only a small fraction of the larvae will reach adulthood, the transformation used for pre-imaginal stages is x’ = log[(x/100)+1].
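The two transformations above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for Noctua: the function name `transform_count` and the `pre_imaginal` flag are hypothetical, and the base-10 logarithm is an assumption, since the text does not state which base is used.

```python
import math


def transform_count(x, pre_imaginal=False):
    """Return the transformed count x' for a raw specimen count x.

    Adults:              x' = log(x + 1)
    Pre-imaginal stages: x' = log(x/100 + 1)
    Assumption: base-10 logarithm (the base is not given in the text).
    """
    if pre_imaginal:
        x = x / 100.0
    return math.log10(x + 1)


# Nine adults and nine hundred larvae carry the same weight.
adults = transform_count(9)
larvae = transform_count(900, pre_imaginal=True)
```

A count of zero maps to zero under both transformations, which is the reason for adding 1 before taking the logarithm.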
A species is either a macrolep, or a micro, or a butterfly. These three categories are not sampled in a fully comparable way. Moreover, most observers feel at ease with only one or two of the categories. Calculations therefore generally are done for each category separately.
Number of Collection Events
A collection event is the fact that at least one lep species was seen or collected by an observer on a given day at a given locality. The number of collection events, per period and/or per region, is calculated for each category separately.
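Counting collection events amounts to counting distinct combinations of observer, day and locality within each category. The sketch below assumes a hypothetical record layout of `(observer, date, locality, category)` tuples; the actual structure of the Noctua records is not described here.

```python
def count_collection_events(records):
    """Count collection events per category.

    records: iterable of (observer, date, locality, category) tuples,
             one tuple per observation (hypothetical layout).
    Several observations by the same observer on the same day at the
    same locality together form a single collection event.
    """
    events = {}
    for observer, date, locality, category in records:
        events.setdefault(category, set()).add((observer, date, locality))
    return {category: len(keys) for category, keys in events.items()}


records = [
    ("obs1", "2010-05-01", "Amsterdam", "macro"),
    ("obs1", "2010-05-01", "Amsterdam", "macro"),  # same event, second species
    ("obs1", "2010-05-02", "Amsterdam", "macro"),
    ("obs2", "2010-05-01", "Amsterdam", "micro"),
]
events_per_category = count_collection_events(records)
```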
Abundance, log-abundance and rarity
The abundance of a species is essentially calculated as the number of specimens seen per collection event, in other words as the total of the (transformed) numbers observed (∑x’) divided by the total number of collection events (CE). Because CE is always much larger than ∑x’, the result begins with a series of zeroes; to avoid that, the result is multiplied by 10000.
Obviously, a collection event in late autumn is irrelevant for the calculation of the abundance of a spring species. In the calculations for each species, therefore, only the collection events that fall within the species’ flight period are considered.
When calculations concern a small region and/or a short period, the denominator, CE, is often fairly small; small variations in ∑x’ are then blown up excessively. Therefore the formula finally used is: abundance = 10000*∑x’/(CE+100/CE). The addition of 100/CE has no noticeable effect when CE is large, but dampens the outliers when it is small.
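The damping effect of the term 100/CE is easiest to see with numbers. The following sketch implements the formula as given; the function name `abundance` and the convention of returning 0 when there are no collection events are assumptions made for the illustration.

```python
def abundance(sum_transformed, ce):
    """abundance = 10000 * sum(x') / (CE + 100/CE).

    sum_transformed: total of the transformed counts x' of the species.
    ce: number of collection events within the species' flight period.
    Assumption: without any collection events the abundance is taken as 0.
    """
    if ce <= 0:
        return 0.0
    return 10000 * sum_transformed / (ce + 100 / ce)


small_sample = abundance(2.0, 10)        # denominator 10 + 10 = 20
large_sample = abundance(2000.0, 10000)  # denominator 10000 + 0.01
```

With CE = 10 and ∑x’ = 2 the undamped value would be 2000, whereas the formula gives 1000. With CE = 10000 the correction adds only 0.01 to the denominator.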
The resulting abundance number per species varies from below 1 to far above 1000. When several species are to be compared, it is often advisable to use log(abundance) instead, or to calculate the performance. Rarity is the inverse of abundance and, when needed at all, can simply be computed as 1/abundance or, better, 1/log(abundance).
The larger the number of years in which a species has been recorded in a given plot, the higher the probability that the species has a healthy population there. In that estimate, however, recent years count more heavily than earlier ones. We can introduce that weighting into the calculation of abundance by giving a weight to each contribution to ∑x’. When a period of years between F(irst year) and L(ast year) is considered, the weight for a given year y is w = (y - F + 1)/(L - F + 1). The weighted abundance is then calculated as 10000*∑w*x’/(CE+100/CE).
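As an illustration of the year weighting, the sketch below computes w for each year and applies it to the transformed counts. The function names and the `(year, x')` input layout are hypothetical; the formulas are those given above.

```python
def year_weight(year, first, last):
    """w = (y - F + 1) / (L - F + 1): rises linearly from 1/(L-F+1) to 1."""
    return (year - first + 1) / (last - first + 1)


def weighted_abundance(observations, ce, first, last):
    """10000 * sum(w * x') / (CE + 100/CE).

    observations: iterable of (year, x_transformed) pairs (hypothetical layout).
    ce: number of collection events, assumed to be larger than zero.
    """
    total = sum(year_weight(year, first, last) * x for year, x in observations)
    return 10000 * total / (ce + 100 / ce)


# For the period 2001-2010 the first year weighs 0.1 and the last year 1.0.
w_first = year_weight(2001, 2001, 2010)
w_last = year_weight(2010, 2001, 2010)
example = weighted_abundance([(2001, 1.0), (2010, 1.0)], 10, 2001, 2010)
```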
From aaa to rrr
A comprehensive description of the commonness of a species should be based on the nationwide weighted abundance (t) and the number of grid cells from which the species is known (u). The value of u can be made more precise, because the weighted abundance in each grid cell is known (w); then u’ = ∑w over all grid cells. The two can be combined as z = t*u’. To express z in words, the species of a category are sorted in descending order of their z-value. When there are n species, the first n/6 species score ‘aaa’, the second n/6 score ‘aa’, etc.
aaa = very common, aa = common, a = not so common, z = quite rare, zz = rare, zzz = very rare.
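The ranking into six classes can be sketched as follows. The text does not say how the classes are filled when n is not a multiple of 6; the integer division used here is an assumption, as are the names `combined_z` and `commonness_classes`.

```python
LABELS = ["aaa", "aa", "a", "z", "zz", "zzz"]


def combined_z(t, cell_abundances):
    """z = t * u', with u' the sum of the weighted abundances per grid cell."""
    return t * sum(cell_abundances)


def commonness_classes(z_values):
    """Assign each species of one category to one of six classes.

    z_values: dict mapping species name to its z-value.
    Species are sorted in descending order of z; the i-th species
    (counting from 0) of n gets class number i*6 // n.
    Assumption: this is how uneven sixths are handled.
    """
    ranked = sorted(z_values, key=z_values.get, reverse=True)
    n = len(ranked)
    return {species: LABELS[i * 6 // n] for i, species in enumerate(ranked)}


z_values = {"sp1": 60.0, "sp2": 50.0, "sp3": 40.0,
            "sp4": 30.0, "sp5": 20.0, "sp6": 10.0}
classes = commonness_classes(z_values)
```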
A series of a species’ yearly abundance values can be standardised by subtracting from each value the average, and dividing the result by the standard deviation: p = (x-av)/sd. The resulting values, called the species’ annual performance, are comparable between species, even if they differ in their overall abundance.
In a similar way the abundance values for a number of regions can be made comparable between species; the transformed values are called the preference.
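The standardisation used for performance and preference is the ordinary z-score. In the sketch below the sample standard deviation is used; whether the sample or the population standard deviation is meant is not stated, so this choice is an assumption, as is the function name `standardise`.

```python
import statistics


def standardise(values):
    """Return p = (x - av) / sd for each x in values.

    values: abundance values of one species, per year (performance)
            or per region (preference); at least two values are needed.
    Assumptions: sample standard deviation; a constant series yields zeros.
    """
    av = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(x - av) / sd for x in values]


# A rare and a common species with the same relative pattern
# get the same performance values.
rare = standardise([1.0, 2.0, 3.0])
common = standardise([100.0, 200.0, 300.0])
```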
Calculation of the abundance, as described above, is impossible when most data stem from publications or collections. This is because no collection events can be reconstructed from the data. Moreover almost invariably the more common species are severely underrepresented. 'Presence' offers an alternative, although imperfect, measure.
The calculation starts with a table: species vertically, years (regions, etc) horizontally. The total number of specimens for a species (row) is called r, the total number for a year (column) is k; the grand total is N.
The proportion of a species in the total fauna is r/N. One might then expect that its number in a given column with total k will be r*k/N. This expected value can be compared with the observed value by calculating log(obs/exp), which algebraically equals log((obs*N)/(r*k)). Because obs will often be zero, in which case the logarithm fails, a small adaptation is needed: log(((obs+1)*N)/(r*k)).
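The presence calculation can be sketched for a small table as follows. The dictionary layout and the name `presence_table` are hypothetical, the base-10 logarithm is an assumption, and the sketch assumes that every row total and every column total is larger than zero.

```python
import math


def presence_table(counts):
    """Return log(((obs + 1) * N) / (r * k)) for every cell of the table.

    counts: dict mapping species to a list of specimen counts, one per
            column (year, region, ...); all lists have the same length.
    Assumptions: base-10 logarithm; all row and column totals are nonzero.
    """
    species = list(counts)
    n_cols = len(counts[species[0]])
    r = {sp: sum(counts[sp]) for sp in species}                         # row totals
    k = [sum(counts[sp][j] for sp in species) for j in range(n_cols)]   # column totals
    grand_total = sum(r.values())                                       # N
    return {
        sp: [math.log10((counts[sp][j] + 1) * grand_total / (r[sp] * k[j]))
             for j in range(n_cols)]
        for sp in species
    }


counts = {"species A": [4, 0], "species B": [1, 5]}
presence = presence_table(counts)
```

In this example species A is overrepresented in the first column (positive value) and, despite an observed count of zero, still gets a finite negative value in the second.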
Trend, change per period
A measure of the degree to which a species declines or increases is given by the trend, obtained as the linear regression coefficient. More often, the predicted values for the first and last year of a period are calculated from the regression, and the change is expressed as a percentage of the start value.
The long-term perspective of a species, when conditions remain unchanged, depends on its abundance and the trend - the same declining trend is more threatening for a rare species than for a common one. These two elements can simply be combined as: perspective = 2*log(abundance) + 100*trend.
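Trend, change per period and perspective can be illustrated with an ordinary least-squares fit. The sketch assumes that the regression is done on yearly values against the year and that the logarithm is base 10; neither is stated in the text, and all function names are hypothetical.

```python
import math


def linear_fit(years, values):
    """Ordinary least-squares fit; returns (slope, intercept).

    The slope is the trend. At least two different years are needed.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percent_change(years, values):
    """Change between the fitted first and last year, as % of the start value."""
    slope, intercept = linear_fit(years, values)
    start = intercept + slope * min(years)
    end = intercept + slope * max(years)
    return 100 * (end - start) / start


def perspective(abundance, trend):
    """perspective = 2*log(abundance) + 100*trend (base 10 is an assumption)."""
    return 2 * math.log10(abundance) + 100 * trend


trend, _ = linear_fit([2000, 2001, 2002], [10.0, 8.0, 6.0])
change = percent_change([2000, 2001, 2002], [10.0, 8.0, 6.0])
outlook = perspective(100.0, -0.01)
```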
Biodiversity of an area
As the contribution of a given species to the biodiversity of an area, the value b = 1 + log(u/n) is taken. Here u is the abundance of the species in the area, and n its abundance nationwide (both in the period 1980-present). The biodiversity of the area is then ∑b.
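A minimal sketch of the biodiversity sum follows. It assumes that only species actually present in the area (u > 0) contribute, because the logarithm is undefined for u = 0, and it uses the base-10 logarithm; both points are assumptions, as are the function name and the dictionary layout.

```python
import math


def biodiversity(area_abundance, national_abundance):
    """Sum of b = 1 + log(u/n) over the species of an area.

    area_abundance: dict mapping species to its abundance u in the area.
    national_abundance: dict mapping species to its nationwide abundance n.
    Assumptions: base-10 logarithm; species with u = 0 or n = 0 are skipped.
    """
    total = 0.0
    for species, u in area_abundance.items():
        n = national_abundance[species]
        if u > 0 and n > 0:
            total += 1 + math.log10(u / n)
    return total


area = {"species A": 10.0, "species B": 1.0, "species C": 0.0}
nation = {"species A": 10.0, "species B": 10.0, "species C": 5.0}
score = biodiversity(area, nation)
```

A species that is as abundant in the area as in the country as a whole contributes 1; in this example species B, at one tenth of its national abundance, contributes 0.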
Last modified: 18 September 2013