Forum Paper
Corresponding author: Riccardo Guarino (riccardo.guarino@unipa.it)
Academic editor: Wolfgang Willner
© 2022 Riccardo Guarino, Marina Guccione, François Gillet.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Guarino R, Guccione M, Gillet F (2022) Plant communities, synusiae and the arithmetic of a sustainable classification. Vegetation Classification and Survey 3: 7-13. https://doi.org/10.3897/VCS.60951
We propose an equation to evaluate the efficiency of a classification as a function of the effort required and the population size of data collectors. The formula postulates a “classification efficiency coefficient”, which relates not only to the complexity of the object to be classified, but also to the data availability and representativeness. When applied to the classification of phytocoenoses, the equation suggests that a classification system based on vascular plants offers the best compromise between sampling effort, resolution power and data availability. We discuss the possibility of basing a vegetation classification on plot records for all macroscopic photoautotrophic organisms co-occurring in the vertical projection of a given ground area, as recently suggested by some authors. We argue that the inclusion of cryptogams in the description of phytocoenoses dominated by vascular plants should rely on a synusial approach, conceived as complementary to the traditional Braun-Blanquet approach.
Syntaxonomic reference:
Keywords: classification, holocoenosis, merocoenosis, phytosociology, synusia, vegetation
Classification is one of the most fundamental and characteristic activities of the human mind and underlies all forms of science.
The classification of the biotic communities (or biocoenoses) is based upon the observation that the distribution of living organisms in their environment is not entirely subject to chance. In most terrestrial ecosystems, vascular plants are the most visible and accessible part of biocoenoses that include, in addition to the primary producers (photoautotrophic organisms of any kind), also consumers, detritivores, decomposers and microbial symbiotic communities, of which we have become more aware in recent times as a result of sequencing techniques.
In the case of vascular plants, the dispersal of seeds may be somewhat random, but germination and seedling establishment are regulated by environmental constraints and the plants come to organize themselves into communities in which relationships of coexistence regulate the species distribution in space (patterns and frequency), in time (phenology and turnover), and in many other aspects of plant life.
Based on the assumption that non-vascular plants can be important structural elements of vegetation,
In principle, this proposal is based on a reasonable assumption. However, the sampling effort of ‘all-inclusive’ phytocoenoses is significantly higher than that of recording vascular plants only and, for the sake of a classification,
The aim of this paper is to propose a mathematical formulation for classification efficiency and to discuss some practical and epistemological consequences when applying the recommendations of
Before reasoning on the methodological consequences of recording the terricolous cryptogam layer when sampling vegetation plots, let’s try to pose the question (i.e. the vegetation classification) in stringent, arithmetical terms. Let’s consider the formula:
$$ i = \frac{P(n,v)}{C(n,v)} \tag{1} $$
in which: C (n,v) indicates the complexity of the object to be classified, and P (n,v) indicates the detected fraction of that complexity obtained through data sampling.
In operational terms, C (n,v) is a function space whose main vector quantities are: (1) the total number of species n occurring in a given area, corresponding to the local species pool targeted by the vegetation classification (always, and in any case, a subset of the local biota) and (2) the number of vegetation units v that can be distinguished in a given area and in a given time interval. Note that C (n,v) is generically defined as a function space, i.e. a set of functions between n and v.
The generic definition of C (n,v) is rather vague; however, if we assume that the vegetation v of a given area consists of discrete units (community types) formed by different assemblages of the n target species occurring in the same area (species pool), then we can associate with each species Sk (with k = 1, 2, …, n) a simplified version of the phi coefficient:
$$ C_k = 1 - \frac{v_k}{v} \tag{2} $$
in which: vk is the number of vegetation units in which Sk is present in a given area, and v is the total number of vegetation units in the same area.
It follows that 0 ≤ Ck < 1. In particular, Ck = 0 if vk = v, i.e., if the species Sk occurs in all the vegetation units; in that case, its contribution to differentiating the vegetation units is null. The value Ck = 1 is excluded because we assume that the species Sk is part of the species pool of the given area and, as such, has to be present in at least one of the vegetation units of that area.
Given the definition of Ck we can define the complexity coefficient C (n,v) as the average of all Ck values:
$$ C(n,v) = \frac{1}{n} \sum_{k=1}^{n} C_k \tag{3} $$
The range of C (n,v) will likewise be 0 ≤ C (n,v) < 1. In particular, C (n,v) = 0 in the case of an entirely homogeneous vegetation in the given area. Therefore, the complexity coefficient C (n,v) is conceptually similar to a measure of beta diversity.
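As a minimal sketch of the complexity coefficient, the following Python snippet computes Ck and C (n,v) from a small presence table. It assumes the form Ck = 1 − vk/v, which is consistent with the bounds stated above (Ck = 0 when vk = v, and Ck < 1 because vk ≥ 1). The function names and the toy species pool are hypothetical and serve only as an illustration.

```python
from typing import Dict, Set


def species_complexity(units_with_species: int, total_units: int) -> float:
    """C_k under the assumed form C_k = 1 - v_k / v."""
    if total_units < 1:
        raise ValueError("v must be at least 1")
    if not 1 <= units_with_species <= total_units:
        raise ValueError("v_k must satisfy 1 <= v_k <= v")
    return 1.0 - units_with_species / total_units


def complexity_coefficient(occurrences: Dict[str, Set[str]], total_units: int) -> float:
    """C(n,v): the mean of C_k over the n species of the pool."""
    if not occurrences:
        raise ValueError("the species pool must not be empty")
    values = [
        species_complexity(len(units), total_units)
        for units in occurrences.values()
    ]
    return sum(values) / len(values)


# Hypothetical toy pool: species -> vegetation units in which it occurs (v = 4).
toy_pool = {
    "species_a": {"unit_1", "unit_2", "unit_3", "unit_4"},  # ubiquitous: C_k = 0
    "species_b": {"unit_1", "unit_2"},                      # C_k = 0.5
    "species_c": {"unit_3"},                                # C_k = 0.75
}
toy_complexity = complexity_coefficient(toy_pool, total_units=4)
print(toy_complexity)
```

A ubiquitous species contributes nothing to the mean, whereas a species confined to a single unit contributes the most, which is why the coefficient behaves like a rough measure of beta diversity.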
If we define the vegetation as a sum of vegetation units or community types, ideally the number of plots should be large enough to record each vegetation unit at least once (purposive sampling design).
P (n,v) indicates the detected fraction of complexity, defined by the quantity of data available, i.e., how much the number of sampled plots and species recorded are functional to the classification. The function P (n,v) represents the ‘added value’ of a given sampling effort (in other words, the ‘added value’ produced by the classification in question). Again, if we assume that the vegetation v of a given area consists of discrete units (community types) formed by different assemblages of the n species belonging to the local species pool, we can write for the coefficient P (n,v) a heuristic expression containing: a) the ratio (neff /n), in which neff is the number of species recorded during the sampling effort; b) the ratio (veff /v), in which veff is the number of vegetation units identified.
We can impose the condition that, for neff = n and veff = v, the ratio (1) is equal to 1. If so, we can write:
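$$ P(n,v) = C(n,v) \left( r_n \frac{n_{\mathrm{eff}}}{n} + r_v \frac{v_{\mathrm{eff}}}{v} \right) \tag{4} $$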
in which rn and rv are weighting factors subject to the following constraint: rn + rv = 1. The weighting factors rn and rv can be used to weight differently the species and the vegetation units identified by the sampling effort. If the vegetation scientists involved are highly skilled in identifying any species belonging to the species pool on which they intend to base their classification, one can simply set rn = rv = 1/2. Should it be decided to base the vegetation classification on all macroscopic photoautotrophic organisms co-occurring in the vertical projection of a given ground area, the condition rn = rv will occur very rarely. Therefore, through the number of species neff and the number of vegetation units veff identified, P (n,v) depends on the number of plots of a given size sampled in a given time interval in the given area.
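The detected fraction can be sketched in the same way. The snippet below assumes that P (n,v) equals C (n,v) multiplied by the weighted sum of the two ratios neff/n and veff/v, a form chosen so that P (n,v) = C (n,v) when sampling is complete and rn + rv = 1. The function name and all numbers are hypothetical.

```python
def detected_fraction(
    complexity: float,
    n_eff: int,
    n: int,
    v_eff: int,
    v: int,
    r_n: float = 0.5,
    r_v: float = 0.5,
) -> float:
    """P(n,v) under the assumed form C(n,v) * (r_n * n_eff/n + r_v * v_eff/v)."""
    if abs(r_n + r_v - 1.0) > 1e-9:
        raise ValueError("the weights must satisfy r_n + r_v = 1")
    if n < 1 or v < 1:
        raise ValueError("n and v must be at least 1")
    if not (0 <= n_eff <= n and 0 <= v_eff <= v):
        raise ValueError("n_eff and v_eff cannot exceed n and v")
    return complexity * (r_n * n_eff / n + r_v * v_eff / v)


# Hypothetical survey: 60 of 100 species and 3 of 4 units detected, equal weights.
partial_p = detected_fraction(complexity=0.4, n_eff=60, n=100, v_eff=3, v=4)

# Same survey, but with species records weighted less than unit records.
skewed_p = detected_fraction(
    complexity=0.4, n_eff=60, n=100, v_eff=3, v=4, r_n=0.2, r_v=0.8
)

# Complete sampling recovers the full complexity.
complete_p = detected_fraction(complexity=0.4, n_eff=100, n=100, v_eff=4, v=4)
print(partial_p, skewed_p, complete_p)
```

Lowering rn relative to rv is one way to express that species identification is less reliable than the recognition of vegetation units, as may happen when cryptogams are included in the target species pool.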
As for i, it measures the impact (effectiveness) of a classification effort. This coefficient is directly proportional to the value of P (n,v) and inversely proportional to that of C (n,v). In practice, it indicates whether the classification in question ‘works’ (given the aims and protocols) at the price of a greater or lesser sampling effort. More precisely, it is a coefficient of effectiveness of the plots sampled in a given area. In summary, i can be defined as the ‘classification efficiency coefficient’.
In particular, i will be equal to 1 when P (n,v) = C (n,v); which happens if neff = n and veff = v, that is when the information on the species pool and the vegetation units of a given area obtained by sampling is complete. Also, i will tend to 0 for neff << n and for veff << v; in this case, sampling is essentially ineffective. It should be noted that i will be equal to 0 also if C (n,v) = 0, corresponding to the limit case where the vegetation of the given area is entirely homogeneous and corresponds to a single vegetation unit.
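The boundary cases just described can be checked with a short sketch. It assumes that i is the ratio of P (n,v) to C (n,v), with P (n,v) taken as C (n,v) times the weighted sum of neff/n and veff/v, and it sets i = 0 for C (n,v) = 0, following the limit case stated in the text. Under these assumptions i reduces to the weighted sampling completeness whenever C (n,v) > 0. The function name and the numbers are hypothetical.

```python
def classification_efficiency(
    complexity: float,
    n_eff: int,
    n: int,
    v_eff: int,
    v: int,
    r_n: float = 0.5,
    r_v: float = 0.5,
) -> float:
    """i = P(n,v) / C(n,v), with i = 0 for an entirely homogeneous vegetation."""
    if abs(r_n + r_v - 1.0) > 1e-9:
        raise ValueError("the weights must satisfy r_n + r_v = 1")
    if complexity == 0:
        # Limit case from the text: a single vegetation unit, nothing to classify.
        return 0.0
    detected = complexity * (r_n * n_eff / n + r_v * v_eff / v)
    return detected / complexity


# Complete sampling: n_eff = n and v_eff = v.
i_complete = classification_efficiency(0.4, n_eff=100, n=100, v_eff=4, v=4)

# Sparse sampling: n_eff << n and v_eff << v.
i_sparse = classification_efficiency(0.4, n_eff=5, n=100, v_eff=1, v=20)

# Homogeneous vegetation: C(n,v) = 0.
i_homogeneous = classification_efficiency(0.0, n_eff=10, n=10, v_eff=1, v=1)
print(i_complete, i_sparse, i_homogeneous)
```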
The ‘classification efficiency coefficient’ is highly influenced by the ‘cost’ of each single plot, provided that the identification effort of the species recorded during the survey can be different and not necessarily homogeneous with respect to the general purposes of any classification approach.
Just as any classification effort, materialiter acceptus, can be associated with a certain level of efficiency in the identification effort of the descriptors for the object to be classified, every single vegetation plot can be associated with a cost, corresponding to a fraction of the utility produced by the classification as a whole (precisely, the fraction that manages to classify that plot). Additionally, we can write that:
$$ P(n,v) \propto F \cdot r $$
in which F is the population of vegetation scientists and r is the average number of plot records produced per capita, so that:
$$ i \propto \frac{F \cdot r}{C(n,v)} $$
Therefore, if we disregard the theoretical possibilities offered by machine-based approaches, such as remote-sensing, spectral fingerprinting, bulk collection by robots and subsequent metabarcoding, the efficiency (and sustainability) of a vegetation classification is inversely proportional to the complexity of the classification target and directly proportional to the size of the population of vegetation scientists multiplied by the average number of plot records produced per capita.
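The proportionality described above lends itself to a simple comparison between two sampling scenarios. The sketch below assumes that i is proportional to F · r / C (n,v) with the same (unknown) constant in both scenarios, so that only the ratio of the two efficiencies is meaningful. The function name and all figures are hypothetical and are not derived from any real dataset.

```python
def relative_efficiency(
    scientists_a: float,
    plots_per_capita_a: float,
    complexity_a: float,
    scientists_b: float,
    plots_per_capita_b: float,
    complexity_b: float,
) -> float:
    """Ratio i_a / i_b, assuming i is proportional to F * r / C(n,v)."""
    if complexity_a <= 0 or complexity_b <= 0:
        raise ValueError("C(n,v) must be positive in both scenarios")
    if scientists_b * plots_per_capita_b <= 0:
        raise ValueError("scenario B must produce at least some plot records")
    effort_a = scientists_a * plots_per_capita_a / complexity_a
    effort_b = scientists_b * plots_per_capita_b / complexity_b
    return effort_a / effort_b


# Hypothetical scenario A: many collectors, fast plots, lower target complexity.
# Hypothetical scenario B: few collectors, slow plots, higher target complexity.
ratio = relative_efficiency(
    scientists_a=100, plots_per_capita_a=40, complexity_a=0.5,
    scientists_b=20, plots_per_capita_b=10, complexity_b=0.8,
)
print(ratio)
```

With these invented figures, scenario A is about 32 times as efficient as scenario B; the point of the sketch is only that fewer skilled collectors, fewer plots per collector and a more complex target all lower the efficiency together.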
The above equations are valid from the global to the local scale, with the only limitation given by the availability of (skilled) vegetation scientists and of species identification tools for the target territory. These two aspects, of course, are of particular importance due to the well-known enormous regional variance in data availability and resource expenditures.
The phytosociological approach to vegetation classification is based on operational units which have a very practical goal, that is to give a reasonably precise name and conceptualization to plant communities which appear, to some extent, discrete to the eyes of phytosociologists.
In principle, the traditional Braun-Blanquet system is based on all photoautotrophic taxa. However, a different weight is attributed to the vegetation layers in the classification and, apart from a few exceptions, the bulk of data underlying the phytosociological system focuses on vascular plant species only. Between the two possible extremes, i.e. a taxon-free, physiognomic vegetation classification on the one hand and an omnicomprehensive vegetation classification (i.e., based on all photoautotrophic taxa) on the other hand, the current phytosociological classification system offers perhaps the best compromise between sampling effort, resolution power and data availability.
In the previous section, we introduced the coefficient i as a generic measure of the effectiveness of a classification effort. However, it must be noted that the variables considered do not fully capture the effectiveness and sustainability of any vegetation classification. There are other, somewhat ‘finer’ variables that cannot be treated with the same arithmetic simplicity. Nevertheless, it should be stressed that i does not only depend on the complexity of the object to be classified, but also on the data availability and representativeness. In fact, the great attention currently paid by vegetation scientists to ‘big data’ – both in the current debate and in comprehensive synthesis studies – indicates that no proposal on new data acquisition methods can afford to ignore the ‘big’, represented by previously recorded data.
Vegetation scientists are relatively few, and those dealing with phytosociological vegetation classification are even fewer, and many of these are familiar with vascular plants only. Not only is the number of phytosociologists progressively decreasing, but also the time dedicated to field data collection.
Given the variables involved in our arithmetical definition of a “sustainable” vegetation classification, we will now turn our attention to some practical and epistemological consequences of basing the phytosociological system “as much as possible” on holocoenoses (i.e. on plot records of all macroscopic photoautotrophic organisms co-occurring in the vertical projection of a given ground area).
If we accept the eminently practical purpose of the phytosociological vegetation classification, we should ask ourselves what advantages or disadvantages a more complete, but more time-demanding, sampling approach would have.
As we have seen, the classification itself should be evaluated based on its efficiency (corresponding to what we defined as i), but also on the skills and size of the population of vegetation scientists who collect the data and produce the classification itself.
If, for the sake of completeness in the data collection and classification, one wanted to extend the investigation to the whole autotrophic component of the local biota, the classification would be based on species that are biologically, physiologically, metabolically and dimensionally different from each other. This raises many questions about the optimal sampling period, the extra time required for plot sampling, and the availability of data collectors skilled enough to record all macroscopic photoautotrophic organisms occurring in the plot.
The recently revised version of the International Code of Phytosociological Nomenclature (henceforth: ICPN;
In any classification system, it is a clear advantage to maintain as much as possible the nomenclatural stability and the conceptual delimitation of the classified objects. Should the practice of recording all macroscopic photoautotrophic organisms in vegetation plots become a stringent rule of the phytosociological classification, there is a serious risk of rejecting many syntaxa as nomina dubia because “only vascular plants have been recorded, but also the species of the moss layer would be needed for proper classification”.
The decision of whether the species of the moss (or lichen) layer are needed for a “proper” classification is further complicated because some vegetation units change their “properties” depending on the substrate they are found on. For instance, let’s consider the vegetation ascribed to the class Polypodietea. Patches of this bryo-pteridophytic vegetation can colonize a boulder in the forest understorey, the bark of ancient trees in the same forest, but also hundreds of square meters of vertical cliffs in cool and shady gorges and even man-made stone walls. Should it be considered a synusia (or merocoenosis) when in the forest and a holocoenosis when it occurs on vertical cliffs?
Synusial phytosociologists argue that, from an ecological point of view, the sampling grain (observational scale) for floristic plot records should be logically related to the organismic scale, which is always the case for plant synusiae, while for phytocoenoses the choice of the plot size is usually based on the largest plants (e.g., trees in forests). Breaking with current practice in both Braun-Blanquetian and synusial phytosociology,
Hoping to record all plant species, including cryptogams, in a relatively large area is, in most cases, a pious wish. As a matter of fact, a good knowledge of cryptogams is quite rare among vegetation scientists. As a result, only some well-known and easily identifiable species will be recorded.
The ancient vision of science, dating back to Bacon, is based on the idea that external events or ‘facts’ can be observed in a neutral way and classified to build scientific theories by induction and deduction. This vision was definitively superseded by Immanuel Kant and his successors. According to Karl
Therefore, there cannot be an objective classification of reality, but each researcher interprets reality starting from ideas, categories and mental schemes that are tested and possibly corrected (Popper’s famous “falsification”) as errors are detected and new tools become available.
One could argue that Popper would have said that phytosociology has no piles at all, that it is not falsifiable and therefore not a good science. The methods in phytosociology are not suitable to “Erklärung”, only to “Verstehen”. This is because whatever classification you make, it will never be a good representation of nature, only of the mind of the researcher, directed by the aims and goals of the classification. The paradigm of phytosociologists is that of the “Spurensucher” (i.e. trace-tracker);
In the second half of the 20th century, the debate on philosophical thought was occupied by the rehabilitation of Aristotle’s “practical philosophy” (to which the “Indizienwissenschaften” belong), pursued above all by the German and Anglo-American schools of Hans-Georg Gadamer, Hannah Arendt and Bernard Williams. Aristotle was the first to outline, as the object of a specific form of knowledge, exactly that praxis with which classifications of any kind are primarily concerned. A fundamental criterion for marking the domain of praxis is that “the principle of actions is in the agent” (Metaphysics VI, section 1025b, transl. by Hugh Tredennick). Any classification belongs to the domain of practical knowledge and, unlike theoretical knowledge, it is not as useful for satisfying theoretical speculations (in an Aristotelian sense). Instead, it is used to satisfy eidetic and poietic needs. In other words: classifications should have a practical goal and must have a utility for the “agents”, i.e. those who use them. As pointed out by
For the sole purpose of a taxon-based vegetation classification, the most important thing is to collect enough data on species co-occurrences. This means that collecting many species co-occurrences, even if not particularly complete or accurate, is more useful than collecting few extremely accurate and comprehensive ones.
Science will produce useful and essential knowledge only when it classifies objects and makes predictions based on statistically significant datasets, analysed according to adequate protocols. Field data collection should exercise the art of the feasible more than the art of the possible: the adoption of a sampling protocol aimed at recording all macroscopic photoautotrophic organisms co-occurring in the plot would require time and a whole series of tests to assess its pros and cons against a sampling approach chasing higher plot numbers more than plot completeness.
However, recording co-occurrences of all macroscopic photoautotrophic organisms is not only a matter of effort but also of strategy and conventional rules: if enough projects followed the “comprehensive” sampling approach, sooner or later we would get enough data to better assess the added value of including cryptogams in the classification of phytocoenoses.
In the field, it is a good idea to make an effort to collect the best possible data, given the time, logistical, and resource constraints. The recording of non-vascular taxa can represent important added value in studies on the drivers of α-diversity, as well as species-area curves used to study fine-grain β-diversity (Löbel et al. 2016;
R.G. conceived the idea and led the writing of this paper, with substantial contributions from both other authors.
We gratefully acknowledge the comments and suggestions from Wolfgang Willner and three anonymous reviewers on the first version of this paper.