An un-conference session saw a lively discussion of the definition of 'sample', exploring whether the disciplines represented in the meeting had a shared understanding of its essential characteristics. The concept of 'sampling-feature' from the Observations and Measurements standard (ISO 19156, available as OGC Abstract Specification Topic 20) and the Observations Data Model (ODM2) provided a starting point for the definition here.

Cross-domain definition: 

Thing or subset (real or virtual) which is intended to be representative of a larger thing or set. A sample is created or selected for the purpose of making observations or measurements to determine the value of one or more properties, traits or qualities of the larger thing. The essential property of a sample is the 'is-sample-of' relationship with the larger thing. 

A material sample or specimen is typically selected to allow ex-situ observations. 

A statistical sample or subset is a specified number of individuals from a population. 

A spatio-temporal sample may have the same dimension but a smaller extent than the thing which it represents, or may have a lower topologic dimension than the thing which it represents. 

Samples are often related to other samples, either through sub-sampling, or as members of a sequence of siblings, or through specific treatments that transform the original sample into one or more new samples. Sub-sampling may select specific elements of the parent, or may be unbiased. The nature of the relationship is important, but the complete set of relationships cannot be enumerated - this is an important area of innovation in observational protocols. 
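One way to picture these relationships is as typed links between sample records. The Python sketch below is illustrative only: the class names `Sample` and `SampleRelation`, the identifiers and the relation labels are assumptions made here, not terms taken from O&M or ODM2. The relation type is deliberately left as free text, since the set of relationships is open-ended.

```python
from dataclasses import dataclass, field


@dataclass
class SampleRelation:
    """A typed link from one sample to a related sample (hypothetical structure)."""
    relation_type: str  # free text, e.g. "sub-sample-of", "sibling-of", "treated-from"
    target: "Sample"


@dataclass
class Sample:
    identifier: str
    related: list = field(default_factory=list)

    def relate(self, relation_type: str, target: "Sample") -> None:
        """Record a relationship from this sample to another sample."""
        self.related.append(SampleRelation(relation_type, target))

    def targets(self, relation_type: str) -> list:
        """Identifiers of the samples linked by the given relation type."""
        return [r.target.identifier for r in self.related
                if r.relation_type == relation_type]


# Hypothetical example: a core is sliced, and one slice is then treated.
core = Sample("core-1")
slice_a = Sample("core-1/slice-a")
slice_b = Sample("core-1/slice-b")
powder = Sample("core-1/slice-a/powder")

slice_a.relate("sub-sample-of", core)     # sub-sampling
slice_b.relate("sub-sample-of", core)
slice_a.relate("sibling-of", slice_b)     # member of a sequence of siblings
powder.relate("treated-from", slice_a)    # treatment producing a new sample
```

A controlled vocabulary could replace the free-text labels where a community has agreed one, but a closed enumeration would not fit the open-ended nature of the relationships described above.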

The identity, or even existence, of the larger thing or population may not be known at sampling time. For example, a material sample or specimen may be collected opportunistically 'because it looks interesting'. However, its scientific significance is only realized if it is determined that it is indeed representative of a larger body or population. 
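As a rough sketch of how a record might allow for this, the 'is-sample-of' property can be left empty at collection time and filled in later. The class name `Specimen`, the field names and the example values below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Specimen:
    identifier: str
    # Names of the larger things this item is taken to represent.
    # Empty until a determination has been made.
    is_sample_of: list = field(default_factory=list)

    def has_known_target(self) -> bool:
        """True once at least one 'is-sample-of' target has been recorded."""
        return len(self.is_sample_of) > 0


# Collected opportunistically: the larger thing is not yet known.
find = Specimen("field-find-7")
known_at_collection = find.has_known_target()

# Later determination (hypothetical value) supplies the target.
find.is_sample_of.append("outcrop A granite body")
known_after_study = find.has_known_target()
```

The relationship exists in the model from the start; only its value is deferred.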

Examples: 

Spatio-temporal samples - lower dimensionality

  • Snapshot of a time-varying phenomenon
  • Point (0-D sample) within a curve (1-D), surface (2-D) or volume (3-D), 
    • e.g. monitoring station, pixel
  • Curve (1-D) within a surface (2-D) or volume (3-D), 
    • e.g. transect, flightline, cruise track, borehole trace
  • Surface (2-D) within a volume (3-D)
    • e.g. cross-section

Spatio-temporal samples - same dimensionality

  • interval within a curve
  • interval within a time-series
  • quadrat within a tract
  • core from a body of rock.
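The two groups of examples above can be told apart by comparing the dimension of the sample with that of the thing it represents. The following Python function is a minimal sketch of that comparison; the function name and the returned labels are assumptions made here, not part of any standard.

```python
def classify_spatiotemporal_sample(sample_dim: int, sampled_dim: int) -> str:
    """Classify a sample by comparing its dimension with that of the larger thing.

    sample_dim  -- dimension of the sample (0 = point, 1 = curve, 2 = surface, 3 = volume)
    sampled_dim -- dimension of the thing the sample represents
    """
    if sample_dim < 0 or sampled_dim < 0:
        raise ValueError("dimensions must be non-negative")
    if sample_dim > sampled_dim:
        raise ValueError("a sample cannot have a higher dimension than the thing it represents")
    if sample_dim < sampled_dim:
        return "lower dimensionality"
    return "same dimensionality, smaller extent"


# Examples drawn from the lists above.
station = classify_spatiotemporal_sample(0, 3)        # monitoring station within a volume
cross_section = classify_spatiotemporal_sample(2, 3)  # surface within a volume
quadrat = classify_spatiotemporal_sample(2, 2)        # quadrat within a tract
core = classify_spatiotemporal_sample(3, 3)           # core from a body of rock
```

Dimension alone does not establish that the extent is smaller; in the same-dimensionality case that still has to be checked against the actual geometries or time intervals.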

3 Comments

  1. Hobern, Donald (GBIF) wrote (What is a sample? Day 2): 

    This may be pedantry, but I have a feeling that the concept of "specimen" within biological sciences may relate to something which is not necessarily strictly "intended to be representative of a larger thing or set" or at least not as a proxy for a set that is known or recognisable to the researcher at the time of collection.  Some sampling methods gather individual organisms which are managed as specimens prior to any determination of what they may represent.

    So is the issue:

    1. one day we hope to know what the 'bigger thing' is (e.g. species, population) though this is not known at the time of collection - "it is an interesting looking thing that may provide evidence for something else later", or 
    2. "it is just an interesting looking thing" in its own right and is currently not intended to proxy for anything else. 

    Case 1. is no problem - we just defer providing a value for the 'is sample of' property. Also recognising that a sample may proxy for multiple things (possibly in different combinations with other samples) so the cardinality of 'is sample of' is 1..* . 

    For case 2. I would suggest that this individual is not a sample (or specimen) in the sense we are trying to define here. Bissett, Andrew (O&A, Hobart) has made a similar case in an email. To my mind the 'is sample of' relation is essential, even if its target is not yet known. If this does not apply, then it is just a thing, not a sample (or specimen). If you like to refer to everything that you make observations on as a 'specimen' then perhaps the technical definition deviates from vernacular English here. In O&M and SOSA/SSN we call that case a 'feature of interest', where the word 'feature' is being used in the GIS sense - a thing in the world. 

  2. It's a bit convoluted, but I like it.

  3. You could also argue that in case 2 above ('just an interesting thing'), a biologist is likely to consider something living as interesting, whereas a geologist is likely to find some inanimate thing in or from the earth interesting. Thus the object of the 'isSampleOf' relation could be a very general thing like 'living thing', or 'geologic feature'.