
How do you know that some data or a service is 'OzNomic'? We need a rating system, based on a clear set of criteria, so you know when you've won. A number of rating systems and maturity models for data publication and sharing have been proposed, some of which build up through simple, incremental levels. In the table below we consider a number of criteria for data providers, with multiple levels within each criterion, recognizing that systems may have different strengths. As more of these criteria are met at a higher level, the data is more 'OzNomic'.

Alignment with a number of other systems is shown in additional columns of the table. Of particular interest are the FAIR principles, developed by Force11 and maintained through CODATA. The table below provides more specific metrics (or examples) of how the FAIR principles might be met, in order to give data providers concrete goals to aim for. This is particularly the case for loadable, useable, and comprehensible, where we have included specific suggestions around formats and technologies, drawn mostly from our experience with geospatial data. All the FAIR principles are covered by the OzNome Data Ratings criteria, and some more besides: published, updated/maintained, and trusted do not correspond to any principle from the original FAIR set. Note also that the OzNome criteria are presented here in a sequence that attempts to build logically through some key top-level concerns, starting with 'is this data private or public?'. That means the OzNome sequence does not match F→A→I→R, but when all is said and done, the FAIR ordering was chosen mostly to make a cute acronym anyway.

The 5-star OzNome Data tool allows you to self-assess your data service against these facets.

A similar FAIR self-assessment tool is provided by ANDS. 

The criteria are set out below. For each key word we give: what the dataset must be; the matching FAIR principle, where one exists; the levels within the criterion (the levels are not mentioned in the FAIR self-assessment tool); the matching 5-star LOD facet; the matching NEII conformance axis; and indicative ratings for four example services: CSIRO DAP data, netCDF data on THREDDS, ASRIS and ALA.

published
Dataset must be ... intended to be accessible to users other than the creator or owner
Matching FAIR principle: (none)
Levels:
a. No external access
b. External access, non-web protocol (e.g. physical media distribution)
c. Published via the web
Matching 5-star LOD facet: -
Matching NEII conformance axis: Discovery
Example ratings: CSIRO DAP data: c; netCDF data on THREDDS: c; ASRIS: c; ALA: c

hosted
Dataset must be ... available on the web
Matching FAIR principle: A1
Levels:
a. not on web
b. files on web-server
c. repository with web interface
d. web service - local API
e. RESTful web service - OpenAPI/Swagger
f. standard web API (SPARQL, OGC WMS/WFS/WCS/SOS/WPS, ...)
Matching 5-star LOD facet: -
Matching NEII conformance axis: Access - geographies; Access - gridded data
Example ratings: CSIRO DAP data: c (, d); netCDF data on THREDDS: f; ASRIS: c, d; ALA: e
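
To make level f concrete: a standard web API is one whose request pattern is defined by a community standard, so a generic client can use it without bespoke integration. A minimal sketch, assuming a hypothetical OGC WMS endpoint (the URL below is made up), of the usual first call against such a service:

import requests

# Hypothetical endpoint for illustration only; a real level-f service publishes its own URL.
endpoint = "https://data.example.org/geoserver/wms"

# GetCapabilities is the standard first request against any OGC WMS: it returns an XML
# document describing the layers, formats and operations the service supports.
params = {
    "service": "WMS",
    "request": "GetCapabilities",
    "version": "1.3.0",
}

response = requests.get(endpoint, params=params, timeout=30)
print(response.status_code)
print(response.text[:300])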

curated
Dataset must be ... provided with a commitment that this data will be available long term
Matching FAIR principle: A2
Levels:
a. once-off dump, no ongoing commitment
b. best effort
c. institutional repository
d. certified repository
Matching 5-star LOD facet: -
Matching NEII conformance axis: Re-use - operational
Example ratings: CSIRO DAP data: c; netCDF data on THREDDS: ?; ASRIS: c; ALA: b

updated, maintained
Dataset must be ... part of a regular data collection program or series, with clear maintenance arrangements and update schedule
Matching FAIR principle: (none)
Levels:
a. one-time dataset
b. part of data series, occasional/irregular update
c. part of data series with regular updates
Matching 5-star LOD facet: -
Matching NEII conformance axis: Re-use - operational
Example ratings: CSIRO DAP data: -; netCDF data on THREDDS: ?; ASRIS: -; ALA: b

licensed
Dataset must be ... clearly licensed, so that the conditions for re-use are available and unambiguously expressed
Matching FAIR principle: R1.1
Levels:
a. no licence
b. licence described in text
c. standard licence (e.g. Creative Commons)
Matching 5-star LOD facet: -
Matching NEII conformance axis: Re-use - licensing
Example ratings: CSIRO DAP data: c; netCDF data on THREDDS: ?; ASRIS: c; ALA: c

citeable
Dataset must be ... denoted using a stable, persistent identifier
Matching FAIR principle: F1
Levels:
a. Not citeable
b. Local identifier (may change)
c. Web identifier (transient URL or query)
d. Persistent web identifier (PURL, DOI, handle, ARK, etc.)
Matching 5-star LOD facet: -
Matching NEII conformance axis: -
Example ratings: CSIRO DAP data: d; netCDF data on THREDDS: c; ASRIS: c; ALA: d
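
To illustrate what level d buys you in practice: a persistent identifier is resolved through a resolver service, so the citation keeps working even when the hosting URL changes. A minimal sketch, using a placeholder DOI (not a real one):

import requests

# Placeholder DOI for illustration; substitute the dataset's actual persistent identifier.
doi = "10.1234/example-dataset"

# The resolver redirects to wherever the dataset (or its landing page) currently lives,
# which is what makes the citation stable over time.
response = requests.get(f"https://doi.org/{doi}", allow_redirects=True, timeout=30)
print("resolved to:", response.url)
print("status:", response.status_code)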

described
Dataset must be ... described and tagged with formal content metadata
Matching FAIR principle: R1, F2, F3
Levels:
a. no metadata
b. text description (abstract) and keywords
c. basic metadata (e.g. Dublin Core)
d. specialized metadata (e.g. Darwin Core, ISO 19115, scientific data profile of schema.org)
e. rich metadata using (standard) RDF vocabularies (e.g. DCAT, ADMS, PROV, GeoDCAT, OMV, VoID)
Matching 5-star LOD facet: -
Matching NEII conformance axis: Discovery
Example ratings: CSIRO DAP data: d; netCDF data on THREDDS: d; ASRIS: d; ALA: d
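
As a sketch of what level e looks like in practice, here is a minimal DCAT/Dublin Core record built with rdflib; the dataset URI, title and keyword are invented for illustration:

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

# Declare the DCAT namespace explicitly so the sketch is self-contained.
DCAT = Namespace("http://www.w3.org/ns/dcat#")

g = Graph()
g.bind("dcat", DCAT)
g.bind("dcterms", DCTERMS)

# Hypothetical dataset identifier; ideally this is the persistent identifier from 'citeable'.
ds = URIRef("https://example.org/dataset/soil-moisture-2017")

g.add((ds, RDF.type, DCAT.Dataset))
g.add((ds, DCTERMS.title, Literal("Soil moisture grids 2017")))
g.add((ds, DCTERMS.description, Literal("Daily gridded soil moisture estimates.")))
g.add((ds, DCAT.keyword, Literal("soil moisture")))
g.add((ds, DCTERMS.license, URIRef("https://creativecommons.org/licenses/by/4.0/")))

print(g.serialize(format="turtle"))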

findable
Dataset must be ... indexed in a well known system (can be general purpose or community specific)
Matching FAIR principle: R1, F2, F3
Levels:
a. not indexed
b. indexed in a local, organizational catalogue
c. metadata harvested or pushed into a community (e.g. Research Data Australia, Re3Data) or jurisdictional catalogue
d. visible in general-purpose indexes (Google, Bing)
e. highly ranked in general-purpose indexes
Matching 5-star LOD facet: -
Matching NEII conformance axis: Discovery; Access - metadata
Example ratings: CSIRO DAP data: c; netCDF data on THREDDS: ?; ASRIS: c, d; ALA: c
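
Level c usually means the provider exposes a harvestable metadata feed that a community catalogue pulls from and indexes on the provider's behalf. A minimal sketch of such a harvest request, assuming a hypothetical OAI-PMH endpoint:

import requests

# Hypothetical OAI-PMH endpoint of an institutional repository.
endpoint = "https://data.example.org/oai"

params = {
    "verb": "ListRecords",       # standard OAI-PMH verb for bulk harvesting
    "metadataPrefix": "oai_dc",  # ask for records as simple Dublin Core
}

response = requests.get(endpoint, params=params, timeout=30)
print(response.text[:500])  # the harvester parses this XML and indexes each record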

loadable
Dataset must be ... represented using a common or community-endorsed (i.e. standard) format (pre-requisite: register of data formats)
N.B. page formats like .doc and .pdf, which are aimed at putting pixels on a page for human consumption, don't count! Data provided in these formats cannot be loaded by standard data-processing applications.
Matching FAIR principle: I1
Levels:
a. bespoke file format
b. one standard data-format, denoted by a MIME-type (CSV, JSON, XML, netCDF, etc.)
c. choice from multiple standard formats
Matching 5-star LOD facet: -
Matching NEII conformance axis: Access - geographies; Access - gridded data
Example ratings: CSIRO DAP data: -; netCDF data on THREDDS: b; ASRIS: b (c); ALA: c
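
The practical test for loadable is simply whether off-the-shelf tools can read the data without custom parsing. A minimal sketch, assuming placeholder file names for a CSV and a JSON delivery of the same data:

import json
import pandas as pd

# Levels b and c: standard, MIME-typed formats load directly with generic tooling.
table = pd.read_csv("observations.csv")   # text/csv
with open("observations.json") as fh:     # application/json
    records = json.load(fh)

print(table.head())
print(len(records))

# The same table delivered as a .pdf or .doc cannot be read back into a data structure
# without fragile scraping, which is why page formats don't count.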

useable
Dataset must be ... structured using a discoverable, community-endorsed (standard?) schema or data model (pre-requisite: register of data models, schemas, ontologies)
Matching FAIR principle: I2, R1.3
Levels:
a. implicit schema, not formalized
b. explicit schema, formalized in DDL, XSD, data-package, RDFS/OWL, JSON-Schema or similar
c. community schema, available from a (standard) location
Matching 5-star LOD facet: -
Matching NEII conformance axis: Information modeling
Example ratings: CSIRO DAP data: -; netCDF data on THREDDS: c; ASRIS: c; ALA: b
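
One thing an explicit schema (level b) gives you is machine validation. A minimal sketch using JSON Schema; the record structure and field names are invented for illustration, and at level c the same schema would simply be published at a stable community location:

from jsonschema import validate

# Hypothetical schema for a single observation record (level b: explicit and formalised).
schema = {
    "type": "object",
    "properties": {
        "site_id": {"type": "string"},
        "soil_moisture": {"type": "number"},
        "timestamp": {"type": "string"},
    },
    "required": ["site_id", "soil_moisture", "timestamp"],
}

record = {"site_id": "A42", "soil_moisture": 0.31, "timestamp": "2017-06-01T00:00:00Z"}

# Raises jsonschema.exceptions.ValidationError if the record does not conform.
validate(instance=record, schema=schema)
print("record conforms to schema")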

comprehensible
Dataset must be ... supported with unambiguous definitions for all internal elements (e.g. column definitions, units of measure), through links to accessible (standard) definitions (pre-requisite: register of vocabularies of units of measure, quantities, observable-properties)
Matching FAIR principle: I2
Levels:
a. local field labels
b. field labels linked to text explanations
c. standard labels (e.g. CF Conventions, UCUM units)
d. some field names linked to standard, externally managed vocabularies
e. all field names linked to standard, externally managed vocabularies
Matching 5-star LOD facet: -
Matching NEII conformance axis: Information modeling
Example ratings: CSIRO DAP data: -; netCDF data on THREDDS: c; ASRIS: d; ALA: d
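
The difference between level a and levels d/e is whether a field name is just a local label or a pointer to an externally managed definition. A minimal sketch of the latter; the column name and the definition URI are invented, while the CF standard name "air_temperature" and the UCUM unit code "Cel" are genuine entries in those vocabularies:

# Each local column carries a standard label (level c) and a link to an externally
# managed definition (levels d/e). The URI is a placeholder showing the shape of the link.
column_metadata = {
    "temp": {
        "standard_name": "air_temperature",  # CF Conventions standard name
        "unit_ucum": "Cel",                  # UCUM code for degree Celsius
        "definition": "https://vocab.example.org/def/air_temperature",
    },
}

for column, meta in column_metadata.items():
    print(f"{column}: {meta['standard_name']} [{meta['unit_ucum']}] -> {meta['definition']}")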

connected, linked
Dataset must be ... linked to other data using external identifiers (e.g. URIs), potentially crawlable
Matching FAIR principle: I3
Levels:
a. no links
b. in-bound links from a catalogue or landing page
c. out-bound links to related data
Matching 5-star LOD facet: -
Matching NEII conformance axis: -
Example ratings: CSIRO DAP data: -; netCDF data on THREDDS: a; ASRIS: c; ALA: c
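
Level c is about out-bound links that a crawler can follow from the dataset record to related resources. A minimal sketch with rdflib; all three URIs are invented for illustration:

from rdflib import Graph, URIRef
from rdflib.namespace import DCTERMS, RDFS

g = Graph()

# Hypothetical identifiers: the dataset itself and two related external resources.
ds = URIRef("https://example.org/dataset/soil-moisture-2017")
related_sites = URIRef("https://example.org/dataset/monitoring-sites")
method_paper = URIRef("https://doi.org/10.1234/example-method")

# Out-bound links expressed with widely used properties, so crawlers can follow them.
g.add((ds, DCTERMS.relation, related_sites))
g.add((ds, RDFS.seeAlso, method_paper))

print(g.serialize(format="turtle"))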

assessable
Dataset must be ... accompanied by, or linked to, a data-quality assessment and description of the origin and workflow that produced the data
Matching FAIR principle: R1.2
Levels:
a. No quality or lineage information
b. Lineage statement in text
c. Formal provenance trace (W3C PROV-O or similar)
Matching 5-star LOD facet: -
Matching NEII conformance axis: Re-use - operational
Example ratings: CSIRO DAP data: ?; netCDF data on THREDDS: ✔ (in metadata); ASRIS: sort of; ALA: where available
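
Level c replaces a free-text lineage statement with a formal trace that software can query. A minimal W3C PROV-O sketch using rdflib; the dataset, the raw input and the processing activity are all hypothetical:

from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import RDF

PROV = Namespace("http://www.w3.org/ns/prov#")

g = Graph()
g.bind("prov", PROV)

# Hypothetical resources: the published dataset, the raw input it was derived from,
# and the processing activity that produced it.
dataset = URIRef("https://example.org/dataset/soil-moisture-2017")
raw_obs = URIRef("https://example.org/dataset/raw-probe-readings-2017")
gridding = URIRef("https://example.org/activity/gridding-run-42")

g.add((dataset, RDF.type, PROV.Entity))
g.add((raw_obs, RDF.type, PROV.Entity))
g.add((gridding, RDF.type, PROV.Activity))

# The provenance relations themselves: what generated the dataset, and from what.
g.add((dataset, PROV.wasGeneratedBy, gridding))
g.add((dataset, PROV.wasDerivedFrom, raw_obs))
g.add((gridding, PROV.used, raw_obs))

print(g.serialize(format="turtle"))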

trusted
Dataset must be ... accompanied by, or linked to, information about how the data has been used, by whom, and how many times
Matching FAIR principle: (none)
Levels:
a. no information about usage
b. usage statistics available
c. clearly endorsed by reputable organization or framework
Matching 5-star LOD facet: -
Matching NEII conformance axis: -
Example ratings: CSIRO DAP data: -; netCDF data on THREDDS: -; ASRIS: -; ALA: ?

3 Comments

  1. I would suggest that assessing how "trusted" a dataset is exclusively by its use may not always be appropriate, as it biases the rating towards older datasets over newer ones. Possibly we could add "... endorsed by a trusted organization or the result of a trusted provenance chain ...".

    1. It wouldn't be the only way to assess, but it is a very effective pointer. 

  2. There are many ways to assess trust. Provenance can help with a few, like what its source datasets are or what code was used to make it, but there are others in addition to these, and to use, too: who published it, what publications it is linked to, and so on.

    Basically there are lots of ways, so our best bet is to include enough information for people to apply their own method.
    Forward provenance, if you can get it (how the data was used), is better than brute download stats.