CSIRO Data Access Portal News

Last modified by Cook, Sue (IM&T, Kensington WA) on Aug 01, 2017

The CSIRO Data Access Portal will now issue a DOI for restricted data and software collections.

A recent change to the DAP enables DOIs (Digital Object Identifiers) to be issued for data and software collections where the associated files cannot yet be made public, so long as the metadata about the collection can be made public.

In the past both the metadata and the files needed to be publicly available.

This change will enable a researcher to have a DOI when:

  • a researcher wants to publish the data or software in association with a journal article, and needs a DOI to include the citation for published data/software in their reference list, but does not wish to make the files available before the article is published.
  • ethical considerations mean that the data can only be available to selected users.
  • the data has other sensitivity concerns.

Embargos can be automatically applied to make the data public at a later date without changing the DOI.

Use Ask a Librarian or Research Data Support for assistance using the CSIRO Data Access Portal.

The CSIRO DAP has recently been added to the list of repositories recommended by PLOS. We were able to demonstrate that the DAP adheres "to best practices pertaining to responsible data sharing, sustainable digital preservation, proper citation, and openness ..."

To comply with the PLOS data policy, authors must select the CC-BY licence when creating their DAP record and depositing their data.

The CSIRO Data Access Portal (DAP) is an institutional data repository supporting the publication of data and software for its organisation, CSIRO, Australia’s national science agency. The aim of the CSIRO Data Access Portal is to provide reliable, long-term access to managed digital resources for CSIRO.

It complies with the Force 11 Data Citation Principles.

An attribution statement and a DOI are assigned to all data deposited and made publicly available in the CSIRO Data Access Portal. DOIs are minted via the Australian National Data Service (ANDS) and DataCite. The CSIRO DAP complies with ANDS service policies.

Data is stored persistently within the repository with full metadata including unique identifiers and a licence. It is mirrored in at least two separate datacentres by default. Data and metadata are versioned.

See our entry on re3data.

A new version of the DAP (V2.17.1151) was released on 16 March 2017.

Highlights of this release for users are:

  • A clearer option for downloading all files in a collection on the files tab.
  • Downloads now include the collection metadata and licence information.
  • Improvements for viewing and selecting files.
  • Depositors uploading via SFTP will notice that upload speeds have improved at least fourfold.
  • Depositors now get a warning if publishing a collection with no files uploaded.
  • Improvements to the DAP API allow authenticated users to create collections.

For further information see Release History.

This release was part of the DMCEP project

For more information about the program including details on the program's outcomes and deliverables please visit the DMCEP Wiki.

The DMCEP program is highly researcher driven in deciding which new capabilities are developed and when. Feedback and suggestions for improvement are welcome.

Please contact the RDS team via researchdatasupport@csiro.au :

  • If you would like a demonstration or further information about any of these changes
  • If you have a particular use case that you think would be suitable for extending our data management capabilities.
  • Or if you have any other comment or feedback to make.

A new version of the DAP (v2.16.1063) was released on 15 December 2016. Some highlights of this release are:

    • Users are returned to the same page after successfully logging in to the DAP.
    • Improved refine and filter options for search results.
    • Additional RESTful Web Services for creating and updating Collections.
    • File search and paginated file loading enabled on the Provide Your Data page.
    • Map view provided on the Describe Your Data page.
    • 'Rejected' option removed from the Approver Decision options.
    • New Non-CSIRO authors are validated on the Create Your Citation page to reduce duplication.
    • General infrastructure improvements and bug resolutions, including further stages of the implementation of MongoDB and Elasticsearch.

For further information see Release History.

This release was part of the DMCEP project

For more information about the program including details on the program's outcomes and deliverables please visit the DMCEP Wiki.

The DMCEP program is highly researcher driven in deciding which new capabilities are developed and when. Feedback and suggestions for improvement are welcome.

Please contact the RDS team via researchdatasupport@csiro.au :

  • If you would like a demonstration or further information about any of these changes
  • If you have a particular use case that you think would be suitable for extending our data management capabilities.
  • Or if you have any other comment or feedback to make.

Anusuriya Devaraju is conducting a short survey (https://www.surveymonkey.com/r/NP7MPSV) to identify the important metadata elements when searching datasets on the CSIRO Data Access Portal (https://data.csiro.au/). The survey results will be used to support her research on developing a recommender system of research datasets.

  • The survey includes 2 questions, and should only take about 5 minutes of your time.
  • Participation in this survey is completely voluntary.
  • Your responses will remain strictly confidential.

Thank you in advance for taking the time to complete this survey. Please forward this post to potential users of the DAP.

New CSIRO Data Access Portal version 2.13 released

A new version of the DAP (v2.13.754) was released on 29 April 2016. Highlights of this release:

  • Deposit now includes a Collection Type field, with options Data or Software; the selected type is included in the published Attribution Statement
  • Where a collection contains more than 25 files, the file list is initially displayed collapsed and a Search for Files option allows filtering/finding within the list
  • Web services API enhancements:
    • Can request a specific version of a collection
    • Retrieving metadata for a collection includes more fields in response
    • New endpoint /collections/{id}/versions to list available versions of a collection
  • In Deposit Data Collection 'Citation' tab, contributors can be reordered by dragging and dropping
  • Department of Education and Training (Australia) added as a funding source
  • Bug fixes include:
    • Special/extended characters now encoded correctly in web services API XML responses
    • Team and Business Unit fields now saved as part of draft collections
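
The new versions endpoint listed above can be addressed by substituting a collection identifier into the path. The following is a minimal sketch only: the base URL, the function name versions_url and the example identifier are hypothetical placeholders, not the DAP's actual values, and no request is sent. Only the path /collections/{id}/versions comes from the release notes.

```python
from urllib.parse import quote

# Hypothetical base URL for the DAP web services (an assumption for this
# sketch); substitute the address given in the official documentation.
BASE_URL = "https://example.org/dap/ws"


def versions_url(collection_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL that lists the available versions of a collection."""
    # Percent-encode the identifier so unusual characters cannot alter the path.
    safe_id = quote(str(collection_id), safe="")
    return f"{base_url.rstrip('/')}/collections/{safe_id}/versions"


if __name__ == "__main__":
    # "12345" is an illustrative identifier, not a real collection.
    print(versions_url("12345"))
```

The resulting URL could then be passed to any HTTP client. The response format is not described in this post, so parsing is left out.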

Data Access Portal - scheduled outage 28 April 2016

The CSIRO Data Access Portal and the CSIRO ASKAP Science Data Archive (CASDA) will be unavailable Thursday 28 April 2016 10:30-17:30 AEST to release new versions of these services.

IMT formed the Data Management Capability Enhancement Program (DMCEP) to further develop the DAP and related data management services across CSIRO and build an integrated data management ecosystem servicing the needs of CSIRO and the Australian Innovation System.

To date the DMCEP has delivered 6 development sprints and 4 production releases including approximately 100 enhancements (new functionality and bug fixes) to the DAP.

These enhancements have included:

Changes for depositors:

    • Removed the need for further approval for metadata edits to records already published. This means that any edits that depositors have been wanting to make to records, for example to correct spatial coordinates or typos, can now be done without an additional approval step. Any changes to the data will still require re-approval.
    • Changes to the attribution statement to make it easier to reuse
      • all contributors are now listed
      • contributor name order has been changed to last name, first name
      • the DOI is a full link
    • CSIRO Groups can be used as Contacts
    • Corporate Names can be included in the Contributors list
    • Related materials 
      • can now include a relevant Attribution Statement
      • now have a type: website, publication, collection
      • drag and drop to reorder
    • New section added for tracking Funding Sources
    • Added search for Field of Research (FOR) codes
    • Co-ordinates entry section in Location details now a spatial representation
    • Start and end dates can now be a year, or a month and a year
    • Files display in order on the data tab
    • Image metadata is no longer mandatory
    • The Collaborating Organisations element and Rights Statement are co-located and it can now be seen how they relate
      • Collaborating Organisations can be reordered using drag and drop
    • Project Leader is now auto-populated
    • Can add documentation for approvers' eyes only
    • A record submitted to an approver can be changed to another approver by the depositor
    • Shareable persistent link is now on the description tab and the data tab

Changes for Approvers:

    • Reminder emails are now sent monthly
    • Removed mandatory Data Deposit Checklist question
    • A record submitted to an approver can be changed to another approver by the depositor
    • Approval is not required for metadata-only updates
    • Can receive documentation from depositors

Changes for Data Users:

    • Improved interface for requesting large collections
    • Shareable persistent link now on the description tab and the data tab

Large collection changes:

    • Increased the default large collection mount time from 48 hours to 1-2 weeks
    • Large collections can now be mounted for extended periods on request to RDS

Web services enhancements:

Other:

    • Improvements to the feed to ANDS Research Data Australia
    • Better support for notifications about outages and similar events, for example a banner in the header of the DAP

CASDA:

CASDA version 1.1 has now been released.

The enhancements in version 1.1 over version 1.0 (released Nov 2015) are:

  • Scripted access to large data files via Virtual Observatory (VO) protocols, including authenticated access
  • 3-d image cube cut-outs including spatial, spectral and polarisation filtering
  • Example script for producing bulk image cut-outs based on a catalogue
  • Team member access to unreleased data products via VO protocols
  • Support for direct transfers within Pawsey Supercomputing Centre for users with Pawsey accounts
  • Administration of project roles (e.g. allocation of validation rights for a project)

More to come ...

We have also been working on analyses and precursor work for:

  • more improvements to large data collection capabilities
  • enabling linking to data hosted externally
  • support for file/object level metadata (investigating implementation of MongoDB)
  • development of a Data Management Plan online tool
  • better indexing to support improved discovery, and discovery via web services (investigating moving from Solr to Elasticsearch)
  • better reliability and stability monitoring
  • more enhancements to web services
  • better statistical reports for depositors
  • Provenance capabilities.
  • Semantic capabilities.

More information and getting involved

For more information about the program including details on the program's outcomes and deliverables please visit the DMCEP Wiki.

The DMCEP program is highly researcher driven in deciding which new capabilities are developed and when. Feedback and suggestions for improvement are welcome.

Please contact the RDS team via researchdatasupport@csiro.au :

  • If you would like a demonstration or further information about any of these changes
  • If you have a particular use case that you think would be suitable for extending our data management capabilities.
  • Or if you have any other comment or feedback to make.

CSIRO Astronomy and Space Science has just passed a major milestone in data archiving. The total number of Parkes pulsar projects published in the CSIRO Data Access Portal has just passed the 100 mark - there are now 104 projects with ~405,000 files spanning 23 years (1991 to 2014). Those ~405,000 files represent ~190TB of data.

There is a collection for each semester of a project so each project has more than one associated data collection. There are currently 569 Parkes Pulsar collections publicly accessible.

The CSIRO Data Access Portal will be unavailable Wednesday 27 May 2015 10:00-20:00 AEST to release a new version, v2.9, of the Data Access Portal and Dataset Metadata Collector (an auxiliary application of the DAP).

The new version of the DAP v2.9, released on 23 April 2015, includes the following new features. These new features will support data depositors from the CSIRO marine community and support the deposit of data collected on the voyages of the Marine National Facility vessel, the RV Investigator.

  • Link metadata to pre-loaded data:   Marine End of Voyage (EoV) data will be loaded into pre-defined storage following a voyage end.  The data depositor is then able to create a metadata record in DAP and link it to the pre-loaded end of voyage data.  The XML file name containing EOV is used to link the DAP record to the pre-loaded data.
  • Tagging records: The previous release allowed new records to be tagged for harvesting by external organisations.  A bulk process has now tagged pre-existing records that contain public metadata for Research Data Australia and TERN Soils
    • Tag added for MARLIN so that Marine Community Profile (MCP) records can be harvested.
  • Marine Community Profile (MCP): Data depositor is able to use the self-service deposit to create a Collection which uses the Marine Community Profile schema.
    • Depositor is able to select Marine Community Profile of ISO19115 (MCP) as an option in the “More about this Collection” section of the deposit screen enabling them to construct a valid MCP metadata record.
    • MCP metadata available via web services for individual collections.
  • Upload XML Metadata:  Extension of current ANZLIC XML upload functionality to allow a depositor to upload MCP compliant XML files to create a new draft collection. Successfully uploading a MCP XML file creates a new draft record with relevant fields populated from the XML file, including MCP specific fields in the "More about this Collection" section.
  • My defaults: Users are able to use the National Facility default value as defined in their My Defaults section while uploading XML metadata to pre-populate the Collection fields.
  • Collection file number limitations:  Support is now available for Collections with up to 100,000 files.  This is an increase from 30,000 files.

See Release History for a complete list of all DAP releases.

The CSIRO Data Access Portal will be unavailable Thursday 23 April 2015 09:00-17:00 AEST to release a new version of the Data Access Portal, v2.9, with enhancements and bug fixes.

In response to demand from researchers and with a growing global trend to make science software citable, CSIRO's Data Access Portal (DAP) now enables users to find software and code published by CSIRO researchers.

Software released publicly via the DAP is assigned a Digital Object Identifier (DOI), improving the citability of CSIRO’s software. DOIs are issued via the Australian National Data Service (ANDS) Cite My Data service through the DataCite consortium.

The CSIRO DAP already has features that support data publication and citation; these features now also support software publication and citation:

  • collection record versioning
  • access control
  • attribution statements
  • built-in approval workflows
  • enabling related links to associated publications, other software and data
  • persistent links and DOIs.

Recent changes to the DAP ensure that publicly released software:

  • is associated with pre-approved software or code licences
  • can show dependencies such as programming language and operating system in the metadata
  • is categorised as software, and can be filtered in search results.

A key feature of the development has been the CSIRO-wide collaboration between IT, Library Services and Legal to develop the template software and code licences, and a licence selection workflow. Also integral to the project were the researchers from the Workspace software development team, who contributed the researcher perspective and published their software in the DAP.

Alex Whan, a researcher from CSIRO Agriculture, used the DAP to publish his GrainScan software and says "Having a means to publish software with appropriate metadata and a DOI means it is straightforward for other researchers to access and cite our work. It also ensures specific versions of software can be stored and referenced in a maintainable way. With the DAP fitting into standard CSIRO approval processes, getting the software published can be straightforward and transparent".

For more information contact researchdatasupport@csiro.au.
