If you have any comments, questions or feedback about these options, please contact researchdatasupport@csiro.au.


These are all the places where you might consider storing your research data. Not all of them are suitable or recommended. Scroll across to see the full table.

(tick) = Yes, (error) = No, (question) = Maybe.

 

Storage location | Recommended for working Research data | Recommended for long term archiving | Size limits | Access limits | Data Discoverable? | Available computing facilities | Linked to HPC? | Data protection | Old versions available | Further information and notes
 Local hard drive

(error)

(error) | Hardware dependent | Only available to users with access to the PC | Only via folder names and structures, and file names. | Desktop or laptop | (error) | weak | Only if backups managed personally

Data could be susceptible to loss without regular backups.

See Rule 9 in Ten Simple Rules for Digital Data Storage

External local drive or storage media (USB, Hard drive, CD, etc)

(error)


(error) | Hardware dependent | Only available to those who can access the drive | Only via folder names and structures, and file names. | (error) | weak | Only if backups managed personally

Data susceptible to loss.

See Rule 9 in Ten Simple Rules for Digital Data Storage

Physical project server

(error)

(error) | Hardware dependent | Only available to those who can access the server | (error) | weak / moderate | Only if backups managed personally

Could not be connected to IM&T networks.

See Rule 9 in Ten Simple Rules for Digital Data Storage

CSIRO IMT Services

Home drives

(question)

For single person projects

(error) | May be subject to quotas | May be available to CSIRO individuals on the owner's request

Only via folder names and structures, and file names.

Not discoverable to others in the organisation.

(error) | strong

(tick)


 See Storing and Managing Data
Project Shared drives | (tick) | (error) | No | Internal access only

Only via folder names and structures, and file names.

Not discoverable to researchers outside the access group.

Possible for Windows HPC | strong

(tick)



See Storing and Managing Data
Bowen Research Cloud (managed project storage) | (tick) | (error) | No | Internal access only | With appropriate software layer | Ruby, Pearcey and Bracewell. Only data in Canberra can be used directly. | (tick) | strong | (tick) (mostly) | Bowen Research Cloud
IMT Enterprise Virtual servers (VM) | (tick) | (question) | No | Depends on the service running on the VM | Depends on the service running on the VM
moderate / strong

On request

Will likely need own backup protocols implemented

IMT Hosting

Used for running services.
Not recommended for storing data in the long term.

SharePoint

(question)

File size limitations

(error)

Users can request 0.5GB, 1GB, 2GB or 5GB.

Not suitable for files greater than 250MB.

File version control.

Can be used to collaborate with external partners | Searchable depending on access | Would need to move data in and out of SharePoint | Not directly | strong | (tick) | SharePoint
Confluence

(question)

File size limitations

(error)


Attachment size limit 100MB.

No size limit per space.

File version control.

Can be used to collaborate with external partners

Searchable depending on access

Would need to move data in and out of Confluence | (error) | strong | (tick) | Confluence
Data Access Portal

(error)

Not recommended for storing working data, but could be used to create a metadata record that points to your working data store to make it discoverable.

(tick)

Collection size limit 1TB.

No limits per project or per user.

Data can be shared internally, publicly or with specified users (including external) | Via collection level metadata | Can be done | strong | (tick)

Data Access Portal Users Guide

Data Access Portal (DAP) and research data support

Scientific Computing data store | (tick)

(question)

Longest-serving large data store in CSIRO, since 1991.

No size limits.

Default limit of 150,000 files.

See: quota limits

All registered SC users

Primarily via POSIX filesystem, with folder name and structures, and file names.

SAMBA (Windows).

Web access available.

Ruby (direct-access), Pearcey and Bracewell

(tick) | strong

(tick)  

File Systems on the Datastore

CSIRO SC Data Store - Ruby

SC filesystem conventions

ePublish | (error) | (error) | File size limit 100MB | Not accessible except to corresponding author, approvers, ePublish reporters, and publication officers | Only searchable by associated manuscript metadata and by publication officers | (error) | (error) | strong | (tick)

Should not be used for data.

What goes into ePublish?

Scientific Computing scratch (/flush*) file systems

(tick)

Large capacity and high capability - but copy important data to a safer place as soon as practical.

(error) | Large quotas on inodes and space | All registered SC users | Access via POSIX filesystem, with folder name and structures, and file names. | Ruby, Pearcey and Bracewell | (tick)

moderate

Hardware protection only

(error) | Areas are subject to flushing, which removes old files.

Partner Services

NCI

Raijin

(tick) 

(tick)

Only in some areas

Depends on data type and filesystem


Can be public or only available to NCI users | For some areas

Raijin

Tenjin cloud

(tick)

moderate / strong

Varying with area 

NCI backups depend on data type.

See NCI Filesystems User Guide

NCI National Facilities

http://nci.org.au/services/data-management-storage/

http://nci.org.au/systems-services/data-storage/

https://opus.nci.org.au/display/Help/Filesystems+User+Guide

AARNet CloudStor

(tick)


(question)

Your files will be accessible for as long as your institution login is valid

Two services: FileSender and Storage.

FileSender used to send large files (encrypted).

Default storage allocation 1TB. Can sync to local disk.


Users access CloudStor using their institutional account, through the Australian Access Federation (AAF).

  (error)

moderate / strong

The AARNet CloudStor service does not come with any security documentation or rating. However, the login process is conducted using the AAF and the data transfer is encrypted using Secure Sockets Layer (SSL).

(error) | AARNet CloudStor

Other External Services

External Consumer (Personal) cloud storage e.g. Dropbox

(question)

Considerations and risks of using public cloud services

(error) | Depends on service and plan. May be a cost. For example, Dropbox has 2GB for free and 1TB for AU$13 per month. | Depends on service

moderate / strong with caveats

May not be to CSIRO requirements

See CSIRO Information Security Procedure

Often | Public Cloud (IMT)
External Cloud infrastructure and platform services e.g. AWS

(question)

Considerations and risks of using public cloud services

(question)

Considerations and risks of using public cloud services

moderate / strong with caveats | Maybe
External disciplinary and other certified trusted repositories | (error) | (tick) with recorded approval from Rank 4 delegate | Depends on repository | Usually public | Yes | NA | Not to CSIRO facilities. May have linkages to cloud facilities.

moderate / strong with caveats

May not be to CSIRO requirements

Depends on repository | See The PLOS One list of recommended repositories for examples.

Column Definitions

Size limits

May come from policy/quota restrictions and/or physical or allocation size restrictions. Limits can often be raised if you have a justified need, though there may be associated costs. File/object number limits may also apply.
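As a rough illustration of checking a working folder against limits like these before choosing a store, the sketch below walks a directory tree and reports the file count, the total size, and any files over a per-file limit. The function name `summarise_tree`, the example path and the way the limits are passed in are assumptions made for illustration only. The figures used (a 100MB attachment limit and a default limit of 150,000 files) are the ones quoted in the table above, and treating 1MB as 1024 × 1024 bytes is also an assumption. Check the current limits with the service owner before relying on them.

```python
import os
import tempfile

MB = 1024 * 1024  # assumption: binary megabytes; a service may count 10**6 bytes instead


def summarise_tree(root, max_file_bytes, max_file_count):
    """Walk `root` and compare it with a per-file size limit and a file-count limit.

    Hypothetical helper for illustration; not part of any CSIRO tool.
    """
    file_count = 0
    total_bytes = 0
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symbolic links so targets are not counted twice
            size = os.path.getsize(path)
            file_count += 1
            total_bytes += size
            if size > max_file_bytes:
                oversized.append(path)
    return {
        "file_count": file_count,
        "total_bytes": total_bytes,
        "oversized": sorted(oversized),
        "over_count_limit": file_count > max_file_count,
    }


if __name__ == "__main__":
    # Demonstration on a throwaway folder; point `root` at your own data instead.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "notes.txt"), "w") as handle:
            handle.write("example")
        report = summarise_tree(root, max_file_bytes=100 * MB, max_file_count=150_000)
        print(report)
```

A report with a non-empty `oversized` list, or with `over_count_limit` set, suggests that the folder needs a different store, a quota increase, or repackaging (for example, bundling many small files into an archive).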

Access limits

Firewall and permission/control considerations.

Data Discoverable?

A rough hierarchy of data discoverability might be:

  • on a private storage space with undocumented structure and little or no embedded metadata - hard to discover
  • through to...
  • on managed and accessible storage in a defined structure, with embedded standards-conforming metadata that is indexed so that it is searchable

Data accessible through services will tend to be more discoverable.

Linked to HPC?

Can the data be used in place in a High Performance Computing workflow?

Data Protection

Are systems in place that guard against single points of failure?

Access control is assumed, but is usually weakest for portable devices.

Old versions available

Is old content preserved if it is replaced by new content, offering some protection from mistakes?

Is a history of changes kept?

Recommended for working Research data

Working research data can potentially be recreated from source data by repeating a workflow.

Recommended for long term archiving
e.g. reference or published data

Data that would be prohibitively expensive or impossible to reproduce. Take into account:

  • the availability of persistent links, especially persistent identifiers
  • preservation level storage
  • discoverability in the long term, i.e. the use of metadata and search.

For this category, the DAP is the only recommended option in CSIRO.

3 Comments

  1. Under 'Scientific Computing data store' in the final column 'Further information and notes', we could add.

    Information about the latest backup of the data store and other user systems can be found at 

    https://hpc.csiro.au/users/backupstatus/  - link text 'User Filesystem Backup Status'.

    Thanks

    Rob.


  2. Add in, which storage option can be locked down, by locked down do you mean restricted use Cook, Sue (IM&T, Kensington WA)?