Research Data Management: Linking Datasets to Research Papers

Data access statement

Data access statements, also known as data availability statements, are used in publications to describe where data directly supporting the publication can be found and under what conditions they can be accessed.

Data access statements are required for all publications arising from publicly funded research. They are required by the RCUK Policy on Open Access (section 3.3 (ii)) and by many funders' data policies.

The aim of the data access statement is discoverability: the data referenced by the statement do not have to be openly available. The statement can be placed anywhere in your research article, with the Acknowledgements section being the most popular choice. There are many reasons why access to data may need to be restricted; if you are unsure whether you should publish your data openly, please ask for advice first.

What to include in the data access statement:

  • a persistent identifier for the dataset, e.g. a DOI. Library Services can generate a DOI for you once you deposit your dataset in Aston Data Explorer. Before depositing your data elsewhere, check that the repository can provide you with a DOI.
  • any justifiable legal or ethical reasons why your data cannot be made available
  • if the data themselves are not openly available, a pointer to a permanent record that describes any access constraints or conditions that must be satisfied for access to be granted
  • if you did not collect the research data yourself but instead used existing data obtained from another source, a credit to that source

A simple direction to interested parties to contact the author would not normally be considered sufficient to comply with funder policies.

Examples of data access statements

There is no set format for a data access statement. Below are some examples that you can adapt:

“To access the research data supporting this publication, see [DOI].”

“Access to the research data supporting this publication is restricted; see [DOI] for more information.”

“Supporting data will be available from [DOI] after a 6 month embargo, to allow for commercialisation of research findings.”

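The example statements above follow a common pattern, so they can be assembled from a few fields. The following is a minimal sketch in Python, not an official tool: the function name, its parameters and the example DOI are hypothetical and chosen only for illustration. It assumes the persistent identifier is a DOI, which resolves through https://doi.org/.

```python
from typing import Optional


def data_access_statement(
    doi: str,
    restricted: bool = False,
    embargo_months: Optional[int] = None,
    reason: Optional[str] = None,
) -> str:
    """Build a data access statement modelled on the examples above.

    All parameter names are hypothetical; adapt the wording to your
    funder's and publisher's requirements.
    """
    # Accept either a bare DOI or a full URL; bare DOIs get the doi.org prefix.
    link = doi if doi.startswith("http") else f"https://doi.org/{doi}"

    if embargo_months is not None:
        statement = (
            f"Supporting data will be available from {link} "
            f"after a {embargo_months} month embargo"
        )
        if reason:
            statement += f", {reason}"
        return statement + "."
    if restricted:
        return (
            "Access to the research data supporting this publication "
            f"is restricted; see {link} for more information."
        )
    return f"To access the research data supporting this publication, see {link}."


# "10.1234/example" is a made-up DOI used purely as a placeholder.
print(data_access_statement("10.1234/example"))
print(data_access_statement("10.1234/example", restricted=True))
```

Note that the restricted and embargoed variants still point to a persistent record, in line with the guidance that a direction to contact the author is not normally sufficient.
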
Data citation

In 2012, DataCite and the International Association of Scientific, Technical and Medical Publishers (STM) jointly released a statement on the linking and citing of research data, summarised below:

  • Store validated research data in secure, reputable data archives
  • Allow publications to be linked with datasets by issuing persistent identifiers such as DOIs
  • Publishers should increase the visibility of these links to ensure maximum access
  • Both parties encourage the reuse of data and the citing of datasets

Data created over the lifecycle of a research project underpin its findings. A published paper is crucial evidence of the output of research, so the data underlying the publication should also be available wherever possible. It is therefore important that datasets can be cited, so that they receive recognition in the same way as the paper and can be found by readers.

DataCite provides a detailed Metadata Schema for the Publication and Citation of Research Data. Six mandatory properties are needed to cite data accurately for identification and retrieval purposes:

  • Identifier - a unique string that acts as a persistent identifier, e.g. a DOI
  • Creator - the main researchers involved in creating the dataset
  • Title - a name or title by which a resource is known
  • Publisher - the name of the entity that holds the resource
  • Publication Year - the year when the data were or will be made public
  • Resource Type - the general type of the resource, e.g. Dataset

Note: If you enter this information in full when depositing in Aston Data Explorer, you will already be meeting the requirements set by EPSRC.
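
As an illustration of how the six properties above fit together, the sketch below checks that none is blank and then formats a citation. It is an assumption-laden example rather than a DataCite or Aston tool: the class and function names are hypothetical, the identifier is assumed to be a bare DOI, the record values are made up, and the ordering (Creator (Year). Title. Publisher. Resource Type. Identifier) is one common layout that your publisher's style may override.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DatasetRecord:
    """The six properties listed above (field names are illustrative)."""
    identifier: str         # assumed here to be a bare DOI
    creator: str            # main researchers involved
    title: str              # name by which the resource is known
    publisher: str          # entity that holds the resource
    publication_year: str   # year the data were or will be made public
    resource_type: str      # e.g. "Dataset"


def missing_properties(record: DatasetRecord) -> List[str]:
    """Return the names of any properties that were left blank."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


def format_citation(record: DatasetRecord) -> str:
    """Format a citation, refusing to proceed if a property is missing."""
    missing = missing_properties(record)
    if missing:
        raise ValueError("Missing properties: " + ", ".join(missing))
    return (
        f"{record.creator} ({record.publication_year}). {record.title}. "
        f"{record.publisher}. {record.resource_type}. "
        f"https://doi.org/{record.identifier}"
    )


# Every value below is invented for the purpose of the example.
example = DatasetRecord(
    identifier="10.1234/example",
    creator="Smith, J.",
    title="Example survey data",
    publisher="Example Repository",
    publication_year="2016",
    resource_type="Dataset",
)
print(format_citation(example))
```

Checking for blanks before formatting mirrors the point that all six properties are needed for a dataset to be identified and retrieved reliably.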

The Digital Curation Centre (DCC) has identified some key factors to consider when citing data:

  • Research data to be used in an academic publication must be deposited with a suitable data archive or repository. A persistent identifier or URL should be assigned to the data.
  • When citing a dataset, always use the citation style required by the editor/publisher. Otherwise use a standard data citation style (e.g. one based on the DataCite Metadata Schema above).
  • Include data citations alongside those for textual publications. Some reference management packages now provide support for datasets.
  • Cite datasets at the finest-grained level available that meets your need.
  • If a dataset exists in several versions, be sure to cite the exact version you used.
  • When you publish a paper that cites a dataset, notify the repository that holds the dataset.

    Further details on these can be found in the DCC guide How to Cite Datasets and Link to Publications.