Archive in Movebank

Movebank offers tools for working with bio-logging data from the start of data collection until publication and beyond. As part of our data policy, we strongly encourage data owners to eventually make their data public, or else assign a long-term contact person to address future sharing requests. When it comes time to share a study with others or prepare it for archiving, it is important that the study contains enough information for others to understand the data and use it appropriately. The following guidelines describe how to make sure your study is correctly and thoroughly organized, and cover common archiving strategies. For questions or assistance with archiving data through Movebank, contact support@movebank.org.

Data owners can choose whether and with whom to share their data stored in Movebank. Those who use others' data from Movebank must agree to the general Movebank terms of use and user agreement. Movebank's Permissions give the technical options for sharing with specific users or the public. If you create studies that you do not intend to share or archive beyond the end of your project, we recommend deleting your study when you are done. To use Movebank for temporary purposes without intention to archive, please set the Study Type to "Test".

Below we offer tips for the following scenarios:

Share my data during collection and analysis
Make my study public and prepared for re-use
Formally archive my study with a data publication and DOI
Make my private study discoverable and prepared for re-use

We also provide guidance for common questions about what to archive.

General best practice tips

The user manual provides detailed instructions to organize your data in studies. The following additional steps are recommended to ensure that your studies can be discovered and understood when shared with others, whether publicly or privately.

Make your study name and summary public. Studies intended for long-term storage on Movebank should have the Study Type set as Research with the study name and summary visible to the public. This step allows others to view the written summary in your Study Details and contact you for more information.
Maintain access to your study. Add at least one co-owner or colleague as a Data Manager for your study who could take over as the study contact if needed. Be sure to keep the email address associated with your Movebank account up to date and transfer ownership of studies as needed.
Complete your Study Details. Including relevant details and keywords will help people discover your work. Provide an informative title and study summary, and list people and organizations leading the work. We recommend adding citations for published work describing the study using CSE style. Please provide a description in English, and we welcome information in additional languages. Consider that readers might include scientists and non-scientists.
Check your imported data. Complete quality control to make sure your data and reference data are imported and organized correctly.
Provide sufficient reference data. While Movebank has few mandatory requirements for reference data, please add as much information as you’re able to: missing details, such as the distinction between adults and juveniles, can restrict or misinform future analyses using the data. Of course, what is “sufficient” will depend on what is available, for example what information was collected in the field, the equipment and methods used, and the original purpose of the study. Keep in mind that new analyses might make valuable use of information you collected that wasn't important to the original study. Download the reference data from the study by selecting Download > Download reference data and make sure it looks complete and correct. See these instructions to make updates.

The following is a minimum suggested set of reference data attributes:
animal ID: always include
animal taxon: always include
tag ID: always include
deployment ID: always include
deploy-on date and deploy-off date: include if known, required if pre- or post-deployment locations are present
animal life stage: include if known
animal sex: include if known
attachment type: include if known
deploy-on latitude and deploy-on longitude: recommended especially when using methods that result in no or low-accuracy location estimates
deployment end type: include if known
manipulation type: always include, to clarify whether data represent "natural" or experimentally manipulated animal movements
study site: recommended especially when there are multiple groups of animals or deployments (e.g. deployment locations)
tag manufacturer name: include if known
tag model: include if known
tag readout method: include if known

Many other attributes are available. See these templates for specific data types, and a complete list of terms and definitions in the Movebank Attribute Dictionary.
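
As a rough illustration, the check on downloaded reference data could be partly automated. The sketch below is not a Movebank tool. It assumes the reference data have been saved as CSV text and that column headers resemble the attribute names listed above (for example "animal-id" or "animal ID"); the function names and the exact header spellings are assumptions, so compare them with your own download before relying on the result.

```python
import csv
import io

# Attributes marked "always include" in the list above.
# The hyphenated spellings are an assumption about the CSV headers.
ALWAYS_INCLUDE = ["animal-id", "animal-taxon", "tag-id", "deployment-id", "manipulation-type"]


def normalize(name):
    """Lowercase a header and join its words with hyphens."""
    return "-".join(name.strip().lower().replace("_", " ").replace("-", " ").split())


def check_reference_data(csv_text, required=ALWAYS_INCLUDE):
    """Return (missing_columns, blank_values) for reference data given as CSV text.

    missing_columns: required attributes with no matching column.
    blank_values: (row_number, attribute) pairs where a required value is empty.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {normalize(h): h for h in (reader.fieldnames or [])}
    missing = [attr for attr in required if attr not in columns]
    blanks = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for attr in required:
            header = columns.get(attr)
            if header is not None and not (row[header] or "").strip():
                blanks.append((row_number, attr))
    return missing, blanks


# Hypothetical example data, for illustration only.
sample = (
    "animal-id,animal-taxon,tag-id,deployment-id\n"
    "A1,Ciconia ciconia,T1,D1\n"
    "A2,,T2,D2\n"
)
missing, blanks = check_reference_data(sample)
print(missing)  # ['manipulation-type']
print(blanks)   # [(3, 'animal-taxon')]
```

A script like this only finds empty or absent fields. It cannot tell whether the values are correct, so a manual review of the download is still needed.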

Want help? For help adding or checking your data, send an email to support@movebank.org.

Share ongoing studies

If you are collecting data through live feeds or over multiple field seasons, you may want to share it with the public or with specific people while data collection and analysis are still ongoing. Use your study's Permissions settings to add select users as Collaborators, or use a rolling embargo to share only older data with the public.
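
To make the idea of a rolling embargo concrete, here is a minimal sketch of the underlying logic: records older than a moving cutoff are treated as public, and newer ones stay restricted. It is only an illustration, not how Movebank implements embargoes. The function name, the record layout and the embargo length are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def split_by_embargo(records, embargo_days, now=None):
    """Split records into (public, embargoed) using a rolling cutoff.

    records: iterable of dicts with a timezone-aware "timestamp" datetime
             (a hypothetical layout chosen for this sketch).
    embargo_days: records newer than this many days stay embargoed.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=embargo_days)
    public, embargoed = [], []
    for record in records:
        # The cutoff moves forward with "now", so data are released gradually.
        (public if record["timestamp"] <= cutoff else embargoed).append(record)
    return public, embargoed
```

Because the cutoff is computed from the current date each time, no manual step is needed to release older data as the study continues.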

Review the general tips above. As new data come in, you’ll need to periodically review the study. Consider setting up calendar reminders to do this, or add it to your post-fieldwork protocol. Once you’re done with the study, be sure to see the options below for long-term archiving and publication to make sure your work isn’t lost. It is also possible to publish the current version of ongoing studies in the Movebank Data Repository as described below.

One vs multiple studies

For projects involving multiple species, funders, field seasons, or deployment sites, you will need to decide if and how to divide your data across studies in Movebank. Movebank's data model treats studies independently, and there is no automated way to split or merge studies. Things to consider:

How will you share the data? Data sharing is done at the study level, so if you will be sharing different subsets of data with different users, it is easiest to create a separate study for each user group you are sharing with.
How much data do you have? Users are successfully managing Movebank studies with over a thousand animals and hundreds of millions of data records. However, if you are working with hundreds of deployments or many millions of data records, multiple studies might be easier to manage. Conversely, making one study for each individual animal is an inefficient use of your time and makes it more difficult to view and compare the tracks and to apply filters, annotations, and analyses consistently.
How much technical expertise and support do you have? Movebank's REST API allows advanced users to monitor and collect data in Movebank and send it between databases or applications. If you will not depend on Movebank to view or share data, this can enable large data volumes or more complicated sharing of data within a single study.
Will multiple studies overlap or cause redundancy? Subsetting data into multiple studies can be ideal when the studies serve different purposes (e.g. different sharing settings, different filters). But if an old study has been made obsolete (for instance, the data were reorganized into a newer study to correct an import problem), it should be deleted. Overlap or duplication of studies should be carefully noted in the Study Details for clarity and posterity.

Public sharing

We strongly encourage all data owners to eventually make their data public, unless there are legal or conservation-related reasons for restricting access. Review the general tips above. When you are ready, you can allow public sharing in the Permissions for your study. The Permissions settings include data licenses and embargo options that give you control over when the public may download data. To get a DOI, persistent link and citation for your study, you can submit it to the Movebank Data Repository as described below.

Formal public archival

The best way to ensure that others will be able to discover, access and use your data far into the future is to publish it in the Movebank Data Repository. This also provides you with a DOI, license and a citation for the dataset.

Review the general tips above to prepare your study for submission. Alternatively, we can import data and prepare a study for you. For more information and to begin the submission process, read the submission guidelines and fill out the Depositor Agreement.

Controlled access archival

Research indicates that sharing with individuals by request is an unreliable way to ensure access to data (for example see Couture et al. 2018). However, it can be a good option if data collection is ongoing, or if making the data public could pose a threat to sensitive species. You can use Movebank to make your project publicly discoverable without showing any individual locations or allowing data download, and to easily make your data available to individual collaborators if and when you agree on the terms of a specific use. We encourage owners to use Movebank's new embargo options, which allow Data Managers to restrict access to more recent data or commit to making the data public in the future.

Review the general tips above to prepare your study for sharing with others. Be aware that Movebank cannot provide access to non-public data without the permission of a Data Manager for a study, and thus cannot guarantee the long-term accessibility of controlled-access studies. To discuss more formal arrangements for releasing data through Movebank or assigning Movebank as a custodian to address data-sharing requests, contact support@movebank.org.

Special cases

Below are discussion and guidance covering some common situations.

What should I archive?

When preparing to add data to Movebank, or when planning to submit data for formal public archiving as described above, it is sometimes unclear what scope or processing level of data should be included. Movebank is flexible about what data owners store, how they store it, and why, and we offer the following general suggestions.

Just the subset used for an analysis, or the entire dataset?

When different subsets of the same dataset are used in multiple papers, our general recommendation is to publish the entire dataset and refer to that each time, describing as needed in the data and papers which subsets were used for a given analysis. Advantages of publishing one larger dataset are:

more opportunities for data re-use for a wider variety of purposes,
reduced chance of mistakes in how data are re-used,
reduced time needed to prepare one dataset, and
likely higher citation rates for one comprehensive dataset.

We recognize this differs from common journal policies, which focus on ensuring that a specific published analysis is replicable. In our experience, however, there is a much greater demand to re-use data for different purposes than to replicate existing results. When multiple overlapping subsets from one original dataset are published, it can be difficult or impossible for a user to put those datasets back together to accurately recreate the complete dataset. For users combining datasets for larger meta-analyses, the chance of misunderstanding or failure to notice overlapping data increases. To meet journal needs, authors can enable replication using a subset of a larger study with sufficient methods and reference data that define which subset was analyzed.
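
For users who do need to combine several published subsets, a simple overlap check can reduce the risk described above. The sketch below is a minimal example that assumes each record can be identified by its tag ID and timestamp. The field names "tag-id" and "timestamp" and both function names are assumptions for illustration, and real datasets may need a different key.

```python
def record_key(record, key_fields):
    """Build the tuple that identifies one record."""
    return tuple(record[field] for field in key_fields)


def find_overlap(dataset_a, dataset_b, key_fields=("tag-id", "timestamp")):
    """Return the set of record keys that occur in both datasets."""
    keys_a = {record_key(r, key_fields) for r in dataset_a}
    keys_b = {record_key(r, key_fields) for r in dataset_b}
    return keys_a & keys_b


def merge_without_duplicates(dataset_a, dataset_b, key_fields=("tag-id", "timestamp")):
    """Concatenate two datasets, keeping the first copy of each record key."""
    seen, merged = set(), []
    for record in list(dataset_a) + list(dataset_b):
        key = record_key(record, key_fields)
        if key not in seen:
            seen.add(key)
            merged.append(record)
    return merged
```

A check like this only catches exact matches on the chosen key. Records that were filtered, resampled or re-timestamped in one subset will not be detected, which is one reason a single complete dataset is easier to re-use.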

On the other hand, we realize that it is not always possible to publish an entire dataset at once, such as for ongoing long-term studies. In these cases we recommend publishing in segments (e.g. one dataset for years 2010–2014 and one for years 2015–2019) or publishing a complete update that can supersede the existing dataset (e.g. one dataset for years 2010–2014 and an update with years 2010–2019). For studies published in the Movebank Data Repository, we can also offer a one-year embargo for published datasets. If formal archiving (including a DOI) is not needed, consider providing limited public access using an embargo.

Modeled or interpolated vs original datasets?

Some datasets consist of an "original" version and one or more other versions that have been processed to improve location estimates or to enable a specific analysis. Similar to the previous question, we generally recommend archiving data based on their most "original" version (after decoding and importing to Movebank). Potential data users can replicate processing or modelling steps with the original data, and they also have the opportunity to analyze the original data in different ways. Other things to consider:

You can store both original and processed versions of the data on Movebank; best practice is to identify interpolated or modeled event records using "modelled" = true and use the same tag, animal, and deployment IDs across studies if multiple studies are used.
You can identify and flag outliers directly in Movebank, either manually or using filters, allowing publication of all data collected while clearly identifying records that should be ignored.
A processed or reduced version of data for sensitive or threatened populations or species could allow public archiving without posing a conservation threat.
For low-accuracy data, a filtered or processed version of the dataset could make the data accessible for a wider group of users, for example conservation groups that want to aggregate information about migration corridors but don't have the capacity to redo data processing. (For geolocator data see the next item below.)

What to include when archiving geolocator studies?

The developers of the light-level analysis packages TwGeos, GeoLight, probGLS, SGAT and FlightR and the TAGS platform have defined guidelines for archiving geolocator datasets and worked with Movebank to support this archiving. In order to document existing analyses while allowing for potential future reanalysis using improved methods, they recommend storing raw light-level data, twilight selections, location estimates, and comprehensive deployment information, with the option to store additional files with relevant scripts or protocols. Details are provided in Chapter 9 of the Light-level geolocation analysis manual and in Lisovski et al. (2020).

How can I archive sensitive data?

Follow the instructions above for controlled-access sharing. Currently Movebank has no way to guarantee continued access to non-public studies over time, because access in these cases relies on the contact person for the study assessing sharing requests and providing access to others. The Movebank Data Repository is only for public data, because this is the only way we can ensure that data remain accessible beyond the tenure of the data owners. It is possible to publish select portions of a Movebank study in the repository, for example excluding sensitive data for specific individuals, locations or time periods. For sensitive or legally-restricted data, this might not be an appropriate option. We are exploring more formal ways to offer long-term controlled-access archiving and welcome your input.

Rescue older data

A lot of the oldest animal tracking data is still not stored in shared databases. With so many critical ecological questions and policy-relevant issues dependent on knowledge of changes over the past several decades, these data should be archived and made available for appropriate future use. If you are aware of data that are in danger of being lost, for example due to a retirement, the end of a project, or storage on devices and file formats that are becoming obsolete, please reach out to us at support@movebank.org. We can help communicate with data holders and facilitate data conversion and import to Movebank. Data can remain access-controlled if there is a point person who can respond to inquiries.