General purpose data filters

There are three general purpose filters available in Movebank that Data Managers can use to identify and exclude location records (for Argos Doppler data, also see the Argos filters).

To apply these filters:

  • Select the study (or an individual animal or tag) from the Studies page.
  • Select Data > Edit from the study menu to open the Event Editor.
  • If asked to select a sensor type, select the one you want to apply the filter to.
  • After the data have loaded, select Filter data > General purpose filter (located above the sensor type and study name).
  • Choose which filter/s you want to use and the filter parameter settings (see the instructions below).
  • Once you have chosen parameter settings, select Run filters to run the filter/s.
  • You may re-run filters as many times as you like, or clear all filtering by selecting Filter data > Undo filtering.
  • Select Save (located below the table) to save the dataset with the applied filter settings.

For each record, filters will be applied in the following order: (1) duplicate filter, (2) value range filter, and (3) speed filter.

Applying these filters will mark some locations in your dataset with a “true” value in an Algorithm Marked Outlier attribute, and will not delete or modify any other part of the dataset. Note that manually marked outliers will override values in Algorithm Marked Outlier for determining what records are visible on the map, and records marked as Manually Marked Outlier prior to running the filter will be ignored by the filter. Outliers will show up as red Xs on the map in the Event Editor and will no longer appear on the Search Map page. They remain part of the full dataset and can be included in downloaded data.

After running the filters, always be sure to manually spot-check the results. This will ensure that the filter accomplished what you intended and did not lead to unexpected exclusion of records you want to keep.

Duplicate filter

Duplicate records can end up in datasets for a variety of reasons, and can cause problems with data analysis. The duplicate filter allows you to exclude records with matching Tag IDs and Timestamps. You can optionally require values to match in additional attributes for records to be considered duplicates.

  • In the Filter data window, check the box next to Filter duplicates.

  • To require the filter to look for matches in additional attributes, select each attribute under Available attributes and add it to the list of Key attributes.

When you run the filter, it will go through records in the order you see them in the Event Editor and retain the first record of each set of duplicates, marking any subsequent duplicate records as "true" in the attribute Algorithm Marked Outlier.
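The "keep the first, mark the rest" behavior can be illustrated with a minimal Python sketch. This is not Movebank's implementation: the function name `mark_duplicates`, the dictionary keys `tag_id` and `timestamp`, and the `extra_keys` argument (standing in for the Key attributes list) are hypothetical names chosen for illustration.

```python
def mark_duplicates(records, extra_keys=()):
    """Return one boolean per record: True if it repeats an earlier record.

    Records are compared on tag ID and timestamp, plus any extra key
    attributes. The first record of each duplicate set is kept (False).
    """
    key_attrs = ("tag_id", "timestamp") + tuple(extra_keys)  # hypothetical field names
    seen = set()
    flags = []
    for rec in records:  # records are assumed to be in Event Editor order
        key = tuple(rec.get(attr) for attr in key_attrs)
        flags.append(key in seen)  # True only for later copies
        seen.add(key)
    return flags


records = [
    {"tag_id": "A1", "timestamp": "2020-01-01 00:00:00", "lat": 10.0},
    {"tag_id": "A1", "timestamp": "2020-01-01 00:00:00", "lat": 10.5},
    {"tag_id": "A1", "timestamp": "2020-01-01 01:00:00", "lat": 10.1},
]

print(mark_duplicates(records))           # [False, True, False]
print(mark_duplicates(records, ["lat"]))  # [False, False, False]
```

In the second call the first two records differ in `lat`, so adding it as a key attribute means they no longer count as duplicates.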

Value range filter

The value range filter allows you to exclude records based on values for any attribute in the dataset. This can be used for a variety of purposes, for example to remove records with latitudes > 90 or < -90, to remove records with HDOP values above a certain threshold, or even to hide values within a bounding box, such as a breeding site.

  • In the Filter data window, check the box next to Filter by value range.

  • For all attributes used in the filter, specify whether to retain or remove null values.
  • Choose whether records must match all or any of the defined ranges.
  • From the dropdown box on the left, choose an attribute that you want to define a range for.
  • Select how you want to define the range that the values must be within to be valid. For numerical attributes, the options will be = (equal to), != (not equal to), > (greater than), < (less than), >= (greater than or equal to), or <= (less than or equal to). For text attributes, options will be = (equal to), != (not equal to). In attributes with a list of possible values, it will include that list of values. Remember the filter will retain records within the ranges and exclude records outside the ranges.
  • In the field on the right, provide the value to define the range.

The following are some examples of how this filter could be used.

Example 1. Exclude locations outside the ranges for coordinates in decimal degrees, as well as records with no latitude or longitude provided.

Example 2. Exclude locations where "location error numerical" is more than 30 meters (keeping any blanks).

Example 3. Exclude locations where "comments" say "bad location, ignore!"
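The range logic behind these examples can be sketched in Python as follows. This is only an illustration under stated assumptions, not Movebank's code: the function `is_outlier`, the rule format `(attribute, operator, value)`, the `match` and `keep_nulls` arguments, and the attribute keys are all hypothetical. As described above, a record is kept when it falls within the defined ranges and flagged otherwise.

```python
import operator

# Comparison operators offered for numerical attributes.
OPS = {"=": operator.eq, "!=": operator.ne, ">": operator.gt,
       "<": operator.lt, ">=": operator.ge, "<=": operator.le}


def is_outlier(record, rules, match="all", keep_nulls=False):
    """Return True if the record falls outside the defined ranges.

    rules      -- list of (attribute, operator, value) tuples (hypothetical format)
    match      -- "all": every range must be met; "any": one is enough
    keep_nulls -- whether a missing value counts as being within range
    """
    results = []
    for attr, op, value in rules:
        actual = record.get(attr)
        if actual is None:
            results.append(keep_nulls)
        else:
            results.append(OPS[op](actual, value))
    valid = all(results) if match == "all" else any(results)
    return not valid


# Example 1: valid decimal-degree coordinates, nulls removed.
coord_rules = [("location_lat", ">=", -90), ("location_lat", "<=", 90),
               ("location_long", ">=", -180), ("location_long", "<=", 180)]
print(is_outlier({"location_lat": 95.0, "location_long": 10.0}, coord_rules))  # True
print(is_outlier({"location_lat": None, "location_long": 10.0}, coord_rules))  # True

# Example 2: location error of at most 30 meters, blanks kept.
error_rules = [("location_error_numerical", "<=", 30)]
print(is_outlier({"location_error_numerical": None}, error_rules, keep_nulls=True))  # False
print(is_outlier({"location_error_numerical": 45.0}, error_rules, keep_nulls=True))  # True
```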

Speed filter

Many outliers in tracking data can be removed by calculating the speed that would be required to go between consecutive locations, and excluding records that, if real, would require that the animal had been traveling at an unrealistically high speed. The speed filter lets you flag records that imply unrealistic speeds, with an optional buffer to account for the accuracy of your location estimates. Remember the default values are not based on information about your analysis objectives or species. Modify these as needed based on your expert knowledge.

Before running the speed filter, be sure that your deployment periods are correctly set. Accidentally including locations in the track that were collected before or after the tag was deployed on the animal could lead to unwanted results. See more about Defining deployments and visible data points.

  • In the Filter data window, check the box next to Filter by speed.

  • Enter the maximum plausible ground speed for your animals, in meters per second. Remember that your species may be able to travel quite quickly sometimes, for example with favourable wind or currents, and you don't want to accidentally remove these records. Also consider your sampling interval: an animal's maximum speed over a short time interval might be much greater than it could maintain over a longer period. The value you use should represent a speed (in m/s) that the animal could maintain over the temporal interval in your dataset.
  • Enter the maximum likely location error for your tag type, in meters. Even a fairly small location error could lead to records that suggest a high speed of movement. Also, you might accept a more or less conservative error estimate depending on the type of analysis you will be doing.
  • Choose a filter algorithm. All speed filter algorithms ignore records with no location, undeployed records, and records already marked as outliers (in “manually marked outlier” or “algorithm marked outlier”). A record is considered valid if it meets the filter settings when compared to neighboring records according to the rules described below.

Simple outlier: This method tests the filter settings for each record against the previous and subsequent records. If both neighboring locations require an implausible speed, the record is marked as an outlier. Avoid this algorithm if your tracks are likely to contain groups of >2 consecutive outliers near each other or outliers as the first or last records in the track.
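A minimal Python sketch of the Simple outlier rule is given below. It is an illustration under several assumptions, not Movebank's implementation. Records are hypothetical `(time_in_seconds, latitude, longitude)` tuples. Distance is computed with the haversine formula. The location error is applied by subtracting it once from the distance before comparing with the maximum speed, which is an assumed way of using the buffer. The first and last records, which have only one neighbor, are never flagged.

```python
import math


def distance_m(a, b):
    """Haversine distance in meters between two (t, lat, lon) records."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(a[1]), math.radians(b[1])
    dlmb = math.radians(b[2] - a[2])
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def plausible(a, b, max_speed, max_error):
    """True if moving from a to b fits the speed limit (assumed error buffer)."""
    dt = abs(b[0] - a[0])
    return max(0.0, distance_m(a, b) - max_error) <= max_speed * dt


def simple_outlier(track, max_speed, max_error):
    """Flag a record only if both the step to it and the step from it are implausible."""
    flags = [False] * len(track)
    for i in range(1, len(track) - 1):
        before_ok = plausible(track[i - 1], track[i], max_speed, max_error)
        after_ok = plausible(track[i], track[i + 1], max_speed, max_error)
        flags[i] = not before_ok and not after_ok
    return flags


# Hourly fixes along the equator with one spike about 555 km away.
hour = 3600
track = [(0 * hour, 0.0, 0.00), (1 * hour, 0.0, 0.01), (2 * hour, 0.0, 5.00),
         (3 * hour, 0.0, 0.03), (4 * hour, 0.0, 0.04)]
print(simple_outlier(track, max_speed=20.0, max_error=50.0))
# [False, False, True, False, False]
```

Only the spike is flagged: its neighbors each have one plausible step, so they are retained.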

Valid anchor: This method assumes that the first location in the track is accurate (not an outlier). For all other records, it tests the filter settings against the subsequent record. If movement to the next location (from n to n+1) requires an implausible speed, the subsequent record (n+1) is marked as an outlier, and the current record (n) is tested against the next record (n+2), and so on, until a plausible next location is found. Be sure that you have set your deployment start times correctly and that the first location in each track appears correct (if not, you can manually mark them as outliers before running the filter).
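The Valid anchor rule can be sketched in Python as shown here. This is an illustration rather than Movebank's code, and it rests on stated assumptions: records are hypothetical `(time_in_seconds, latitude, longitude)` tuples, distance uses the haversine formula, and the location error is subtracted once from the distance before the speed comparison (an assumed way of applying the buffer).

```python
import math


def distance_m(a, b):
    """Haversine distance in meters between two (t, lat, lon) records."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(a[1]), math.radians(b[1])
    dlmb = math.radians(b[2] - a[2])
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def plausible(a, b, max_speed, max_error):
    """True if moving from a to b fits the speed limit (assumed error buffer)."""
    dt = abs(b[0] - a[0])
    return max(0.0, distance_m(a, b) - max_error) <= max_speed * dt


def valid_anchor(track, max_speed, max_error):
    """Flag records that cannot be reached plausibly from the last valid record."""
    flags = [False] * len(track)
    anchor = 0  # the first record is assumed to be accurate
    for i in range(1, len(track)):
        if plausible(track[anchor], track[i], max_speed, max_error):
            anchor = i       # record i becomes the new reference point
        else:
            flags[i] = True  # keep the anchor and test it against the next record
    return flags


# Hourly fixes along the equator with two consecutive outliers about 555 km away.
hour = 3600
track = [(0 * hour, 0.0, 0.00), (1 * hour, 0.0, 5.00), (2 * hour, 0.0, 5.01),
         (3 * hour, 0.0, 0.02), (4 * hour, 0.0, 0.03)]
print(valid_anchor(track, max_speed=20.0, max_error=50.0))
# [False, True, True, False, False]
```

Because the anchor stays on the last valid record, both consecutive outliers are flagged. The same logic would flag most of the track if the first record were itself wrong, which is why checking the first location matters.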

Longest consistent track: This method finds the longest sequence of points in the track that is fully consistent. It will “walk through” each record in the dataset and begin a new “candidate track” for the first record and when a record does not pass the filter settings when compared to the previous point. If the current record fits with more than one candidate track based upon the filter settings, it will be assigned to the longest candidate track. After running through the entire track, it will use the longest candidate track as the correct one and flag records not included in this track as outliers. This method makes it possible to catch outliers at the beginning or ends of a track, as well as groups of outliers near each other. Avoid this algorithm if your track may contain an outlier cluster with more records than the entire valid track (for example, in the case of short tracks with lots of outliers or without correct deployment periods).
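One possible reading of the Longest consistent track rule is sketched below in Python. It is an interpretation, not Movebank's implementation, and makes these assumptions: records are hypothetical `(time_in_seconds, latitude, longitude)` tuples; distance uses the haversine formula; the location error is subtracted once from the distance before the speed comparison; a record is compared with the last record of each candidate track, joins the longest candidate it fits, and starts a new candidate only when it fits none; ties go to the earliest candidate.

```python
import math


def distance_m(a, b):
    """Haversine distance in meters between two (t, lat, lon) records."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(a[1]), math.radians(b[1])
    dlmb = math.radians(b[2] - a[2])
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def plausible(a, b, max_speed, max_error):
    """True if moving from a to b fits the speed limit (assumed error buffer)."""
    dt = abs(b[0] - a[0])
    return max(0.0, distance_m(a, b) - max_error) <= max_speed * dt


def longest_consistent_track(track, max_speed, max_error):
    """Flag every record that is not part of the longest candidate track."""
    candidates = []  # each candidate is a list of record indices
    for i, rec in enumerate(track):
        fitting = [c for c in candidates
                   if plausible(track[c[-1]], rec, max_speed, max_error)]
        if fitting:
            max(fitting, key=len).append(i)  # join the longest fitting candidate
        else:
            candidates.append([i])           # no fit: start a new candidate
    keep = set(max(candidates, key=len)) if candidates else set()
    return [i not in keep for i in range(len(track))]


# Hourly fixes: an outlier as the first record, then a two-record outlier cluster.
hour = 3600
track = [(0 * hour, 0.0, 5.00), (1 * hour, 0.0, 0.00), (2 * hour, 0.0, 0.01),
         (3 * hour, 0.0, 5.01), (4 * hour, 0.0, 5.02), (5 * hour, 0.0, 0.02),
         (6 * hour, 0.0, 0.03)]
print(longest_consistent_track(track, max_speed=20.0, max_error=50.0))
# [True, False, False, True, True, False, False]
```

In this example the three distant records form their own candidate track of length 3, which loses to the valid track of length 4. Had the distant cluster been longer than the valid track, the result would be reversed, which is the failure case described above.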