NASA's Ocean Biology Distributed Active Archive Center (OB.DAAC) data is free and open to the public. However, we require users to log in to the OceanColor Web's data access points with their Earthdata Login credentials in order to download any products.
File Search
Options:
- Web Interface: Use the web interface to select mission search parameters and dates or search for subscriptions and get results returned in the browser.
- API: OB.DAAC offers a file search utility that is accessible through a command line interface (CLI). See the file search help for usage and options.
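As a sketch of what a scripted call might look like, the snippet below queries the file search endpoint used elsewhere on this page (https://oceandata.sci.gsfc.nasa.gov/api/file_search). The 'search' wildcard parameter is an assumption here, not confirmed by this page; 'format' and 'addurl' appear in the subscription example later on. Consult the file search help for the authoritative parameter list.

#!/usr/bin/env python3
# Hedged sketch: call the OB.DAAC file search API with the requests library.
# NOTE: the 'search' wildcard parameter is an assumption; 'format' and
# 'addurl' are taken from the subscription example later on this page.
import requests

resp = requests.get(
    "https://oceandata.sci.gsfc.nasa.gov/api/file_search",
    params={"search": "T2017004*.L1A_LAC*", "format": "txt", "addurl": 1},
    timeout=30,
)
resp.raise_for_status()
print(resp.text)  # one file name (or URL, with addurl=1) per line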
Earthdata Search Tool
A significant amount of our data are now hosted in the EOSDIS Earthdata Search tool. You will need to register with Earthdata Login before any OB.DAAC data may be downloaded. Below are common queries for obtaining data through Earthdata Search.
Find collections from the data provider (OB.DAAC) that contain granules:
https://cmr.earthdata.nasa.gov/search/collections.umm_json?provider=OB_DAAC&has_granules=true&page_size=100
Get pages 2 and 3 of the results:
https://cmr.earthdata.nasa.gov/search/collections.umm_json?provider=OB_DAAC&has_granules=true&page_size=100&page_num=2
-and-
https://cmr.earthdata.nasa.gov/search/collections.umm_json?provider=OB_DAAC&has_granules=true&page_size=100&page_num=3
Filter collections by processing level:
https://cmr.earthdata.nasa.gov/search/collections.umm_json?provider=OB_DAAC&has_granules=true&processing_level_id=2
Find granules based on data provider and short_name:
https://cmr.earthdata.nasa.gov/search/granules.umm_json?page_size=10&sort_key=short_name&sort_key=start_date&short_name=MODISA_L3b_SST&provider=OB_DAAC
Other options, such as date ('temporal') and geospatial constraints, can be sent as well:
Temporal search:
https://cmr.earthdata.nasa.gov/search/granules.umm_json?page_size=20&sort_key=short_name&sort_key=start_date&short_name=MODISA_L3b_SST&provider=OB_DAAC&temporal=2020-01-03,2020-01-10
Bounding Box:
https://cmr.earthdata.nasa.gov/search/granules.umm_json?page_size=20&sort_key=short_name&sort_key=start_date&short_name=MODISA_L3b_SST&provider=OB_DAAC&bounding_box=-10,-5,10,5&temporal=2020-01-03,2020-01-10
Polygon:
https://cmr.earthdata.nasa.gov/search/granules.umm_json?page_size=20&sort_key=short_name&sort_key=start_date&short_name=MODISA_L3b_SST&provider=OB_DAAC&polygon=10,10,30,10,30,20,10,20,10,10&temporal=2020-01-03,2020-01-10
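These queries can also be issued programmatically. Below is a minimal sketch in Python (using the requests library) that runs the bounding-box granule search above; all parameter names come straight from the example URLs, and only basic error handling is shown.

#!/usr/bin/env python3
# Minimal sketch: run the CMR granule search above with Python's requests library.
# All parameter names are taken from the example URLs on this page.
import requests

CMR_GRANULES = "https://cmr.earthdata.nasa.gov/search/granules.umm_json"

params = {
    "provider": "OB_DAAC",
    "short_name": "MODISA_L3b_SST",
    "temporal": "2020-01-03,2020-01-10",
    "bounding_box": "-10,-5,10,5",
    "page_size": 20,
    "sort_key": ["short_name", "start_date"],  # repeated keys, as in the URLs
}

resp = requests.get(CMR_GRANULES, params=params, timeout=30)
resp.raise_for_status()

# Each item carries UMM metadata; GranuleUR is the granule's unique name.
for item in resp.json().get("items", []):
    print(item["umm"].get("GranuleUR"))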
If you have more questions, search the Earthdata Forum for solutions.
Caveats
- Examples are provided for informational purposes only.
- No product endorsement is implied.
- Results may vary based on the version of software installed on your machine.
- Examples given are not an exhaustive description of possibilities.
Create a .netrc File
Recommended method: Configure your username and password for authentication using a .netrc file. If you experience errors or redirects when using a cookie file, delete any existing cookie files and generate a new one for your current session.
echo "machine urs.earthdata.nasa.gov login USERNAME password PASSWD" > ~/.netrc ; > ~/.urs_cookies chmod 0600 ~/.netrc
where USERNAME and PASSWD are your Earthdata Login credentials.
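As a quick sanity check (not part of the official workflow), Python's standard netrc module can confirm that the entry is readable:

#!/usr/bin/env python3
# Sanity check: confirm ~/.netrc has an entry for urs.earthdata.nasa.gov.
import netrc

auth = netrc.netrc().authenticators("urs.earthdata.nasa.gov")
if auth:
    # authenticators() returns a (login, account, password) tuple
    print("Found Earthdata Login entry for user:", auth[0])
else:
    print("No urs.earthdata.nasa.gov entry found in ~/.netrc")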
Generate an AppKey
This allows the application to "remember" you so you don't get prompted by the login screen as often.
- Generate and copy your AppKey.
- Pass the AppKey as a parameter by appending it to the end of the download URL. For example, if the URL is:
https://oceandata.sci.gsfc.nasa.gov/ob/getfile/A2021001000000.L1A_LAC.b…
and the AppKey is 'abcd1234', append it as a query parameter (?appkey=abcd1234):
https://oceandata.sci.gsfc.nasa.gov/ob/getfile/A2021001000000.L1A_LAC.b…
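For scripted downloads, the same pattern can be expressed with Python's requests library. This is a minimal sketch using the placeholder key above and a file name borrowed from the wget examples later on this page.

#!/usr/bin/env python3
# Sketch: download a file with the appkey passed as a query parameter.
# 'abcd1234' is the placeholder key from the text above.
import requests

url = "https://oceandata.sci.gsfc.nasa.gov/ob/getfile/T2017004001500.L1A_LAC.bz2"
resp = requests.get(url, params={"appkey": "abcd1234"}, stream=True, timeout=30)
resp.raise_for_status()

# Stream the payload to disk in chunks to avoid loading it all into memory.
with open("T2017004001500.L1A_LAC.bz2", "wb") as f:
    for chunk in resp.iter_content(chunk_size=131072):
        f.write(chunk)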
Choose a Download Method
Use a Python Script
The following is an example of how to use Python to access data. Equivalent methods exist in the SeaDAS-distributed code (under $OCSSWROOT/scripts/ProcUtils.py) and can be used if SeaDAS is already installed.
usage:
obdaac_download [-h] [-v] [--filelist FILELIST] [--http_manifest HTTP_MANIFEST] [--odir ODIR] [--uncompress] [--force] [filename]
- positional arguments:
  filename - name of the file (or the URL of the file) to retrieve
- optional arguments:
  -h, --help - show this help message and exit
  -v, --verbose - print status messages
  --filelist FILELIST - file containing list of filenames to retrieve, one per line
  --http_manifest HTTP_MANIFEST - URL to http_manifest file for OB.DAAC data order
  --odir ODIR - full path to desired output directory; defaults to current working directory
  --uncompress - uncompress the retrieved files (if compressed)
  --appkey APPKEY - value of the user's application key
  --force - force download even if file already exists locally
Provide one of either filename, --filelist or --http_manifest. Note: For authentication, a valid .netrc file in the user home ($HOME) directory or a valid appkey is required.
Example .netrc:
machine urs.earthdata.nasa.gov login USERNAME password PASSWD
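For example, a typical invocation (with a hypothetical list file and output directory) might look like:

obdaac_download -v --filelist filelist.txt --odir ./data --uncompress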
- Python Script
#!/usr/bin/env python3
#
# A valid .netrc file in the user home ($HOME) directory, or a valid appkey is required.
#
# Example .netrc:
# machine urs.earthdata.nasa.gov login USERNAME password PASSWD
#
# An appkey can be obtained from:
# https://oceandata.sci.gsfc.nasa.gov/appkey/
#
# from obdaac_download import httpdl
#
# server = 'oceandata.sci.gsfc.nasa.gov'
# request = '/ob/getfile/T2017004001500.L1A_LAC.bz2'
#
# status = httpdl(server, request, uncompress=True)
#

import argparse
import hashlib
import os
import re
import sys
import subprocess
import logging
import requests
from requests.adapters import HTTPAdapter
from datetime import datetime
import time
import textwrap
from urllib.parse import urlparse
from pathlib import Path

DEFAULT_CHUNK_SIZE = 131072
BLOCKSIZE = 65536

# requests session object used to keep connections around
obpgSession = None


def getSession(verbose=0, ntries=5):
    global obpgSession

    if not obpgSession:
        # turn on debug statements for requests
        if verbose > 1:
            print("Session started")
            logging.basicConfig(level=logging.DEBUG)

        obpgSession = requests.Session()
        obpgSession.mount('https://', HTTPAdapter(max_retries=ntries))
    else:
        if verbose > 1:
            print("Reusing existing session")

    return obpgSession


def isRequestAuthFailure(req):
    # Earthdata Login returns its HTML login page instead of the file
    # when authentication fails
    ctype = req.headers.get('Content-Type')
    if ctype and ctype.startswith('text/html'):
        if "<title>Earthdata Login</title>" in req.text:
            return True
    return False


def httpdl(server, request, localpath='.', outputfilename=None, ntries=5,
           uncompress=False, timeout=30., verbose=0, force_download=False,
           chunk_size=DEFAULT_CHUNK_SIZE):

    status = 0
    urlStr = 'https://' + server + request

    global obpgSession
    localpath = Path(localpath)
    getSession(verbose=verbose, ntries=ntries)

    modified_since = None
    headers = {}

    if not force_download:
        # determine the local file name so an If-Modified-Since header
        # can be sent and unchanged files skipped
        if outputfilename:
            ofile = localpath / outputfilename
            modified_since = get_file_time(ofile)
        else:
            rpath = Path(request.rstrip())
            if 'requested_files' in request:
                rpath = Path(request.rstrip().split('?')[0])
            ofile = localpath / rpath.name
            if re.search(r'(?<=\?)(\w+)', ofile.name):
                ofile = Path(ofile.name.split('?')[0])
            modified_since = get_file_time(ofile)

        if modified_since:
            headers = {"If-Modified-Since": modified_since.strftime("%a, %d %b %Y %H:%M:%S GMT")}

    with obpgSession.get(urlStr, stream=True, timeout=timeout, headers=headers) as req:

        if req.status_code != 200:
            status = req.status_code
        elif isRequestAuthFailure(req):
            status = 401
        else:
            if not Path.exists(localpath):
                os.umask(0o02)
                Path.mkdir(localpath, mode=0o2775, parents=True)

            if not outputfilename:
                cd = req.headers.get('Content-Disposition')
                if cd:
                    outputfilename = re.findall("filename=(.+)", cd)[0]
                else:
                    outputfilename = urlStr.split('/')[-1]

            ofile = localpath / outputfilename

            # This is here just in case we didn't get a 304 when we should have...
            download = True
            if 'last-modified' in req.headers:
                remote_lmt = req.headers['last-modified']
                remote_ftime = datetime.strptime(remote_lmt, "%a, %d %b %Y %H:%M:%S GMT").replace(tzinfo=None)
                if modified_since and not force_download:
                    if (remote_ftime - modified_since).total_seconds() < 0:
                        download = False
                        if verbose:
                            print("Skipping download of %s" % outputfilename)

            if download:
                total_length = req.headers.get('content-length')
                length_downloaded = 0
                total_length = int(total_length)
                if verbose > 0:
                    print("Downloading %s (%8.2f MBs)" % (outputfilename, total_length / 1024 / 1024))

                with open(ofile, 'wb') as fd:
                    for chunk in req.iter_content(chunk_size=chunk_size):
                        if chunk:  # filter out keep-alive new chunks
                            length_downloaded += len(chunk)
                            fd.write(chunk)
                            if verbose > 0:
                                percent_done = int(50 * length_downloaded / total_length)
                                sys.stdout.write("\r[%s%s]" % ('=' * percent_done, ' ' * (50 - percent_done)))
                                sys.stdout.flush()

                if uncompress:
                    if ofile.suffix in {'.Z', '.gz', '.bz2'}:
                        if verbose:
                            print("\nUncompressing {}".format(ofile))
                        compressStatus = uncompressFile(ofile)
                        if compressStatus:
                            status = compressStatus
                else:
                    status = 0

                if verbose:
                    print("\n...Done")

    return status


def uncompressFile(compressed_file):
    """
    uncompress file
    compression methods:
        bzip2
        gzip
        UNIX compress
    """

    compProg = {".gz": "gunzip -f ", ".Z": "gunzip -f ", ".bz2": "bunzip2 -f "}
    exten = Path(compressed_file).suffix
    unzip = compProg[exten]
    p = subprocess.Popen(unzip + str(compressed_file.resolve()), shell=True)
    status = os.waitpid(p.pid, 0)[1]
    if status:
        print("Warning! Unable to decompress %s" % compressed_file)
        return status
    else:
        return 0


def get_file_time(localFile):
    ftime = None
    localFile = Path(localFile)
    if not Path.is_file(localFile):
        # fall back to the uncompressed name if only that exists locally
        while localFile.suffix in {'.Z', '.gz', '.bz2'}:
            localFile = localFile.with_suffix('')
    if Path.is_file(localFile):
        ftime = datetime.fromtimestamp(localFile.stat().st_mtime)
    return ftime


def compare_checksum(filepath, checksum):
    hasher = hashlib.sha1()
    with open(filepath, 'rb') as afile:
        buf = afile.read(BLOCKSIZE)
        while len(buf) > 0:
            hasher.update(buf)
            buf = afile.read(BLOCKSIZE)

    if hasher.hexdigest() == checksum:
        return False
    else:
        return True


def retrieveURL(request, localpath='.', uncompress=False, verbose=0,
                force_download=False, appkey=False, checksum=False):

    if verbose:
        print("Retrieving %s" % request.rstrip())

    server = "oceandata.sci.gsfc.nasa.gov"
    parsedRequest = urlparse(request)
    netpath = parsedRequest.path

    if parsedRequest.netloc:
        server = parsedRequest.netloc
    else:
        if not re.match(".*getfile", netpath):
            netpath = '/ob/getfile/' + netpath

    joiner = '?'
    if (re.match(".*getfile", netpath)) and appkey:
        netpath = netpath + joiner + 'appkey=' + appkey
        joiner = '&'

    if parsedRequest.query:
        netpath = netpath + joiner + parsedRequest.query

    status = httpdl(server, netpath, localpath=localpath, uncompress=uncompress,
                    verbose=verbose, force_download=force_download)

    if checksum and not uncompress:
        cksumURL = 'https://' + server + '/checkdata/' + parsedRequest.path
        dnldfile = localpath / parsedRequest.path
        if compare_checksum(dnldfile, requests.get(cksumURL).text):
            print("The file %s failed checksum test" % parsedRequest.path)
            status = 1

    return status


if __name__ == "__main__":
    # parse command line
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawTextHelpFormatter,
        description='Download files archived at the OB.DAAC',
        epilog=textwrap.dedent('''
            Provide one of either filename, --filelist or --http_manifest.

            NOTE: For authentication, a valid .netrc file in the user home ($HOME) directory
            or a valid appkey is required.

            Example .netrc:
            machine urs.earthdata.nasa.gov login USERNAME password PASSWD

            An appkey can be obtained from:
            https://oceandata.sci.gsfc.nasa.gov/appkey/
        '''))
    parser.add_argument('-v', '--verbose', help='print status messages',
                        action='count', default=0)
    parser.add_argument('filename', nargs='?',
                        help='name of the file (or the URL of the file) to retrieve')
    parser.add_argument('--filelist',
                        help='file containing list of filenames to retrieve, one per line')
    parser.add_argument('--http_manifest',
                        help='URL to http_manifest file for OB.DAAC data order')
    parser.add_argument('--odir',
                        help='full path to desired output directory; \ndefaults to current working directory: %s' % Path.cwd(),
                        default=Path.cwd())
    parser.add_argument('--uncompress', action="store_true",
                        help='uncompress the retrieved files (if compressed)', default=False)
    parser.add_argument('--checksum', action="store_true",
                        help='compare retrieved file checksum; cannot be used with --uncompress',
                        default=False)
    parser.add_argument('--failed',
                        help='filename to contain list of files that failed to be retrieved')
    parser.add_argument('--appkey', help="value of the user's application key")
    parser.add_argument('--force', action='store_true',
                        help='force download even if file already exists locally', default=False)

    args = parser.parse_args()

    filelist = []

    if args.http_manifest:
        status = retrieveURL(args.http_manifest, verbose=args.verbose,
                             force_download=True, appkey=args.appkey)
        if status:
            print("There was a problem retrieving %s (received status %d)" % (args.http_manifest, status))
            sys.exit("Bailing out...")
        else:
            with open('http_manifest.txt') as flist:
                for filename in flist:
                    filelist.append(filename.rstrip())
    elif args.filename:
        filelist.append(args.filename)
    elif args.filelist:
        with open(os.path.expandvars(args.filelist)) as flist:
            for filename in flist:
                filelist.append(os.path.expandvars(filename.rstrip()))

    if not len(filelist):
        parser.print_usage()
        sys.exit("Please provide a filename (or list file) to retrieve")

    if args.uncompress and args.checksum:
        parser.print_usage()
        sys.exit("--uncompress is incompatible with --checksum")

    outpath = Path.resolve(Path.expanduser(Path(os.path.expandvars(args.odir))))

    if args.verbose:
        print("Output directory: %s" % outpath)

    failed = None
    if args.failed:
        failed = open(args.failed, 'w')

    for request in filelist:
        status = retrieveURL(request, localpath=outpath, uncompress=args.uncompress,
                             verbose=args.verbose, force_download=args.force,
                             appkey=args.appkey, checksum=args.checksum)

        if status:
            if status == 304:
                if args.verbose:
                    print("%s is not newer than local copy, skipping download" % request)
            else:
                print("There was a problem retrieving %s (received status %d)" % (request, status))
                if failed:
                    failed.write(request)
                    failed.write("\n")

    if failed:
        failed.close()
Note: The line terminators in the script may need to be changed to match those accepted by your operating system.
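If the script is saved as obdaac_download.py, its httpdl function can also be called from other Python code, as the header comment in the script describes:

# Example module use, mirroring the header comment of the script above.
from obdaac_download import httpdl

server = 'oceandata.sci.gsfc.nasa.gov'
request = '/ob/getfile/T2017004001500.L1A_LAC.bz2'

# Downloads the file to the current directory and uncompresses it;
# a nonzero return value indicates a failure (e.g., an HTTP status code).
status = httpdl(server, request, uncompress=True)
print("download status:", status)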
Additional Python examples can be found on the Earthdata Wiki.
Use Wget
Examples:
Retrieve a single file using cookies to pass credentials:
wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --content-disposition 'https://oceandata.sci.gsfc.nasa.gov/ob/getfile/T2017004001500.L1A_LAC.bz2'
* Depending on the version of wget installed, the option --auth-no-challenge=on may be required.
* Some users of wget may have issues with IPv6 and may need to include -4 in the wget call to force the use of IPv4.
Retrieve a listing of MODIS-Aqua nighttime SST for Jan 5, 2006 using username and password only:
wget -q -O - https://oceandata.sci.gsfc.nasa.gov/MODIS-Aqua/L2/2006/005/ |grep SST4| wget --user=USERNAME --ask-password --auth-no-challenge=on --base https://oceandata.sci.gsfc.nasa.gov/ -N --wait=0.5 --random-wait --force-html -i -
* Depending on the version of wget installed, the option --auth-no-challenge=on may be required.
* Some users of wget may have issues with IPv6 and may need to include -4 in the wget call to force the use of IPv4.
Retrieve a single file by interactively passing username and password:
wget --user=USERNAME --ask-password --auth-no-challenge=on https://oceandata.sci.gsfc.nasa.gov/ob/getfile/T2017004001500.L1A_LAC.bz2
* Depending on the version of wget installed, the option --auth-no-challenge=on may be required.
* Some users of wget may have issues with IPv6 and may need to include -4 in the wget call to force the use of IPv4.
Additional Options:
--timeout=10 - sets timeout to 10 seconds (by default wget will retry after timeout)
--wait=0.5 - tells wget to pause for 0.5 seconds between attempts
--random-wait - causes the time between requests to vary between 0.5 and 1.5 * wait seconds, where wait was specified using the --wait option
-N, --timestamping - prevents wget from downloading files already retrieved if a local copy exists and the remote copy is not newer
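For example, when fetching a list of URLs from a hypothetical filelist.txt, these options might be combined like so:

wget --timeout=10 --wait=0.5 --random-wait -N --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --content-disposition -i filelist.txt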
Use cURL
Unlike wget, cURL has no built-in method for downloading a list of URLs (although it can download multiple URLs given on the command line). However, a shell or scripting-language (Perl, Python, etc.) loop can easily be written to perform this task; see the loop sketch after the examples below.
Retrieve a single file using cookies to pass credentials. This example retrieves a Level-1A local area coverage file from MODIS-Terra for Jan 4, 2017:
curl -O -b ~/.urs_cookies -c ~/.urs_cookies -L -n https://oceandata.sci.gsfc.nasa.gov/ob/getfile/T2017004001500.L1A_LAC.bz2
Retrieve a file listing using cookies and download the first file returned. This example lists MODIS-Aqua L2 files for 2006 day 005 (Jan 5, 2006):
curl https://oceandata.sci.gsfc.nasa.gov/MODIS-Aqua/L2/2006/005/ | grep getfile | cut -d "'" -f 2 | head -1 | xargs -n 1 curl -LJO -n -c ~/.urs_cookies -b ~/.urs_cookies
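As noted above, a simple shell loop can drive cURL over a list of URLs. A minimal sketch, assuming a hypothetical urls.txt holding one download URL per line:

while read url; do curl -O -b ~/.urs_cookies -c ~/.urs_cookies -L -n "$url"; done < urls.txt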
Download Data from an Order
To download the files for an order, you will first need to download the manifest file. The URL for the manifest file will be included in the email that is sent to the registered email address after an order is staged. You can also retrieve this information by clicking the 'View Active' orders button on the Data Dashboard.
Retrieval Options
You can use a program like Wget or cURL to download the files in your manifest file. Note: OB.DAAC requires users to log into their Earthdata Login account before downloading data. Please make sure that you have created a valid .netrc file in your home directory before attempting to execute the examples below.
wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --no-check-certificate --content-disposition -i manifest-file-name
You can pipe wget commands together to initiate the download:
wget -O - 'manifest-url-from-above' | wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --no-check-certificate --content-disposition -i -
You can use the Python script mentioned in the Download Methods tab to download the manifest file by passing the filename or manifest URL to the script:
obdaac_download [-v] [--http_manifest MANIFEST_URL_FROM_EMAIL] [--odir ODIR]
To compile a list of URLs for your subscription and then download the data, use wget. Note: OB.DAAC requires users to register and log into their Earthdata Login account before downloading data. Please make sure that you have created a valid .netrc file in your home directory before attempting to execute the following example.
wget "https://oceandata.sci.gsfc.nasa.gov/api/file_search?subID=SUBSCR&subType=TYPE&format=txt&addurl=1" -O - | wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --no-check-certificate --content-disposition -i -
where SUBSCR is your subscription request ID and TYPE is the subscription type. Valid options for TYPE are 1 (for non-extracted) or 2 (for extracted).
If you would like to manage your subscription(s), you may do so by visiting the OBPG Subscription Summary page.
If you have questions or experience problems, please direct them to our project forum, but please check the FAQ board first to see if a solution to your issue has already been published.