NASA's Ocean Biology Distributed Active Archive Center (OB.DAAC) data is free and open to the public. However, we require users to log in to the OceanColor Web's data access points with their Earthdata Login credentials in order to download any products.
File Search
Options:
- Web Interface: Use the web interface to select mission search parameters and dates or search for subscriptions and get results returned in the browser.
- API: OB.DAAC offers a file search utility that is accessible through a command line interface (CLI). See the file search help for usage and options.
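As a sketch of what a scripted call might look like, the snippet below queries the file search endpoint used elsewhere on this page (https://oceandata.sci.gsfc.nasa.gov/api/file_search). The 'search' wildcard parameter is an assumption here, not confirmed by this page; 'format' and 'addurl' appear in the subscription example later on. Consult the file search help for the authoritative parameter list.

#!/usr/bin/env python3
# Hedged sketch: call the OB.DAAC file search API with the requests library.
# NOTE: the 'search' wildcard parameter is an assumption; 'format' and
# 'addurl' are taken from the subscription example later on this page.
import requests

resp = requests.get(
    "https://oceandata.sci.gsfc.nasa.gov/api/file_search",
    params={"search": "T2017004*.L1A_LAC*", "format": "txt", "addurl": 1},
    timeout=30,
)
resp.raise_for_status()
print(resp.text)  # one file name (or URL, with addurl=1) per line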
Earthdata Search Tool
A significant amount of our data are now hosted in the EOSDIS Earthdata Search tool. You will need to register with Earthdata Login before any OB.DAAC data may be downloaded. Below are common queries for obtaining data through Earthdata Search.
Find collections from the data provider (OB.DAAC) that contain granules:
https://cmr.earthdata.nasa.gov/search/collections.umm_json?provider=OB_DAAC&has_granules=true&page_size=100
Get pages 2 and 3 of the results:
https://cmr.earthdata.nasa.gov/search/collections.umm_json?provider=OB_DAAC&has_granules=true&page_size=100&page_num=2
-and-
https://cmr.earthdata.nasa.gov/search/collections.umm_json?provider=OB_DAAC&has_granules=true&page_size=100&page_num=3
Filter collections by processing level:
https://cmr.earthdata.nasa.gov/search/collections.umm_json?provider=OB_DAAC&has_granules=true&processing_level_id=2
Find granules based on data provider and short_name:
https://cmr.earthdata.nasa.gov/search/granules.umm_json?page_size=10&sort_key=short_name&sort_key=start_date&short_name=MODISA_L3b_SST&provider=OB_DAAC
Other options, such as date ('temporal') and geospatial constraints, can be sent as well:
Temporal search:
https://cmr.earthdata.nasa.gov/search/granules.umm_json?page_size=20&sort_key=short_name&sort_key=start_date&short_name=MODISA_L3b_SST&provider=OB_DAAC&temporal=2020-01-03,2020-01-10
Bounding Box:
https://cmr.earthdata.nasa.gov/search/granules.umm_json?page_size=20&sort_key=short_name&sort_key=start_date&short_name=MODISA_L3b_SST&provider=OB_DAAC&bounding_box=-10,-5,10,5&temporal=2020-01-03,2020-01-10
Polygon:
https://cmr.earthdata.nasa.gov/search/granules.umm_json?page_size=20&sort_key=short_name&sort_key=start_date&short_name=MODISA_L3b_SST&provider=OB_DAAC&polygon=10,10,30,10,30,20,10,20,10,10&temporal=2020-01-03,2020-01-10
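These queries can also be issued programmatically. Below is a minimal sketch in Python (using the requests library) that runs the bounding-box granule search above; all parameter names come straight from the example URLs, and only basic error handling is shown.

#!/usr/bin/env python3
# Minimal sketch: run the CMR granule search above with Python's requests library.
# All parameter names are taken from the example URLs on this page.
import requests

CMR_GRANULES = "https://cmr.earthdata.nasa.gov/search/granules.umm_json"

params = {
    "provider": "OB_DAAC",
    "short_name": "MODISA_L3b_SST",
    "temporal": "2020-01-03,2020-01-10",
    "bounding_box": "-10,-5,10,5",
    "page_size": 20,
    "sort_key": ["short_name", "start_date"],  # repeated keys, as in the URLs
}

resp = requests.get(CMR_GRANULES, params=params, timeout=30)
resp.raise_for_status()

# Each item carries UMM metadata; GranuleUR is the granule's unique name.
for item in resp.json().get("items", []):
    print(item["umm"].get("GranuleUR"))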
If you have more questions, search the Earthdata Forum for solutions.
Caveats
- Examples are provided for informational purposes only.
- No product endorsement is implied.
- Results may vary based on the version of software installed on your machine.
- Examples given are not an exhaustive description of possibilities.
Create a .netrc File
Recommended method: Configure your username and password for authentication using a .netrc file. If you experience errors or redirects when using a cookie file, delete any existing cookie files and generate a new one for your current session.
echo "machine urs.earthdata.nasa.gov login USERNAME password PASSWD" > ~/.netrc ; > ~/.urs_cookies chmod 0600 ~/.netrc
where USERNAME and PASSWD are your Earthdata Login credentials.
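As a quick sanity check (not part of the official workflow), Python's standard netrc module can confirm that the entry is readable:

#!/usr/bin/env python3
# Sanity check: confirm ~/.netrc has an entry for urs.earthdata.nasa.gov.
import netrc

auth = netrc.netrc().authenticators("urs.earthdata.nasa.gov")
if auth:
    # authenticators() returns a (login, account, password) tuple
    print("Found Earthdata Login entry for user:", auth[0])
else:
    print("No urs.earthdata.nasa.gov entry found in ~/.netrc")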
Generate an AppKey
This allows the application to "remember" you so you don't get prompted by the login screen as often.
- Generate and copy your AppKey.
- Pass the AppKey as a parameter by appending it to the end of the download URL. For example, if the URL is:
https://oceandata.sci.gsfc.nasa.gov/ob/getfile/A2021001000000.L1A_LAC.b…
and the AppKey is 'abcd1234', append it as a query parameter (?appkey=abcd1234):
https://oceandata.sci.gsfc.nasa.gov/ob/getfile/A2021001000000.L1A_LAC.b…
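For scripted downloads, the same pattern can be expressed with Python's requests library. This is a minimal sketch using the placeholder key above and a file name borrowed from the wget examples later on this page.

#!/usr/bin/env python3
# Sketch: download a file with the appkey passed as a query parameter.
# 'abcd1234' is the placeholder key from the text above.
import requests

url = "https://oceandata.sci.gsfc.nasa.gov/ob/getfile/T2017004001500.L1A_LAC.bz2"
resp = requests.get(url, params={"appkey": "abcd1234"}, stream=True, timeout=30)
resp.raise_for_status()

# Stream the payload to disk in chunks to avoid loading it all into memory.
with open("T2017004001500.L1A_LAC.bz2", "wb") as f:
    for chunk in resp.iter_content(chunk_size=131072):
        f.write(chunk)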
Choose a Download Method
Use a Python Script
The following is an example of how to use Python to access data. Equivalent methods exist in the SeaDAS-distributed code (under $OCSSWROOT/scripts/ProcUtils.py) and can be used if SeaDAS is already installed.
usage:
obdaac_download [-h] [-v] [--filelist FILELIST] [--http_manifest HTTP_MANIFEST] [--odir ODIR] [--uncompress] [--force] [filename]
- positional arguments:
  filename - name of the file (or the URL of the file) to retrieve
- optional arguments:
  -h, --help - show this help message and exit
  -v, --verbose - print status messages
  --filelist FILELIST - file containing list of filenames to retrieve, one per line
  --http_manifest HTTP_MANIFEST - URL to http_manifest file for OB.DAAC data order
  --odir ODIR - full path to desired output directory; defaults to current working directory
  --uncompress - uncompress the retrieved files (if compressed)
  --appkey APPKEY - value of the user's application key
  --force - force download even if file already exists locally
Provide one of either filename, --filelist or --http_manifest. Note: For authentication, a valid .netrc file in the user home ($HOME) directory or a valid appkey is required.
Example .netrc:
machine urs.earthdata.nasa.gov login USERNAME password PASSWD
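For example, a typical invocation (with a hypothetical list file and output directory) might look like:

obdaac_download -v --filelist filelist.txt --odir ./data --uncompress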
- Python Script
#!/usr/bin/env python3
#
# A valid .netrc file in the user home ($HOME) directory, or a valid appkey is required.
#
# Example .netrc:
# machine urs.earthdata.nasa.gov login USERNAME password PASSWD
#
# An appkey can be obtained from:
# https://oceandata.sci.gsfc.nasa.gov/appkey/
#
# from obdaac_download import httpdl
#
# server = 'oceandata.sci.gsfc.nasa.gov'
# request = '/ob/getfile/T2017004001500.L1A_LAC.bz2'
#
# status = httpdl(server, request, uncompress=True)
#

import argparse
import hashlib
import os
import re
import sys
import subprocess
import logging
import requests
from requests.adapters import HTTPAdapter
from datetime import datetime
import time
import textwrap
from urllib.parse import urlparse
from pathlib import Path

DEFAULT_CHUNK_SIZE = 131072
BLOCKSIZE = 65536

# requests session object used to keep connections around
obpgSession = None


def getSession(verbose=0, ntries=5):
    global obpgSession

    if not obpgSession:
        # turn on debug statements for requests
        if verbose > 1:
            print("Session started")
            logging.basicConfig(level=logging.DEBUG)

        obpgSession = requests.Session()
        obpgSession.mount('https://', HTTPAdapter(max_retries=ntries))
    else:
        if verbose > 1:
            print("Reusing existing session")

    return obpgSession


def isRequestAuthFailure(req):
    # Earthdata Login returns its HTML login page instead of the file
    # when authentication fails
    ctype = req.headers.get('Content-Type')
    if ctype and ctype.startswith('text/html'):
        if "<title>Earthdata Login</title>" in req.text:
            return True
    return False


def httpdl(server, request, localpath='.', outputfilename=None, ntries=5,
           uncompress=False, timeout=30., verbose=0, force_download=False,
           chunk_size=DEFAULT_CHUNK_SIZE):

    status = 0
    urlStr = 'https://' + server + request

    global obpgSession
    localpath = Path(localpath)
    getSession(verbose=verbose, ntries=ntries)

    modified_since = None
    headers = {}

    if not force_download:
        # determine the local file name so an If-Modified-Since header
        # can be sent and unchanged files skipped
        if outputfilename:
            ofile = localpath / outputfilename
            modified_since = get_file_time(ofile)
        else:
            rpath = Path(request.rstrip())
            if 'requested_files' in request:
                rpath = Path(request.rstrip().split('?')[0])
            ofile = localpath / rpath.name
            if re.search(r'(?<=\?)(\w+)', ofile.name):
                ofile = Path(ofile.name.split('?')[0])
            modified_since = get_file_time(ofile)

        if modified_since:
            headers = {"If-Modified-Since": modified_since.strftime("%a, %d %b %Y %H:%M:%S GMT")}

    with obpgSession.get(urlStr, stream=True, timeout=timeout, headers=headers) as req:

        if req.status_code != 200:
            status = req.status_code
        elif isRequestAuthFailure(req):
            status = 401
        else:
            if not Path.exists(localpath):
                os.umask(0o02)
                Path.mkdir(localpath, mode=0o2775, parents=True)

            if not outputfilename:
                cd = req.headers.get('Content-Disposition')
                if cd:
                    outputfilename = re.findall("filename=(.+)", cd)[0]
                else:
                    outputfilename = urlStr.split('/')[-1]

            ofile = localpath / outputfilename

            # This is here just in case we didn't get a 304 when we should have...
            download = True
            if 'last-modified' in req.headers:
                remote_lmt = req.headers['last-modified']
                remote_ftime = datetime.strptime(remote_lmt, "%a, %d %b %Y %H:%M:%S GMT").replace(tzinfo=None)
                if modified_since and not force_download:
                    if (remote_ftime - modified_since).total_seconds() < 0:
                        download = False
                        if verbose:
                            print("Skipping download of %s" % outputfilename)

            if download:
                total_length = req.headers.get('content-length')
                length_downloaded = 0
                total_length = int(total_length)
                if verbose > 0:
                    print("Downloading %s (%8.2f MBs)" % (outputfilename, total_length / 1024 / 1024))

                with open(ofile, 'wb') as fd:
                    for chunk in req.iter_content(chunk_size=chunk_size):
                        if chunk:  # filter out keep-alive new chunks
                            length_downloaded += len(chunk)
                            fd.write(chunk)
                            if verbose > 0:
                                percent_done = int(50 * length_downloaded / total_length)
                                sys.stdout.write("\r[%s%s]" % ('=' * percent_done, ' ' * (50 - percent_done)))
                                sys.stdout.flush()

                if uncompress:
                    if ofile.suffix in {'.Z', '.gz', '.bz2'}:
                        if verbose:
                            print("\nUncompressing {}".format(ofile))
                        compressStatus = uncompressFile(ofile)
                        if compressStatus:
                            status = compressStatus
                else:
                    status = 0

                if verbose:
                    print("\n...Done")

    return status


def uncompressFile(compressed_file):
    """
    uncompress file
    compression methods:
        bzip2
        gzip
        UNIX compress
    """

    compProg = {".gz": "gunzip -f ", ".Z": "gunzip -f ", ".bz2": "bunzip2 -f "}
    exten = Path(compressed_file).suffix
    unzip = compProg[exten]
    p = subprocess.Popen(unzip + str(compressed_file.resolve()), shell=True)
    status = os.waitpid(p.pid, 0)[1]
    if status:
        print("Warning! Unable to decompress %s" % compressed_file)
        return status
    else:
        return 0


def get_file_time(localFile):
    ftime = None
    localFile = Path(localFile)
    if not Path.is_file(localFile):
        # fall back to the uncompressed name if only that exists locally
        while localFile.suffix in {'.Z', '.gz', '.bz2'}:
            localFile = localFile.with_suffix('')
    if Path.is_file(localFile):
        ftime = datetime.fromtimestamp(localFile.stat().st_mtime)
    return ftime


def compare_checksum(filepath, checksum):
    hasher = hashlib.sha1()
    with open(filepath, 'rb') as afile:
        buf = afile.read(BLOCKSIZE)
        while len(buf) > 0:
            hasher.update(buf)
            buf = afile.read(BLOCKSIZE)

    if hasher.hexdigest() == checksum:
        return False
    else:
        return True


def retrieveURL(request, localpath='.', uncompress=False, verbose=0,
                force_download=False, appkey=False, checksum=False):

    if verbose:
        print("Retrieving %s" % request.rstrip())

    server = "oceandata.sci.gsfc.nasa.gov"
    parsedRequest = urlparse(request)
    netpath = parsedRequest.path

    if parsedRequest.netloc:
        server = parsedRequest.netloc
    else:
        if not re.match(".*getfile", netpath):
            netpath = '/ob/getfile/' + netpath

    joiner = '?'
    if (re.match(".*getfile", netpath)) and appkey:
        netpath = netpath + joiner + 'appkey=' + appkey
        joiner = '&'

    if parsedRequest.query:
        netpath = netpath + joiner + parsedRequest.query

    status = httpdl(server, netpath, localpath=localpath, uncompress=uncompress,
                    verbose=verbose, force_download=force_download)

    if checksum and not uncompress:
        cksumURL = 'https://' + server + '/checkdata/' + parsedRequest.path
        dnldfile = localpath / parsedRequest.path
        if compare_checksum(dnldfile, requests.get(cksumURL).text):
            print("The file %s failed checksum test" % parsedRequest.path)
            status = 1

    return status


if __name__ == "__main__":
    # parse command line
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawTextHelpFormatter,
        description='Download files archived at the OB.DAAC',
        epilog=textwrap.dedent('''
            Provide one of either filename, --filelist or --http_manifest.

            NOTE: For authentication, a valid .netrc file in the user home ($HOME) directory
            or a valid appkey is required.

            Example .netrc:
            machine urs.earthdata.nasa.gov login USERNAME password PASSWD

            An appkey can be obtained from:
            https://oceandata.sci.gsfc.nasa.gov/appkey/
        '''))
    parser.add_argument('-v', '--verbose', help='print status messages',
                        action='count', default=0)
    parser.add_argument('filename', nargs='?',
                        help='name of the file (or the URL of the file) to retrieve')
    parser.add_argument('--filelist',
                        help='file containing list of filenames to retrieve, one per line')
    parser.add_argument('--http_manifest',
                        help='URL to http_manifest file for OB.DAAC data order')
    parser.add_argument('--odir',
                        help='full path to desired output directory; \ndefaults to current working directory: %s' % Path.cwd(),
                        default=Path.cwd())
    parser.add_argument('--uncompress', action="store_true",
                        help='uncompress the retrieved files (if compressed)', default=False)
    parser.add_argument('--checksum', action="store_true",
                        help='compare retrieved file checksum; cannot be used with --uncompress',
                        default=False)
    parser.add_argument('--failed',
                        help='filename to contain list of files that failed to be retrieved')
    parser.add_argument('--appkey', help="value of the user's application key")
    parser.add_argument('--force', action='store_true',
                        help='force download even if file already exists locally', default=False)

    args = parser.parse_args()

    filelist = []

    if args.http_manifest:
        status = retrieveURL(args.http_manifest, verbose=args.verbose,
                             force_download=True, appkey=args.appkey)
        if status:
            print("There was a problem retrieving %s (received status %d)" % (args.http_manifest, status))
            sys.exit("Bailing out...")
        else:
            with open('http_manifest.txt') as flist:
                for filename in flist:
                    filelist.append(filename.rstrip())
    elif args.filename:
        filelist.append(args.filename)
    elif args.filelist:
        with open(os.path.expandvars(args.filelist)) as flist:
            for filename in flist:
                filelist.append(os.path.expandvars(filename.rstrip()))

    if not len(filelist):
        parser.print_usage()
        sys.exit("Please provide a filename (or list file) to retrieve")

    if args.uncompress and args.checksum:
        parser.print_usage()
        sys.exit("--uncompress is incompatible with --checksum")

    outpath = Path.resolve(Path.expanduser(Path(os.path.expandvars(args.odir))))

    if args.verbose:
        print("Output directory: %s" % outpath)

    failed = None
    if args.failed:
        failed = open(args.failed, 'w')

    for request in filelist:
        status = retrieveURL(request, localpath=outpath, uncompress=args.uncompress,
                             verbose=args.verbose, force_download=args.force,
                             appkey=args.appkey, checksum=args.checksum)

        if status:
            if status == 304:
                if args.verbose:
                    print("%s is not newer than local copy, skipping download" % request)
            else:
                print("There was a problem retrieving %s (received status %d)" % (request, status))
                if failed:
                    failed.write(request)
                    failed.write("\n")

    if failed:
        failed.close()
Note: The line terminators in the script may need to be changed to match those accepted by your operating system.
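If the script is saved as obdaac_download.py, its httpdl function can also be called from other Python code, as the header comment in the script describes:

# Example module use, mirroring the header comment of the script above.
from obdaac_download import httpdl

server = 'oceandata.sci.gsfc.nasa.gov'
request = '/ob/getfile/T2017004001500.L1A_LAC.bz2'

# Downloads the file to the current directory and uncompresses it;
# a nonzero return value indicates a failure (e.g., an HTTP status code).
status = httpdl(server, request, uncompress=True)
print("download status:", status)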
Additional Python examples can be found on the Earthdata Wiki.
Use Wget
Examples:
Retrieve a single file using cookies to pass credentials:
wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --content-disposition 'https://oceandata.sci.gsfc.nasa.gov/ob/getfile/T2017004001500.L1A_LAC.bz2'
* Depending on the version of wget installed, the option --auth-no-challenge=on may be required.
* Some users of wget may have issues with IPv6 and may need to include -4 in the wget call to force the use of IPv4.
Retrieve a listing of MODIS-Aqua nighttime SST for Jan 5, 2006 using username and password only:
wget -q -O - https://oceandata.sci.gsfc.nasa.gov/MODIS-Aqua/L2/2006/005/ |grep SST4| wget --user=USERNAME --ask-password --auth-no-challenge=on --base https://oceandata.sci.gsfc.nasa.gov/ -N --wait=0.5 --random-wait --force-html -i -
* Depending on the version of wget installed, the option --auth-no-challenge=on may be required.
* Some users of wget may have issues with IPv6 and may need to include -4 in the wget call to force the use of IPv4.
Retrieve a single file by interactively passing username and password:
wget --user=USERNAME --ask-password --auth-no-challenge=on https://oceandata.sci.gsfc.nasa.gov/ob/getfile/T2017004001500.L1A_LAC.bz2
* Depending on the version of wget installed, the option --auth-no-challenge=on may be required.
* Some users of wget may have issues with IPv6 and may need to include -4 in the wget call to force the use of IPv4.
Additional Options:
--timeout=10 - sets timeout to 10 seconds (by default wget will retry after timeout)
--wait=0.5 - tells wget to pause for 0.5 seconds between attempts
--random-wait - causes the time between requests to vary between 0.5 and 1.5 * wait seconds, where wait was specified using the --wait option
-N, --timestamping - prevents wget from downloading files already retrieved if a local copy exists and the remote copy is not newer
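For example, when fetching a list of URLs from a hypothetical filelist.txt, these options might be combined like so:

wget --timeout=10 --wait=0.5 --random-wait -N --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --content-disposition -i filelist.txt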
Use cURL
Unlike wget, cURL has no built-in method for downloading a list of URLs (although it can download multiple URLs given on the command line). However, a shell or scripting-language (Perl, Python, etc.) loop can easily be written to perform this task; see the loop sketch after the examples below.
Retrieve a single file using cookies to pass credentials. This example retrieves a Level-1A local area coverage file from MODIS-Terra for Jan 4, 2017:
curl -O -b ~/.urs_cookies -c ~/.urs_cookies -L -n https://oceandata.sci.gsfc.nasa.gov/ob/getfile/T2017004001500.L1A_LAC.bz2
Retrieve a file listing using cookies and download the first file returned. This example lists MODIS-Aqua L2 files for 2006 day 005 (Jan 5, 2006):
curl https://oceandata.sci.gsfc.nasa.gov/MODIS-Aqua/L2/2006/005/ | grep getfile | cut -d "'" -f 2 | head -1 | xargs -n 1 curl -LJO -n -c ~/.urs_cookies -b ~/.urs_cookies
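As noted above, a simple shell loop can drive cURL over a list of URLs. A minimal sketch, assuming a hypothetical urls.txt holding one download URL per line:

while read url; do curl -O -b ~/.urs_cookies -c ~/.urs_cookies -L -n "$url"; done < urls.txt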
Download Data from an Order
To download the files for an order, you will first need to download the manifest file. The URL for the manifest file will be included in the email that is sent to the registered email address after an order is staged. You can also retrieve this information by clicking the 'View Active' orders button on the Data Dashboard.
Retrieval Options
You can use a program like Wget or cURL to download the files in your manifest file. Note: OB.DAAC requires users to log into their Earthdata Login account before downloading data. Please make sure that you have created a valid .netrc file in your home directory before attempting to execute the examples below.
wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --no-check-certificate --content-disposition -i manifest-file-name
You can pipe wget commands together to initiate the download:
wget -O - 'manifest-url-from-above' | wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --no-check-certificate --content-disposition -i -
You can use the Python script mentioned in the Download Methods tab to download the manifest file by passing the filename or manifest URL to the script:
obdaac_download [-v] [--http_manifest MANIFEST_URL_FROM_EMAIL] [--odir ODIR]
To compile a list of URLs for your subscription and then download the data, use wget. Note: OB.DAAC requires users to register and log into their Earthdata Login account before downloading data. Please make sure that you have created a valid .netrc file in your home directory before attempting to execute the following example.
wget "https://oceandata.sci.gsfc.nasa.gov/api/file_search?subID=SUBSCR&subType=TYPE&format=txt&addurl=1" -O - | wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --no-check-certificate --content-disposition -i -
where SUBSCR is your subscription request ID and TYPE is the subscription type. Valid options for TYPE are 1 (for non-extracted) or 2 (for extracted).
If you would like to manage your subscription(s), you may do so by visiting the OBPG Subscription Summary page.
If you have questions or experience problems, please direct them to our project forum, but please check the FAQ board first to see if a solution to your issue has already been published.