diff --git a/_toc.yml b/_toc.yml index 3e35b968d..b28db575a 100644 --- a/_toc.yml +++ b/_toc.yml @@ -89,3 +89,4 @@ parts: - file: notebooks/WFC3/point_spread_function.md sections: - file: notebooks/WFC3/point_spread_function/hst_point_spread_function.ipynb + - file: notebooks/WFC3/mast_api_psf/download_psf_cutouts.ipynb diff --git a/notebooks/WFC3/README.md b/notebooks/WFC3/README.md index 8d019c07b..db83e7354 100644 --- a/notebooks/WFC3/README.md +++ b/notebooks/WFC3/README.md @@ -29,8 +29,9 @@ Photometry: - [Calculating WFC3 Zeropoints with `stsynphot`](https://spacetelescope.github.io/hst_notebooks/notebooks/WFC3/zeropoints/zeropoints.html) - [WFC3/UVIS Pixel Area Map Corrections for Subarrays](https://spacetelescope.github.io/hst_notebooks/notebooks/WFC3/uvis_pam_corrections/WFC3_UVIS_Pixel_Area_Map_Corrections_for_Subarrays.html) -Point Spread Function: +Point Spread Function (PSF): - [HST WFC3 Point Spread Function Modeling](https://spacetelescope.github.io/hst_notebooks/notebooks/WFC3/point_spread_function/hst_point_spread_function.html) + - [Downloading WFC3 and WFPC2 PSF Cutouts from MAST](https://spacetelescope.github.io/hst_notebooks/notebooks/WFC3/mast_api_psf/download_psf_cutouts.html) See the [WFC3 Instrument Handbook](https://hst-docs.stsci.edu/wfc3ihb), [WFC3 Data Handbook](https://hst-docs.stsci.edu/wfc3dhb), @@ -54,10 +55,6 @@ Space Telescope (HST) and the James Webb Space Telescope (JWST). To install, see [stenv readthedocs](https://stenv.readthedocs.io/en/latest/) or [stenv GitHub](https://github.com/spacetelescope/stenv). -`hst_notebooks/notebooks_env` is the default virtual environment for HST Notebooks, -which contains the same scientific computing libraries in `stenv`, but not the HST and -JWST libraries. This environment can also be used as a base, but is not recommended. - In addition, each notebook contains a `requirements.txt` file that needs to be installed before running the notebooks. 
Some notebooks contain a `pre-requirements.sh` file, usually to install [HSTCAL](https://github.com/spacetelescope/hstcal), which diff --git a/notebooks/WFC3/mast_api_psf/download_psf_cutouts.ipynb b/notebooks/WFC3/mast_api_psf/download_psf_cutouts.ipynb new file mode 100644 index 000000000..80d705f16 --- /dev/null +++ b/notebooks/WFC3/mast_api_psf/download_psf_cutouts.ipynb @@ -0,0 +1,559 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "\n", + "# Downloading WFC3 and WFPC2 PSF Cutouts from MAST\n", + "\n", + "***\n", + "## Learning Goals\n", + "This notebook demonstrates how to download PSF cutouts (i.e. \"realizations\" of the PSF) from the WFC3 and WFPC2 PSF Databases on the MAST Portal using the MAST API. By the end of this tutorial, you will:\n", + "\n", + "- Query the database for source metadata.\n", + "- Download source cutouts from reconstructed dataURIs.\n", + "- Extract source cutouts from dataURLs.\n", + "\n", + "Acronyms:\n", + "- Hubble Space Telescope (HST)\n", + "- Wide Field Camera 3 (WFC3)\n", + "- Wide Field and Planetary Camera 2 (WFPC2)\n", + "- WFC3 Ultraviolet and VISible detector (WFC3/UVIS or UVIS)\n", + "- WFC3 InfraRed detector (WFC3/IR or IR)\n", + "- Point Spread Function (PSF)\n", + "- Mikulski Archive for Space Telescopes (MAST)\n", + "- Application Programming Interface (API)\n", + "\n", + "## Table of Contents\n", + "\n", + "[Introduction](#intro)
\n", + "[1. Imports](#import)
\n", + "[2. Query the WFC3 and WFPC2 PSF Databases](#query)
\n", + "[3. Reconstruct dataURIs](#reconstruct)
\n", + "[4. Download and extract cutouts using dataURIs](#download)
\n", + "- [4.1 Single file](#single)
\n", + "- [4.2 Multiple files: bundle](#bundle)
\n", + "- [4.3 Multiple files: pooling](#pool)
\n", + "\n", + "[5. Extracting cutouts using dataURLs](#url)
\n", + "[6. Load and plot cutouts](#plot)
\n", + "[7. Conclusions](#conclusions)
\n", + "[Additional Resources](#add)
\n", + "[About this Notebook](#about)
\n", + "[Citations](#cite)
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "\n", + "## Introduction\n", + "\n", + "The WFC3 and WFPC2 PSF Databases are three databases (WFC3/UVIS, WFC3/IR, and WFPC2) of sources measured in every external observation from the instruments, excluding proprietary data. All sources were measured using [HST1PASS](https://www.stsci.edu/files/live/sites/www/files/home/hst/instrumentation/wfc3/documentation/instrument-science-reports-isrs/_documents/2022/WFC3-ISR-2022-05.pdf), a Fortran program that measures point sources using HST PSF models and one-pass photometry. These point sources are \"realizations\" of the PSF, meaning they can be used to construct detailed PSF models. [WFC3 ISR 2021-12](https://www.stsci.edu/files/live/sites/www/files/home/hst/instrumentation/wfc3/documentation/instrument-science-reports-isrs/_documents/2021/ISR_2021_12.pdf) (Dauphin et al. 2021) provides a detailed overview of the database pipeline and a statistical analysis of the databases up to 2021. As of August 2024, the databases have over 83.5 million sources, including both unsaturated and saturated sources. The databases are summarized below:\n", + "- WFC3/UVIS: 33M sources (30M unsaturated and 3M saturated)\n", + "- WFC3/IR: 25.5M sources (25.3M unsaturated and 0.2M saturated)\n", + "- WFPC2: 25M sources (15M unsaturated and 10M saturated)\n", + "\n", + "The databases are available on the [MAST Portal](https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html) under the collections \"WFC3 PSF\" (with wavebands \"UVIS\" and \"IR\") and \"WFPC2 PSF\". By clicking \"Advanced Search\", the databases can be filtered and queried by various parameters, such as source position and flux. All of the searchable field options are described [here](https://mast.stsci.edu/api/v0/_w_f_c3__p_s_ffields.html). After completing the query, the sources' measurables (i.e. 
metadata) and cutouts centered on the source can be retrieved and downloaded, using either raw or calibrated data.\n", + "\n", + "Although the MAST Portal is effective for interactive searches, here we introduce a programmatic way of downloading metadata and cutouts, which is useful for downstream tasks and improves data accessibility." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "\n", + "## 1. Imports\n", + " \n", + "This notebook assumes you have installed the required libraries as described [here](https://github.com/spacetelescope/hst_notebooks/blob/main/notebooks/WFC3/mast_api_psf/requirements.txt).\n", + "\n", + "We import:\n", + "- `os` for handling files\n", + "- `tarfile` for extracting the contents of a .tar.gz file\n", + "- `numpy` for handling arrays\n", + "- `matplotlib.pyplot` for plotting data\n", + "- `astropy.io.fits` for accessing FITS files\n", + "\n", + "We also import a custom module `mast_api_psf.py` for querying sources and downloading cutouts." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "import os\n", + "import tarfile\n", + "\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from astropy.io import fits\n", + "\n", + "import mast_api_psf" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## 2. Query the WFC3 and WFPC2 PSF Databases\n", + "\n", + "First, we query a database for source metadata. For this notebook, we use WFC3/UVIS (i.e. `UVIS`) as an example. The same syntax and functionality also works for WFC3/IR and WFPC2 (i.e. `IR` and `WFPC2`, respectively)."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "detector = 'UVIS'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's retrieve a small subset of sources by filtering our query. We retrieve all unsaturated sources centered on pixels between 100 and 102 in both the x and y coordinates of the detector. We format our min and max values using `mast_api_psf.set_min_max` from [Using the MAST API with Python](https://mast.stsci.edu/api/v0/pyex.html#set_min_max)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "center_min_max = mast_api_psf.set_min_max(100, 102)\n", + "center_min_max" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We define our parameters to be filtered. As a reminder, all of the columns that can be filtered are described [here](https://mast.stsci.edu/api/v0/_w_f_c3__p_s_ffields.html).\n", + "\n", + "**Note: there are a few column differences between the databases.**\n", + "- The column corresponding to filters (`filter`) in the WFC3 databases is `filter_1` in the WFPC2 database. For WFPC2, our module corrects `filter` to `filter_1` in case the former was used by accident.\n", + "- The secondary filter column `filter_2` is only available for WFPC2. For special WFPC2 observations, the user can utilize two filters at once as long as both filters are on different wheels. The most common case is using a standard optical filter with a polarizer.\n", + "- The proposal type column `proptype` is only available for WFC3."
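To make the payload format concrete, the two query helpers can be sketched in plain Python. This is a standalone re-implementation mirroring the helper code in `mast_api_psf.py` (which follows the MAST API Python examples), shown only to illustrate the structure `set_min_max` and `set_filters` produce:

```python
# Minimal sketch of the MAST filter payload format, mirroring the
# set_min_max/set_filters helpers in mast_api_psf.py.

def set_min_max(lo, hi):
    # Range-valued parameters are wrapped in a single {'min', 'max'} dict.
    return [{'min': lo, 'max': hi}]

def set_filters(parameters):
    # Each dict entry becomes {'paramName': <column>, 'values': <values>}.
    return [{'paramName': p, 'values': v} for p, v in parameters.items()]

center_min_max = set_min_max(100, 102)
filts = set_filters({
    'psf_x_center': center_min_max,
    'psf_y_center': center_min_max,
    'n_sat_pixels': ['0'],  # '0' selects unsaturated sources
})
print(filts[0])
# → {'paramName': 'psf_x_center', 'values': [{'min': 100, 'max': 102}]}
```

The resulting list of dicts is exactly what the query function expects in its `filts` argument.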
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "parameters = {\n", + " 'psf_x_center': center_min_max,\n", + " 'psf_y_center': center_min_max,\n", + " 'n_sat_pixels': ['0']\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For reference, `n_sat_pixels` is the number of saturated pixels the source contains. 0 indicates no saturated pixels (i.e. unsaturated). Any number greater than 0 indicates a saturated source with that many saturated pixels. \n", + "\n", + "We format our filters using `mast_api_psf.set_filters` from [Using the MAST API with Python](https://mast.stsci.edu/api/v0/pyex.html#set_filters)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "filts = mast_api_psf.set_filters(parameters)\n", + "filts" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, we query MAST by wrapping their API to retrieve our filtered sources using `mast_api_psf.mast_query_psf_database`. By default, this function returns all columns for the query. The columns can be changed using the parameter `columns`, which is a list of the columns to be returned. Here, we use the minimum number of columns necessary to reconstruct dataURIs.\n", + "\n", + "**Warning: the time it takes to query MAST depends on connectivity, the number of sources to retrieve, and the number of columns returned.**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "columns = ['id', 'rootname', 'filter', 'x_cal', 'y_cal', 'x_raw', 'y_raw', 'chip', 'qfit', 'subarray']\n", + "obs = mast_api_psf.mast_query_psf_database(detector=detector, filts=filts, columns=columns)\n", + "print(f'Number of sources queried: {len(obs)}')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## 3. 
Reconstruct dataURIs\n", + "\n", + "Now that we have retrieved our queried sources, we create dataURIs, or paths to the source files on the MAST server, from the metadata using `mast_api_psf.make_dataURIs`. These dataURIs are then used to download the respective cutouts.\n", + "\n", + "We support two data types for WFPC2 (raw and calibrated) and three data types for WFC3 (raw, calibrated, and charge transfer efficiency (CTE) corrected). These data types are indicated by unique file suffixes:\n", + "- `raw` for raw WFC3 data\n", + "- `d0m` for raw WFPC2 data\n", + "- `flt` for calibrated WFC3 data\n", + "- `c0m` for calibrated WFPC2 data\n", + "- `flc` for calibrated, CTE corrected WFC3/UVIS data (a similar option is not available for WFC3/IR or WFPC2)\n", + "\n", + "Here, we reconstruct dataURIs for just calibrated (`flt`) data. By default, this function calls for 51x51 and 101x101 cutouts for unsaturated and saturated sources, respectively. The sizes can be changed within the function using the integer parameters `unsat_size` and `sat_size` (i.e. `unsat_size=51, sat_size=101`)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "file_suffix = ['flt']\n", + "dataURIs = mast_api_psf.make_dataURIs(obs, detector=detector, file_suffix=file_suffix)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## 4. Download and extract cutouts using dataURIs\n", + "\n", + "With the dataURIs, we download the respective cutouts using three different functions adapted from [Using the MAST API with Python](https://mast.stsci.edu/api/v0/pyex.html#download_req). \n", + "\n", + "**Warning: the time it takes to download cutouts from MAST depends on connectivity and the number of sources to retrieve.**\n", + "\n", + "\n", + "### 4.1 Single file\n", + "\n", + "First, we download a single cutout using `mast_api_psf.download_request_file`, which downloads to the current working directory."
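Under the hood, the single-file download is a plain HTTP GET against the MAST download service (the URL prefix below is taken from `mast_api_psf.py`; the module itself uses `requests`). A minimal stdlib-only sketch of the request URL it builds:

```python
from urllib.parse import urlencode

# URL prefix defined as REQUEST_URL_PREFIX in mast_api_psf.py.
REQUEST_URL_PREFIX = 'https://mast.stsci.edu/api/v0.1/Download'

def single_file_url(dataURI):
    # download_request_file performs a GET on .../Download/file with the
    # dataURI passed as the 'uri' query parameter; urlencode percent-encodes
    # the ':' and '/' characters in the dataURI for us.
    return f"{REQUEST_URL_PREFIX}/file?{urlencode({'uri': dataURI})}"

url = single_file_url('mast:WFC3PSF/url/cgi-bin/fitscut.cgi')
```

The helper name `single_file_url` is illustrative only; the notebook's `download_request_file` builds the same request internally and writes the response bytes to disk.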
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataURI = dataURIs[0]\n", + "filename = dataURI.split('/')[-1]\n", + "filename_cutout = mast_api_psf.download_request_file([dataURI, filename])\n", + "filename_cutout" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "### 4.2 Multiple files: bundle\n", + "\n", + "Next, we download multiple cutouts using `mast_api_psf.download_request_bundle`, which downloads them as a single `.tar.gz` file that can later be extracted. We recommend using this to download hundreds of cutouts. A standard laptop and network bandwidth can download a bundle of 1000 cutouts in ~30 seconds." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "filename_bundle = mast_api_psf.download_request_bundle(dataURIs, filename='mastDownload.tar.gz')\n", + "filename_bundle" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With the `.tar.gz` file downloaded, we safely extract the cutouts." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "with tarfile.open(filename_bundle, 'r:gz') as tar:\n", + " path_mast = tar.getnames()[0]\n", + " print(f'Path to MAST PSF Cutouts: {path_mast}')\n", + " tar.extractall(filter='data')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "### 4.3 Multiple files: pooling\n", + "\n", + "Lastly, we download multiple cutouts using `mast_api_psf.download_request_pool`, which downloads cutouts to a new directory indicated by the date, similar to the directory name of the extracted `.tar.gz` file. Although this method is ~1.5 times slower than the bundle, we recommend using it to download thousands of cutouts as the progress bar can be helpful for keeping track of how much longer the downloads will take. This function utilizes all available CPUs by default. 
Changing the parameter `cpu_count` sets the number of CPUs.\n", + "\n", + "**Warning: Interrupting the kernel will not kill the multiprocessing and will keep downloading cutouts. To kill the multiprocessing, restart the kernel.**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "mast_api_psf.download_request_pool(dataURIs)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## 5. Extracting cutouts using dataURLs\n", + "\n", + "If downloading the cutouts is unnecessary, we can instead extract the cutouts directly using dataURLs, or links to their sources on the [MAST website](https://archive.stsci.edu/).\n", + "\n", + "First, we convert the dataURIs to dataURLs using `mast_api_psf.convert_dataURIs_to_dataURLs`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataURLs = mast_api_psf.convert_dataURIs_to_dataURLs(dataURIs)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then, we extract a single cutout using `fits.getdata`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataURL = dataURLs[0]\n", + "cutout_URL = fits.getdata(dataURL)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we extract all of the cutouts from the dataURLs using `mast_api_psf.extract_cutouts_pool`. As with `mast_api_psf.download_request_pool`, this function performs multiprocessing to retrieve all the cutouts, and has the same parameter `cpu_count` to set the number of CPUs." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cutouts = mast_api_psf.extract_cutouts_pool(dataURLs)\n", + "print(f'Number of cutouts: {len(cutouts)}')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## 6. 
Load and plot cutouts\n", + "\n", + "Now that the cutouts have been downloaded, we can load them into the notebook. For this example, we only load the single cutout downloaded in the first example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cutout_URI = fits.getdata(filename_cutout)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, we plot the cutout on a log scale." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "file = os.path.basename(filename_cutout)\n", + "plt.title(file)\n", + "plt.imshow(np.log10(cutout_URI), origin='lower', cmap='gray')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For a final check, we show that the dataURI cutout is the same as the dataURL cutout." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "diff = (cutout_URI != cutout_URL).sum()\n", + "print(f'There are {diff} different pixels.')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## 7. Conclusions\n", + "\n", + "Thank you for walking through this notebook. 
Now, you should be familiar with:\n", + "\n", + "- Querying the WFC3 and WFPC2 PSF Databases for source metadata.\n", + "- Reconstructing dataURIs and dataURLs to open source cutouts.\n", + "- Downloading, extracting, loading, and plotting the cutouts.\n", + "\n", + "**Congratulations, you have completed the notebook.**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## Additional Resources\n", + "\n", + "Be sure to check out our [complementary notebook](https://github.com/spacetelescope/hst_notebooks/tree/main/notebooks/WFC3/point_spread_function) on HST WFC3 PSF Modeling for a variety of science use cases (Revalski 2024).\n", + "\n", + "Below are some additional resources that may be helpful. Please send any questions through the [HST Help Desk](https://stsci.service-now.com/hst) or [open a ticket on HST Notebooks](https://github.com/spacetelescope/hst_notebooks/issues).\n", + "\n", + "**WFC3**\n", + "- [WFC3 Website](https://www.stsci.edu/hst/instrumentation/wfc3)\n", + " - [WFC3 PSF Website](https://www.stsci.edu/hst/instrumentation/wfc3/data-analysis/psf)\n", + "- [WFC3 Instrument Handbook](https://hst-docs.stsci.edu/wfc3ihb)\n", + " - [Chapter 6.6 UVIS Optical Performance](https://hst-docs.stsci.edu/wfc3ihb/chapter-6-uvis-imaging-with-wfc3/6-6-uvis-optical-performance)\n", + " - [Chapter 7.6 IR Optical Performance](https://hst-docs.stsci.edu/wfc3ihb/chapter-7-ir-imaging-with-wfc3/7-6-ir-optical-performance)\n", + "- [WFC3 Data Handbook](https://hst-docs.stsci.edu/wfc3dhb)\n", + "- [WFC3 Instrument Science Reports](https://www.stsci.edu/hst/instrumentation/wfc3/documentation/instrument-science-reports-isrs)\n", + " - [WFC3 ISR 2022-05](https://www.stsci.edu/files/live/sites/www/files/home/hst/instrumentation/wfc3/documentation/instrument-science-reports-isrs/_documents/2022/WFC3-ISR-2022-05.pdf): One-Pass HST Photometry with hst1pass (Anderson 2022)\n", + " - [WFC3 ISR 
2021-12](https://www.stsci.edu/files/live/sites/www/files/home/hst/instrumentation/wfc3/documentation/instrument-science-reports-isrs/_documents/2021/ISR_2021_12.pdf): The WFPC2 and WFC3 PSF Database (Dauphin et al. 2021)\n", + "\n", + "**WFPC2**\n", + "- [WFPC2 Instrument Handbook](https://www.stsci.edu/files/live/sites/www/files/home/hst/instrumentation/legacy/wfpc2/_documents/wfpc2_ihb.pdf)\n", + " - See Chapter 5: Point Spread Function for documentation on WFPC2's PSFs\n", + "- [WFPC2 Data Handbook](https://www.stsci.edu/files/live/sites/www/files/home/hst/instrumentation/legacy/wfpc2/_documents/wfpc2_dhb.pdf)\n", + "\n", + "**MAST**\n", + "- [MAST Website](https://archive.stsci.edu/)\n", + "- [MAST Portal](https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html)\n", + " - [MAST WFC3/WFPC2 PSF Field Descriptions](https://mast.stsci.edu/api/v0/_w_f_c3__p_s_ffields.html)\n", + "- [MAST API](https://mast.stsci.edu/api/v0/)\n", + " - [Services](https://mast.stsci.edu/api/v0/_services.html) (Examples exist for WFC3/UVIS and WFC3/IR databases)\n", + " - [Python Examples](https://mast.stsci.edu/api/v0/pyex.html) (Examples exist for WFC3/UVIS and WFC3/IR databases)\n", + " - As of August 2024, the MAST API for WFPC2 PSFs has not been documented, but the `service` is called `Mast.Catalogs.Filtered.Wfpc2Psf.Uvis`.\n", + "\n", + "\n", + "## About this Notebook\n", + "\n", + "**Author:** Fred Dauphin, WFC3 Instrument\n", + "\n", + "**Created On:** 2024-09-11\n", + "\n", + "**Updated On:** 2024-09-11\n", + "\n", + "**Source:** [HST Notebooks](https://github.com/spacetelescope/hst_notebooks)\n", + "\n", + "\n", + "## Citations\n", + "\n", + "If you use `numpy`, `matplotlib`, `astropy`, or `astroquery` for published research, please cite the\n", + "authors. 
Follow these links for more information about citing the libraries below:\n", + "\n", + "* [Citing `numpy`](https://numpy.org/citing-numpy/)\n", + "* [Citing `matplotlib`](https://matplotlib.org/stable/users/project/citing.html)\n", + "* [Citing `astropy`](https://www.astropy.org/acknowledging.html)\n", + "* [Citing `astroquery`](https://github.com/astropy/astroquery/blob/main/astroquery/CITATION)\n", + "***" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "[Top of Page](#top)\n", + "\"Space " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.8" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/notebooks/WFC3/mast_api_psf/mast_api_psf.py b/notebooks/WFC3/mast_api_psf/mast_api_psf.py new file mode 100644 index 000000000..b3433b9f0 --- /dev/null +++ b/notebooks/WFC3/mast_api_psf/mast_api_psf.py @@ -0,0 +1,402 @@ +""" +Name +---- +WFC3 and WFPC2 PSF MAST Database Module + +Purpose +------- +This module contains functions for querying sources and downloading +source cutouts from the WFC3 and WFPC2 PSF databases. 
+ +Use +--- +This module is intended to be imported in a Jupyter notebook: + + >>> import mast_api_psf + +Author +------ +Fred Dauphin, July 2024 +""" + +import datetime +import multiprocessing +import os +import requests + +import tqdm +from astropy.io import fits +from astroquery.mast import Mast + +REQUEST_URL_PREFIX = 'https://mast.stsci.edu/api/v0.1/Download' + + +# Helper functions from https://mast.stsci.edu/api/v0/pyex.html +def set_filters(parameters): + """ + Some filtering queries require human-unfriendly syntax. + This allows you to enter your filter criteria as a dictionary, + which will then be parsed into correct format for searching. + """ + return [{"paramName": p, "values": v} for p, v in parameters.items()] + + +def set_min_max(min, max): + """ + Some parameters require minimum and maximum acceptable values: + for example, both RA and Dec must be given as a range. + This is a convenience function to format such a query correctly. + """ + return [{'min': min, 'max': max}] + + +# Downloading functions +def download_request_file(dataURI_filename): + """ + Performs a get request to download a specified file from the MAST server. + + This function is intended for downloading single cutouts. The load and + download limits for a single query are 50,000 and 500,000, respectively. + The file is intended to be downloaded as a .fits: + + Parameters + ---------- + dataURI_filename : list + The dataURI to be downloaded and the name of the downloaded fits file. + This is one parameter instead of two so a progress bar can be applied + to multiprocessing. + + Returns + ------- + filename : str + The name of the downloaded file. 
+ """ + dataURI = dataURI_filename[0] + filename = dataURI_filename[1] + + # Specify download type + download_type = 'file' + request_url = f'{REQUEST_URL_PREFIX}/{download_type}' + + # Request payload + payload = {'uri': dataURI} + resp = requests.get(request_url, params=payload) + + # Write response to filename + with open(filename, 'wb') as FLE: + FLE.write(resp.content) + + return filename + + +def download_request_pool(dataURIs, cpu_count=0): + """ + Performs a get request to download a specified file from the MAST server. + + This function is intended for downloading multiple cutouts. The load and + download limits for a single query are 50,000 and 500,000, respectively. + This function is optimized by pooling and shows a progress bar. + + Parameters + ---------- + dataURIs : list + The dataURIs to be downloaded. + + cpu_count : int, default=0 + The number of cpus for multiprocessing. If 0, set to all available cpus. + + Returns + ------- + path_dir : str + The directory path to the downloaded cutouts. + """ + # Make PSF directory if necessary for downloads + now = datetime.datetime.now().strftime('MAST_%Y-%m-%dT%H%M') + if 'WFC3' in dataURIs[0]: + ins_psf = 'WFC3PSF' + else: + ins_psf = 'WFPC2PSF' + path_dir = f'{now}/{ins_psf}' + if not os.path.isdir(path_dir): + os.makedirs(path_dir) + + # Prepare arguments for pooling + filenames = [f'{path_dir}/{dataURI.split("/")[-1]}' for dataURI in dataURIs] + args = zip(dataURIs, filenames) + + # Pool using a progress bar + if cpu_count == 0: + cpu_count = os.cpu_count() + total = len(filenames) + pool = multiprocessing.Pool(processes=cpu_count) + _ = list(tqdm.tqdm(pool.imap(download_request_file, args), total=total)) + pool.close() + pool.join() + + return path_dir + + +def download_request_bundle(dataURIs, filename): + """ + Performs a get request to download a specified file from the MAST server. + + This function is intended for downloading multiple cutouts. 
The load and + download limits for a single query are 50,000 and 500,000, respectively. + The file downloaded is a .tar.gz: + + Parameters + ---------- + dataURIs : list + The dataURIs to be downloaded. + filename : str + The name of the downloaded '.tar.gz' file. + + Returns + ------- + filename : str + The name of the downloaded file. + """ + # Specify download type + download_type = 'bundle.tar.gz' + request_url = f'{REQUEST_URL_PREFIX}/{download_type}' + + # Request payload + payload = [("uri", dataURI) for dataURI in dataURIs] + resp = requests.post(request_url, data=payload) + + # Write response to filename + with open(filename, 'wb') as FLE: + FLE.write(resp.content) + + return filename + + +# Main functions +def mast_query_psf_database(detector, filts, columns=['*']): + """ + Query the WFC3/WFPC2 PSF databases on the MAST Portal using the MAST API. + + Both WFC3 channels (UVIS and IR) are accessible. + + The allowed columns (i.e. fields) are documented here: + https://mast.stsci.edu/api/v0/_w_f_c3__p_s_ffields.html + + Note: WFPC2's field for 'filter' (e.g. F606W) is called 'filter_1' so use + that accordingly. + + Parameters + ---------- + detector : str + The detector of the database to query. Allowed values are UVIS, IR, and + WFPC2. + filts : list of dicts + The filters applied to the query. Can be made using `set_filters`. + columns : list, default=['*'] + The columns to return for the query. If '*' is in `columns`, then all + columns are returned. + + Returns + ------- + obs : astropy.table.Table + A table of the queried sources' metadata with specific filters and + columns applied. 
+ """ + # Check types + if not isinstance(detector, str): + raise TypeError('detector must be a string.') + if not isinstance(filts, list): + raise TypeError('filts must be a list.') + if not isinstance(columns, list): + raise TypeError('columns must be a list.') + + # Determine service for database + detector = detector.upper() + service_base = 'Mast.Catalogs.Filtered' + detector_databases = { + 'UVIS': 'Wfc3Psf.Uvis', + 'IR': 'Wfc3Psf.Ir', + 'WFPC2': 'Wfpc2Psf.Uvis' + } + try: + database = detector_databases[detector] + except KeyError: + valid_detectors = list(detector_databases.keys()) + raise ValueError(f'{detector} is not a valid detector. ' + f'Choose from {valid_detectors}.') + service = f'{service_base}.{database}' + + # If WFPC2, change filter to filter_1 + if detector == 'WFPC2': + if 'filter' in columns: + index = columns.index('filter') + columns[index] = 'filter_1' + for param in filts: + if 'filter' in param.values(): + param['paramName'] = 'filter_1' + + # Determine columns to query + if '*' in columns: + cols = '*' + else: + cols = ','.join(columns) + + # Set parameters and query database + params = {'columns': cols, + 'filters': filts} + obs = Mast.service_request(service, params) + + return obs + + +def make_dataURIs(obs, detector, file_suffix, unsat_size=51, sat_size=101): + """ + Make dataURIs for the WFC3 and WFPC2 PSF databases' sources. + + The dataURIs are URLs for downloading cutouts from the MAST Portal. + The cutouts are made using the package fitscut. + They can retrieve: + - raw data with suffixes 'raw' for WFC3 and 'd0m' for WFPC2. + - calibrated data with suffixes 'flt' for WFC3 and 'c0m' for WFPC2. + - charge transfer efficiency (CTE) corrected data with the suffix 'flc' + for UVIS. + + Parameters + ---------- + obs : astropy.table.Table + A table of the queried sources' metadata with specific filters and + columns applied. + detector : str + The detector of the queried sources. Allowed values are UVIS, IR, and + WFPC2. 
+ file_suffix : list + The file suffixes to prepare for download. Allowed values are raw, d0m, + flt, c0m, and flc. + unsat_size : int, default=51 + The size for unsaturated (qfit>0;n_sat_pixels==0) cutouts. + sat_size : int, default=101 + The size for saturated (qfit==0;n_sat_pixels>0) cutouts. + + Returns + ------- + dataURIs : list + The dataURIs made from the queried sources. + """ + # Check type + if not isinstance(file_suffix, list): + raise TypeError('file_suffix must be a list.') + + # Check suffixes (make sure there isn't a wrong suffix) + valid_suffixes = ['raw', 'd0m', 'flt', 'c0m', 'flc'] + for suffix in file_suffix: + if suffix not in valid_suffixes: + raise ValueError(f'{suffix} is not a valid suffix. ' + f'Choose from {valid_suffixes}.') + + # Determine database that was queried + detector = detector.upper() + wfc3_detectors = ['UVIS', 'IR'] + if detector in wfc3_detectors: + instrument = 'WFC3' + else: + instrument = 'WFPC2' + dataURI_base = f'mast:{instrument}PSF/url/cgi-bin/fitscut.cgi' + + # Loop through obs to make dataURIs + dataURIs = [] + for row in tqdm.tqdm(obs, total=len(obs)): + # Unpack values + iden = row['id'] + root = row['rootname'] + if detector == 'WFPC2': + filt = row['filter_1'] + else: + filt = row['filter'] + chip = row['chip'] + qfit = row['qfit'] + if qfit > 0: + size = unsat_size + else: + size = sat_size + subarray = row['subarray'] + + # If UVIS use chip to assign correct fits ext + if detector == 'UVIS': + if chip == '1' and subarray == 0: + fits_ext = 4 + else: + fits_ext = 1 + # Else chip is the correct fits ext + else: + fits_ext = chip + + # Make dataURIs for each suffix + for suffix in file_suffix: + if suffix in ['raw', 'd0m']: + coord_suffix = 'raw' + else: + coord_suffix = 'cal' + x = row[f'x_{coord_suffix}'] + y = row[f'y_{coord_suffix}'] + + file_read = f'red={root}_{suffix}[{fits_ext}]' + cutout = f'size={size}&x={x}&y={y}&format=fits' + file_save = f'{root}_{iden}_{filt}_{suffix}_cutout.fits' + dataURI = 
f'{dataURI_base}?{file_read}&{cutout}/{file_save}' + dataURIs.append(dataURI) + + return dataURIs + + +def convert_dataURIs_to_dataURLs(dataURIs): + """ + Convert dataURIs to URLs for the WFC3 and WFPC2 PSF databases' sources. + + Use the archive url, the hla folder, and the imagename parameter. + + Parameters + ---------- + dataURIs : list + The dataURIs made from the queried sources. + + Returns + ------- + dataURLs : list + The dataURLs for the queried sources. + """ + # Convert to dataURLs + dataURL_base = 'https://archive.stsci.edu/cgi-bin/hla' + dataURLs = [] + for dataURI in tqdm.tqdm(dataURIs, total=len(dataURIs)): + dataURL_split = dataURI.split('/') + file_cutout = f'{dataURL_split[3]}&imagename={dataURL_split[4]}' + dataURL = f'{dataURL_base}/{file_cutout}' + dataURLs.append(dataURL) + return dataURLs + + +def extract_cutouts_pool(dataURLs, cpu_count=0): + """ + Extract cutouts from dataURLs using multiprocessing. + + Parameters + ---------- + dataURIs : list + The dataURLs made from the queried sources. + cpu_count : int, default=0 + The number of cpus for multiprocessing. If 0, set to all available cpus. + + Returns + ------- + cutouts : list + The queried sources. 
+ """ + # Pool using a progress bar + if cpu_count == 0: + cpu_count = os.cpu_count() + total = len(dataURLs) + pool = multiprocessing.Pool(processes=cpu_count) + cutouts = list(tqdm.tqdm(pool.imap(fits.getdata, dataURLs), total=total)) + pool.close() + pool.join() + + return cutouts diff --git a/notebooks/WFC3/mast_api_psf/requirements.txt b/notebooks/WFC3/mast_api_psf/requirements.txt new file mode 100644 index 000000000..2a66ff47f --- /dev/null +++ b/notebooks/WFC3/mast_api_psf/requirements.txt @@ -0,0 +1,5 @@ +astropy>=6.0.1 +astroquery>=0.4.7 +matplotlib>=3.8.4 +numpy>=1.23.5 +tqdm>=4.66.2 \ No newline at end of file diff --git a/notebooks/WFC3/point_spread_function.md b/notebooks/WFC3/point_spread_function.md index 775af7fcb..19dd46c47 100644 --- a/notebooks/WFC3/point_spread_function.md +++ b/notebooks/WFC3/point_spread_function.md @@ -1,12 +1,23 @@ -# HST WFC3 Point Spread Function Modeling -Here are brief descriptions of the WFC3 Notebooks for Point Spread Function modeling. +# Point Spread Function (PSF) +Here are brief descriptions of the WFC3 Notebooks for Point Spread Function +modeling. ## HST Point Spread Function -This notebook demonstrates how to generate Point Spread Function (PSF) +This notebook demonstrates how to generate PSF models for WFC3 observations. This includes retrieving empirical models -from the [WFC3 PSF Website](https://www.stsci.edu/hst/instrumentation/wfc3/data-analysis/psf), generating custom models by stacking stars, -and retrieving cutouts from the [MAST PSF Database](https://www.stsci.edu/hst/instrumentation/wfc3/data-analysis/psf/psf-search). The notebook -includes examples of how to generate stellar catalogs, perform PSF fitting -and subtraction, and how to utilize different types of PSF models depending -on the available data and science goals. While the examples are focused -on WFC3, the notebook can also be used with ACS, WFPC2, and other instruments. 
\ No newline at end of file
+from the [WFC3 PSF Website](https://www.stsci.edu/hst/instrumentation/wfc3/data-analysis/psf),
+generating custom models by stacking stars, and retrieving cutouts from the
+[MAST PSF Database](https://www.stsci.edu/hst/instrumentation/wfc3/data-analysis/psf/psf-search).
+The notebook includes examples of how to generate stellar catalogs, perform PSF
+fitting and subtraction, and how to utilize different types of PSF models
+depending on the available data and science goals. While the examples are
+focused on WFC3, the notebook can also be used with ACS, WFPC2, and other
+instruments.
+
+## Downloading WFC3 and WFPC2 PSF Cutouts from MAST
+The WFC3 team annually releases PSF cutouts (i.e., realizations of the PSF) of
+sources detected by [HST1PASS](https://www.stsci.edu/files/live/sites/www/files/home/hst/instrumentation/wfc3/documentation/instrument-science-reports-isrs/_documents/2022/WFC3-ISR-2022-05.pdf)
+in WFC3 and WFPC2 observations. These PSF databases contain over 83 million
+unsaturated and saturated sources, and are publicly available on the
+[MAST Portal](https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html).
+Here, we present custom functions built on the MAST API to query, download, and
+extract PSF cutouts.
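The dataURI-to-dataURL conversion introduced in this diff is pure string manipulation, so it can be sketched standalone. The sketch below mirrors the logic of `make_dataURIs` and `convert_dataURIs_to_dataURLs`; the rootname, source id, and filter values are hypothetical placeholders, not real database entries:

```python
# Build a dataURI the way make_dataURIs does, then convert it to a
# download URL the way convert_dataURIs_to_dataURLs does.
# The rootname ('idle01abq'), id, and filter here are hypothetical.
dataURI_base = 'mast:WFC3PSF/url/cgi-bin/fitscut.cgi'
file_read = 'red=idle01abq_flt[1]'               # rootname_suffix[fits_ext]
cutout = 'size=51&x=100.0&y=200.0&format=fits'   # unsaturated cutout size
file_save = 'idle01abq_12345_F606W_flt_cutout.fits'
dataURI = f'{dataURI_base}?{file_read}&{cutout}/{file_save}'

# Splitting on '/' leaves the fitscut query string at index 3 and the
# output filename at index 4, exactly as the helper assumes.
parts = dataURI.split('/')
dataURL = ('https://archive.stsci.edu/cgi-bin/hla/'
           f'{parts[3]}&imagename={parts[4]}')
print(dataURL)
```

In the helper module, `astropy.io.fits.getdata` is applied directly to such URLs (via `multiprocessing.Pool.imap`) to pull the cutout arrays into memory without writing files to disk.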