Materials for the photometric redshift (PZ) project in the MACSS.
You can find accompanying lecture notes here.
These instructions assume that you already have a conda installation on your computer.
If you need to install conda, you can find instructions online, e.g., here.
Open a terminal and do:
conda create -n macss -y python=3.12
conda activate macss
git clone https://github.com/KIPAC/MACSS.git
cd MACSS
pip install -e ".[dev]"
Test the installation and download the data for the examples:
py.test
This will create a directory called macss in your home directory, download the data, and put the data there.
You should check that the data have been successfully downloaded.
If the data did not download, you can download them by hand from https://s3df.slac.stanford.edu/people/echarles/xfer/macss.tgz.
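If you would rather check from Python than by browsing the directory, here is a minimal sketch. It assumes only what is stated above, namely that the data end up in a directory called macss in your home directory. The helper name `list_data_files` is made up for this example, and no particular file names are assumed; the snippet just lists whatever it finds.

```python
from pathlib import Path


def list_data_files(data_dir: Path) -> list[str]:
    """Return the sorted relative paths of all files under data_dir (empty if missing)."""
    if not data_dir.is_dir():
        return []
    return sorted(str(p.relative_to(data_dir)) for p in data_dir.rglob("*") if p.is_file())


# Location taken from the instructions above: a directory called macss in your home directory.
data_dir = Path.home() / "macss"
files = list_data_files(data_dir)
if files:
    print(f"Found {len(files)} files in {data_dir}")
else:
    print(f"No data found in {data_dir}; try the manual download")
```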
To start a Jupyter session in the MACSS/nb directory that you can use to run notebooks:
conda activate macss # if you have not already done so in that shell
jupyter-notebook nb
This project is quite open-ended, but I’d like to at least see everyone:
Time permitting, I encourage you to:
More general information about the project
I encourage you to start a presentation or a Google Slides deck to take notes as you work. This will make it much easier to remember what you did and to write up your work for presentation. You can put figures that you make along the way directly into that document.
I also encourage you to have a quick look at the src/macss area in the MACSS GitHub project: https://github.com/KIPAC/MACSS/tree/main/src/macss.
That area contains a lot of functions that are used in the various examples. It is useful for you to write your own versions of some of those functions to get practice doing this, but if you get stuck, you can always look at those functions for help.
Goals: explore the Rubin catalog data provided, and write some functions that will prepare catalog data for later analysis.
Specifics: The Rubin catalog data provide several different object fluxes, rather than magnitudes, and do not account for the reddening caused by Galactic dust. You will need to:
More information about part 1 of the project
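To make the flux-to-magnitude and dereddening steps concrete, here is a rough sketch, not the reference implementation in src/macss. It assumes the fluxes are in nanojanskys (the usual Rubin convention), so the AB zero point is 31.4. The function names are made up for this example, and the extinction coefficients are approximate, illustrative values that you should check against the course materials.

```python
import numpy as np

# Approximate A_band / E(B-V) coefficients for the six Rubin bands.
# ASSUMPTION: illustrative values only; check them against the course materials.
R_BAND = {"u": 4.81, "g": 3.64, "r": 2.70, "i": 2.06, "z": 1.58, "y": 1.31}


def flux_to_mag(flux_njy):
    """Convert fluxes in nJy to AB magnitudes; non-positive fluxes become NaN."""
    flux = np.asarray(flux_njy, dtype=float)
    mag = np.full(flux.shape, np.nan)
    good = flux > 0
    mag[good] = -2.5 * np.log10(flux[good]) + 31.4
    return mag


def deredden(mag, ebv, band):
    """Remove Galactic extinction: m_dered = m - R_band * E(B-V)."""
    return np.asarray(mag, dtype=float) - R_BAND[band] * np.asarray(ebv, dtype=float)


flux_r = np.array([3631.0, 100.0, -5.0])   # nJy; the last entry mimics a noisy non-detection
ebv = np.array([0.02, 0.05, 0.03])         # made-up E(B-V) values, one per object
mag_r = flux_to_mag(flux_r)
mag_r_dered = deredden(mag_r, ebv, "r")
print(mag_r)
print(mag_r_dered)
```

Negative or zero fluxes have no magnitude, so the sketch returns NaN for them; how you treat those objects later (drop them, or replace them with a limiting magnitude) is a choice worth noting in your write-up.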
Goals: explore the Rubin photometric reference data provided, and understand some of the details, such as the limiting depth and the photometric uncertainties.
Specifics: I have provided you with some prepared photometric reference data, which includes cross-matched objects with known redshifts. You will want to:
More information about part 2 of the project
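As a starting point for looking at uncertainties and depth, here is a small sketch on synthetic numbers. It assumes fluxes and flux errors in nanojanskys, uses the first-order error propagation from flux to magnitude, and takes "5-sigma depth" to mean the magnitude of a source whose flux is five times the median flux error. That is only one rough way to define depth, and the function names are hypothetical.

```python
import numpy as np


def mag_error(flux, flux_err):
    """First-order magnitude uncertainty: sigma_m = (2.5 / ln 10) * sigma_f / f."""
    flux = np.asarray(flux, dtype=float)
    flux_err = np.asarray(flux_err, dtype=float)
    return 2.5 / np.log(10) * flux_err / flux


def five_sigma_depth(flux_err_njy):
    """Magnitude of a source whose flux is 5x the median flux error (fluxes in nJy)."""
    return -2.5 * np.log10(5.0 * np.median(flux_err_njy)) + 31.4


# SYNTHETIC fluxes and errors, standing in for one band of the reference data.
rng = np.random.default_rng(0)
flux_err = rng.normal(20.0, 2.0, size=1000).clip(min=1.0)
flux = rng.uniform(50.0, 5000.0, size=1000)
snr = flux / flux_err
print("median signal-to-noise:", np.median(snr))
print("median magnitude error:", np.median(mag_error(flux, flux_err)))
print("5-sigma depth:", five_sigma_depth(flux_err))
```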
Goals: create a photometric redshift estimator using the scikit-learn toolkit and test it.
Specifics: I have provided you with some prepared photometric reference data, which includes cross-matched objects with known redshifts. You will want to:
More information about part 3 of the project
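If you are not sure where to begin, the sketch below shows the general shape of a scikit-learn regression workflow. The data are synthetic (made-up magnitudes that drift with redshift), the choice of a random forest is just one option among many regressors, and the feature set (adjacent-band colors plus one magnitude) is an assumption rather than a recommendation from the course materials.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# SYNTHETIC stand-in for the reference data: six magnitudes that depend loosely on redshift.
n_obj = 2000
z_true = rng.uniform(0.0, 2.0, size=n_obj)
slopes = np.array([1.5, 1.2, 1.0, 0.8, 0.6, 0.5])   # made-up per-band trends
mags = 22.0 + np.outer(z_true, slopes) + rng.normal(0.0, 0.1, size=(n_obj, 6))

# Features: the five adjacent-band colors plus one magnitude.
features = np.column_stack([mags[:, :-1] - mags[:, 1:], mags[:, 3]])

X_train, X_test, z_train, z_test = train_test_split(
    features, z_true, test_size=0.25, random_state=0
)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, z_train)
z_phot = model.predict(X_test)

# Common photo-z summary statistics, computed on the held-out objects only.
dz = (z_phot - z_test) / (1.0 + z_test)
sigma_nmad = 1.4826 * np.median(np.abs(dz - np.median(dz)))
outlier_frac = np.mean(np.abs(dz) > 0.15)
print(f"sigma_NMAD = {sigma_nmad:.4f}, outlier fraction = {outlier_frac:.4f}")
```

Holding out a test set matters here: scoring the model on the same objects it was trained on would make it look much better than it is.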
Goals: investigate if you can improve your regression model using additional fluxes or morphology information
Specifics: The data that I provided contain some additional information that might be useful in redshift estimation, such as the sizes of the galaxies, and the different flux measurements are sensitive to light from different parts of the galaxies.
You will need to:
More information about part 4 of the project
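One way to judge whether an extra input helps is to compare cross-validated errors with and without it. The sketch below does this on synthetic data in which a hypothetical size feature shrinks with redshift; the feature, the noise levels and the `cv_rmse` helper are all invented for illustration, so only the comparison pattern carries over to the real data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# SYNTHETIC data: two noisy colors that constrain redshift weakly,
# plus an apparent size that shrinks with redshift (hypothetical feature).
n_obj = 1500
z_true = rng.uniform(0.0, 2.0, size=n_obj)
colors = np.column_stack([0.3 * z_true, 0.2 * z_true]) + rng.normal(0.0, 0.15, size=(n_obj, 2))
size = 1.0 / (1.0 + z_true) + rng.normal(0.0, 0.05, size=n_obj)


def cv_rmse(features, target):
    """Root-mean-square error from 5-fold cross-validation."""
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    scores = cross_val_score(model, features, target, cv=5, scoring="neg_mean_squared_error")
    return float(np.sqrt(-scores.mean()))


rmse_colors = cv_rmse(colors, z_true)
rmse_with_size = cv_rmse(np.column_stack([colors, size]), z_true)
print(f"colors only: {rmse_colors:.3f}, colors + size: {rmse_with_size:.3f}")
```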
Goals: characterize how sensitive the model is to imperfect input data
Specifics: The test and training data are somewhat idealized, in that we have selected galaxies that have well-measured redshifts. These galaxies tend to be brighter and spatially isolated from other galaxies.
You will need to:
More information about part 5 of the project
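A simple way to probe sensitivity is to train once on clean data and then evaluate the same model on test sets that have been degraded by different amounts. The sketch below does this with synthetic magnitudes and Gaussian noise; the noise levels, the k-nearest-neighbors regressor and the `make_mags` helper are assumptions chosen for illustration, and real imperfections (blending, fainter samples) are more complicated than added scatter.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(7)

# SYNTHETIC setup: three "magnitudes" with made-up redshift trends.
slopes = np.array([1.5, 1.0, 0.5])


def make_mags(z, noise):
    """Magnitudes for redshifts z with Gaussian scatter of width `noise` (in mag)."""
    return 22.0 + np.outer(z, slopes) + rng.normal(0.0, noise, size=(z.size, slopes.size))


z_train = rng.uniform(0.0, 2.0, size=2000)
z_test = rng.uniform(0.0, 2.0, size=500)
model = KNeighborsRegressor(n_neighbors=10).fit(make_mags(z_train, 0.05), z_train)

# Evaluate the same trained model on test sets with increasing photometric noise.
scatter = {}
for noise in (0.05, 0.2, 0.5):
    dz = (model.predict(make_mags(z_test, noise)) - z_test) / (1.0 + z_test)
    scatter[noise] = 1.4826 * np.median(np.abs(dz - np.median(dz)))
    print(f"test noise {noise:.2f} mag -> sigma_NMAD {scatter[noise]:.3f}")
```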
Goals: go from per-object estimates to estimates of the distribution of p(z) for an ensemble of objects.
Specifics: Up to this point we have been worrying about the redshifts of individual objects. Often, what we actually care about is the distribution of the redshifts of a great many objects. We typically call these n(z) distributions, as opposed to per-object p(z).
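To illustrate the difference between the two, the sketch below builds a per-object p(z) for each of many synthetic objects and averages them into an n(z), alongside a plain histogram of the point estimates. It assumes Gaussian per-object p(z) with a fixed width, which is a simplification, and simple stacking is only one of several ways to estimate n(z), not a statistically rigorous one.

```python
import numpy as np

rng = np.random.default_rng(3)

# SYNTHETIC per-object estimates: a point estimate and a Gaussian width for each object.
n_obj = 5000
z_phot = np.clip(rng.normal(0.8, 0.3, size=n_obj), 0.0, 3.0)
sigma_z = np.full(n_obj, 0.05)

z_grid = np.linspace(0.0, 3.0, 301)
step = z_grid[1] - z_grid[0]

# Per-object p(z): one Gaussian per object, each normalized to unit area on the grid.
pz = np.exp(-0.5 * ((z_grid[None, :] - z_phot[:, None]) / sigma_z[:, None]) ** 2)
pz /= pz.sum(axis=1, keepdims=True) * step

# Ensemble n(z): the average of the per-object p(z), which again integrates to one.
nz_stacked = pz.mean(axis=0)

# For comparison: a normalized histogram of the point estimates alone.
nz_hist, edges = np.histogram(z_phot, bins=30, range=(0.0, 3.0), density=True)
print("integral of stacked n(z):", nz_stacked.sum() * step)
```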
You will want to: