Data access & catalog format
Layout
The catalogs are on the m4943 allocation, one HDF5 file per realization at each redshift:
    /global/cfs/cdirs/m4943/covariance_mocks/
        v1/
            README.md
            catalogs/
                z1.400/    r3000.hdf5 … r4999.hdf5
                z1.700/    …
            metadata/
                manifest_z1.400.csv
                provenance_z1.400.json
catalogs/z<redshift>/r<NNNN>.hdf5 is realization NNNN at that redshift.
z=1.4 is available now (1878 realizations); other redshifts are added as they
complete.
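The naming rule above can be sketched as a small path helper. This is an illustration only: `catalog_path` is a hypothetical function, not part of the package, and it assumes the redshift directory always carries three decimals (as in `z1.400`) and that realization numbers are four digits, zero-padded.

```python
from pathlib import Path

# Root copied from the layout above; everything else here is illustrative.
ROOT = Path("/global/cfs/cdirs/m4943/covariance_mocks/v1")


def catalog_path(redshift: float, realization: int, root: Path = ROOT) -> Path:
    """Return <root>/catalogs/z<redshift>/r<NNNN>.hdf5 (hypothetical helper)."""
    # Assumption: three decimals in the redshift tag, four-digit realization.
    return root / "catalogs" / f"z{redshift:.3f}" / f"r{realization:04d}.hdf5"


path = catalog_path(1.4, 3000)
print(path.as_posix())
# A realization number in the listed range is not guaranteed to be staged,
# so check for the file before opening it.
print(path.exists())
```

Checking `path.exists()` (or consulting the manifest) is safer than assuming every number in the range is present.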
Catalog format
Each file has a galaxies/ group of per-object columns (all length n_galaxies):

| Column | Units | Description |
|---|---|---|
| | Msun/yr | Calibrated star-formation rate. |
| | Msun/yr | Raw star-formation rate. |
| | Msun | Calibrated stellar mass. |
| | Msun | Raw stellar mass. |
| | Msun | Halo peak mass. |
| | Mpc/h | Comoving position, shape |
| | as stored | Peculiar velocity, shape |

Box-level attributes include Lbox (= 500 Mpc/h), z_obs (the redshift),
n_galaxies, phase (e.g. ph3000), and simulation_box
(AbacusSummit_small_c000).
Reading a catalog
    from covariance_mocks.selection import Catalog

    path = "/global/cfs/cdirs/m4943/covariance_mocks/v1/catalogs/z1.400/r3000.hdf5"
    with Catalog.open(path) as cat:
        cat.redshift             # 1.4
        cat.Lbox                 # 500.0
        cat.volume               # Lbox**3 in (Mpc/h)^3
        len(cat)                 # n_galaxies
        cat.available()          # column names
        cat.column("sfr_corr")   # per-object array
        cat.column("x")          # pos[:, 0]
The ensemble n(>SFR) table
NumberDensity with an ensemble threshold uses the
ensemble-averaged cumulative density n(>sfr_corr). Build it from a set of catalogs
with build_ensemble_nsfr() (see Quickstart).
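To make "ensemble-averaged cumulative density" concrete, here is a minimal NumPy sketch of the quantity. It is not the implementation of build_ensemble_nsfr(): the function names are hypothetical, the inequality is assumed to be strict, and synthetic arrays stand in for the `sfr_corr` column of each catalog.

```python
import numpy as np


def cumulative_density(sfr, volume, thresholds):
    """n(>SFR) for one realization: objects above each threshold per volume."""
    sfr_sorted = np.sort(np.asarray(sfr, dtype=float))
    # side="right" counts values <= threshold, so the remainder is strictly above.
    n_at_or_below = np.searchsorted(sfr_sorted, thresholds, side="right")
    return (sfr_sorted.size - n_at_or_below) / volume


def ensemble_density(sfr_per_realization, volume, thresholds):
    """Average n(>SFR) over realizations at each threshold (hypothetical helper)."""
    thresholds = np.asarray(thresholds, dtype=float)
    per_realization = [
        cumulative_density(sfr, volume, thresholds) for sfr in sfr_per_realization
    ]
    return np.mean(per_realization, axis=0)


# Synthetic stand-in for four realizations; real input would be the
# sfr_corr column read from each catalog.
rng = np.random.default_rng(0)
volume = 500.0**3  # Lbox**3 in (Mpc/h)^3
mocks = [rng.lognormal(mean=0.0, sigma=1.0, size=1000) for _ in range(4)]
thresholds = np.array([0.5, 1.0, 5.0])  # Msun/yr

n_mean = ensemble_density(mocks, volume, thresholds)
print(n_mean)
```

Averaging the per-realization densities, rather than pooling all objects, gives each realization equal weight when the object counts differ.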
Metadata
metadata/manifest_z<redshift>.csv lists each staged realization with its byte size
and source. metadata/provenance_z<redshift>.json records the source path,
simulation, data model, and realization count.
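Both metadata files can be read with the standard library. The sketch below assumes only what is stated above: a CSV with a header row and a JSON file whose top level is an object. `load_metadata` is a hypothetical helper, and no column or key names are assumed; the code prints whatever the files contain.

```python
import csv
import json
from pathlib import Path


def load_metadata(metadata_dir, redshift_tag="z1.400"):
    """Return (manifest_rows, provenance) for one redshift (hypothetical helper)."""
    metadata_dir = Path(metadata_dir)
    # Each manifest row becomes a dict keyed by the CSV header.
    with open(metadata_dir / f"manifest_{redshift_tag}.csv", newline="") as fh:
        rows = list(csv.DictReader(fh))
    with open(metadata_dir / f"provenance_{redshift_tag}.json") as fh:
        provenance = json.load(fh)
    return rows, provenance


METADATA = Path("/global/cfs/cdirs/m4943/covariance_mocks/v1/metadata")
if METADATA.is_dir():  # only true on a system where the allocation is mounted
    rows, provenance = load_metadata(METADATA)
    print(len(rows), "staged realizations listed")
    print(list(rows[0]) if rows else "empty manifest")  # actual column names
    print(sorted(provenance))  # actual provenance keys
```

Comparing `len(rows)` with the realization count recorded in the provenance file is a quick consistency check before a long run.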