Supplementary Catalogs

The Reconstruction Pipeline

Download

Click a file to download it with your browser, or copy its link and download with wget --content-disposition <THE_LINK>.

Specification

The geometry file "geometry.hdf5" contains a 3-D mask array that indicates the reconstruction volume.

Code Samples

To find out whether or not a given object is in the reconstruction volume, we load the mask array from the geometry file. The mask array is a floating point array whose values are either 0.0 or 1.0. We turn them into a Boolean array:

import h5py
import numpy as np

with h5py.File('supplementary/reconstruction-pipeline/geometry.hdf5', 'r') as f:
    mask = f['Masks/SDSS_reconstruction_area'][()] > 0.99
mask.shape, mask.dtype
Output:
((500, 500, 500), dtype('bool'))
The flagship run has a box size $L_{\rm box} = 500\, h^{-1}{\rm Mpc}$. Hence, the $500^3$ mask array has a grid spacing $l_{\rm grid}=1\,h^{-1}{\rm Mpc}$. To tell whether an object is in the reconstruction volume, we first convert its coordinates to a grid index and then check the mask value at that index. For example, the following code sample checks two points: the position of the Coma cluster and the origin of the box:
l_grid = 500 / mask.shape[0]                # grid size in Mpc/h
x = np.array([
    (332.6301, 318.91406, 63.530796),       # position of Coma
    (0., 0., 0.)                            # position of box origin
])
idx_grid = (x / l_grid + 0.5).astype(int)
mask[tuple(idx_grid.T)]
Output:
array([ True, False])
The output shows that Coma is indeed inside the reconstruction volume, while the box origin is not.
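The same check can be wrapped in a small helper. The sketch below is only an illustration: the function name in_reconstruction_volume is hypothetical, and wrapping out-of-range grid indices periodically is an assumption, not something stated above. A small synthetic mask stands in for the one loaded from geometry.hdf5.

```python
import numpy as np

def in_reconstruction_volume(x, mask, l_box=500.0):
    """Return a Boolean array telling which points of `x` (shape (n, 3),
    in Mpc/h) fall on grid cells where `mask` is True.

    Hypothetical helper; the periodic wrap of indices is an assumption.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    n_grid = mask.shape[0]
    l_grid = l_box / n_grid                        # grid spacing in Mpc/h
    idx = np.floor(x / l_grid + 0.5).astype(int)   # same rounding as the sample above
    idx %= n_grid                                  # wrap, e.g. index n_grid -> 0
    return mask[tuple(idx.T)]

# Synthetic stand-in for the real mask: a 10^3 grid with one cube switched on.
mask_demo = np.zeros((10, 10, 10), dtype=bool)
mask_demo[4:8, 4:8, 4:8] = True
pts = np.array([(250.0, 250.0, 250.0), (0.0, 0.0, 0.0), (499.9, 0.0, 0.0)])
inside = in_reconstruction_volume(pts, mask_demo)
```

With the real data, the Boolean mask array from the geometry file would be passed in place of mask_demo.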

Extended Merger Trees

Download

To download the full tree catalogs, select a catalog and use the shell command to download it. Replace <API_KEY> with your API key.

Specification

Please refer to Yangyao Chen et al. (2023) for the details of the extension algorithm.

To facilitate the common loading pattern of simulation data, the extended version of subhalo merger trees is organized into 125 self-consistent, individual files, each of which contains the trees in a subbox with a side length $L_{\rm subbox} = 100\, h^{-1}{\rm Mpc}$. This is done by partitioning the whole simulation box (of side length $L_{\rm box} = 500\, h^{-1}{\rm Mpc}$) into $5 \times 5 \times 5$ equal-size subboxes, and assigning each subhalo merger tree to a subbox according to the location of its "most massive root subhalo". The extension algorithm is then applied to each subbox, producing 125 files containing the extended version of trees. Subboxes are indexed in row-major order. For example, the subbox with subbox_id=3 covers the spatial range $0 \leqslant x < 100$, $0 \leqslant y < 100$ and $300 \leqslant z < 400$ (in $h^{-1}{\rm Mpc}$).
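The row-major indexing can be illustrated with a short sketch that maps a comoving position (for instance, that of the most massive root subhalo of a tree) to a subbox index and back. The function names subbox_id_of and subbox_range are hypothetical, and the periodic wrap of positions lying slightly outside the box is an assumption.

```python
import numpy as np

L_BOX = 500.0              # box side length in Mpc/h (from the text)
N_SUB = 5                  # subboxes per dimension (from the text)
L_SUBBOX = L_BOX / N_SUB   # 100 Mpc/h

def subbox_id_of(x):
    """Row-major subbox index of comoving positions x, shape (n, 3)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    # Assumption: positions slightly outside [0, L_box) are wrapped periodically.
    ijk = np.floor((x % L_BOX) / L_SUBBOX).astype(int)
    ijk = np.minimum(ijk, N_SUB - 1)   # guard against round-off at the upper edge
    return (ijk[:, 0] * N_SUB + ijk[:, 1]) * N_SUB + ijk[:, 2]

def subbox_range(subbox_id):
    """Lower and upper corners (Mpc/h) of the subbox with the given index."""
    ijk = np.array(np.unravel_index(subbox_id, (N_SUB,) * 3))
    return ijk * L_SUBBOX, (ijk + 1) * L_SUBBOX

ids = subbox_id_of([(50.0, 50.0, 350.0), (450.0, 450.0, 450.0)])
lo, hi = subbox_range(3)
```

Here the z index varies fastest, which reproduces the example of subbox_id=3 given above.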

Available fields for each subhalo are listed below.

Dataset Datatype, Shape
Unit
Description
subhalo_id int64, $n_{\rm subhalo,subbox}$
The index of this subhalo, ranging in $[0, n_{\rm subhalo,subbox})$, where $n_{\rm subhalo,subbox}$ is the total number of subhalos in the subbox. Note that this value is unique only within a given subbox: two subhalos in different subboxes may share the same index.
src_subhalo_id int64, $n_{\rm subhalo,subbox}$
The index of this subhalo in the original (no extension) file. Set to -1 for an extended subhalo.
src_flag int32, $n_{\rm subhalo,subbox}$
A flag indicating how the physical properties of the subhalo are assigned, 0 for "simulated", 1 for "extended" and 2 for "replaced".
snap int32, $n_{\rm subhalo,subbox}$
Snapshot number of this subhalo.
is_kept bool, $n_{\rm subhalo,subbox}$
"true" if the subhalo is a simulated subhalo and its "x_ext" and "v_ext" are kept as its simulated values "x" and "v".
f_pro int64, $n_{\rm subhalo,subbox}$
Index (subhalo_id) of the most massive progenitor subhalo. -1 if none.
n_pro int64, $n_{\rm subhalo,subbox}$
Index (subhalo_id) of the next most massive progenitor subhalo that shares the same descendant. -1 if none.
des int64, $n_{\rm subhalo,subbox}$
Index (subhalo_id) of the descendant subhalo. -1 if none.
main_leaf int64, $n_{\rm subhalo,subbox}$
Index (subhalo_id) of the main leaf subhalo of this subhalo, i.e., the main branch subhalo at highest redshift.
last_pro int64, $n_{\rm subhalo,subbox}$
Index (subhalo_id) of the last progenitor subhalo, i.e., the last subhalo (in DFS order) in the subtree rooted at this subhalo.
f_in_grp int64, $n_{\rm subhalo,subbox}$
Index (subhalo_id) of the most massive subhalo in the same FoF group.
n_in_grp int64, $n_{\rm subhalo,subbox}$
Index (subhalo_id) of the next most massive subhalo in the same FoF group. -1 if none.
m_crit_200 float32, $n_{\rm subhalo,subbox}$
$10^{10}\,h^{-1}M_\odot$
Total mass of the FoF group enclosed in a sphere whose mean density is 200 times the critical density of the Universe at that time. This field is only meaningful for a central subhalo and is set arbitrarily for a satellite subhalo.
m_mean_200 float32, $n_{\rm subhalo,subbox}$
$10^{10}\,h^{-1}M_\odot$
Total mass of the FoF group enclosed in a sphere whose mean density is 200 times the mean density of the Universe at that time. This field is only meaningful for a central subhalo and is set arbitrarily for a satellite subhalo.
m_tophat float32, $n_{\rm subhalo,subbox}$
$10^{10}\,h^{-1}M_\odot$
Total mass of the FoF group enclosed in a sphere whose mean density is $\Delta_c$ times the critical density of the Universe at that time, according to the spherical collapse model of Bryan & Norman (1998). This field is only meaningful for a central subhalo and is set arbitrarily for a satellite subhalo.
r_tophat float32, $n_{\rm subhalo,subbox}$
$h^{-1}{\rm Mpc}$
Comoving radius corresponding to m_tophat. This field is only meaningful for a central subhalo and is set arbitrarily for a satellite subhalo.
m_sub float32, $n_{\rm subhalo,subbox}$
$10^{10}\,h^{-1}M_\odot$
Subhalo mass, i.e., sum of masses of particles bound to this subhalo.
v_max float32, $n_{\rm subhalo,subbox}$
$\rm km/s$
Maximal value of the circular velocity, $\sqrt{\frac{G M_{< r}}{r}}$, where $r$ is the physical distance to the potential minimum of the subhalo and $M_{< r}$ is the total mass enclosed in a sphere of radius $r$. Only particles linked to this subhalo by Subfind are included in the computation.
sub_half_mass float32, $n_{\rm subhalo,subbox}$
$h^{-1}\,{\rm Mpc}$
Physical radius (relative to the potential minimum) that encloses half of the bound mass of the subhalo.
spin float32, ($n_{\rm subhalo,subbox}$, 3)
$h^{-1}{\rm Mpc\,km/s}$
Averaged physical specific angular momentum of particles per axis, computed as the average, over all linked particles, of the cross product of the relative position vector and the relative velocity vector. Position and velocity vectors are computed relative to the center of mass of the subhalo; both are physical values.
vel_disp float32, $n_{\rm subhalo,subbox}$
$\rm km/s$
1-D physical velocity dispersion of all linked particles (i.e., 3-D dispersion divided by $\sqrt{3}$). Hubble flow is included.
len int32, $n_{\rm subhalo,subbox}$
Number of particles linked to this subhalo.
most_bound_pid int64, $n_{\rm subhalo,subbox}$
pid of the most bound particle linked to this subhalo.
x float32, ($n_{\rm subhalo,subbox}$, 3)
$h^{-1}\,{\rm Mpc}$
Simulated comoving spatial position in the periodic box, defined as the position of its most bound particle. May be slightly outside $[0, L_{\rm box})$. For an extended subhalo, set to -1.0e6.
v float32, ($n_{\rm subhalo,subbox}$, 3)
$\rm km/s$
Simulated peculiar velocity of the subhalo, defined as the average velocity of all particles linked to it, multiplied by $\sqrt{a}$. For an extended subhalo, set to -1.0e6.
x_ext float32, ($n_{\rm subhalo,subbox}$, 3)
$h^{-1}\,{\rm Mpc}$
Similar to x, but an extended subhalo is also assigned a meaningful value.
v_ext float32, ($n_{\rm subhalo,subbox}$, 3)
$\rm km/s$
Similar to v, but an extended subhalo is also assigned a meaningful value.

In a given subbox, trees are stored one after another, and subhalos in a given tree are stored in depth-first search (DFS) order. This means that the first progenitor of a subhalo is stored immediately after it. Thus, by using the main_leaf index, the main branch of a subhalo can be easily selected with a Python "slice".
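A minimal sketch of such slicing is given below. The arrays form a small synthetic tree in DFS order whose values are made up for illustration, and the helper names main_branch and subtree are hypothetical. With real data, the arrays would be the datasets of the same names loaded from a subbox file.

```python
import numpy as np

# Synthetic single tree in DFS order (values are made up for illustration).
snap      = np.array([3, 2, 1, 0, 1, 0, 2])
main_leaf = np.array([3, 3, 3, 3, 5, 5, 6])
last_pro  = np.array([6, 5, 3, 3, 5, 5, 6])
m_sub     = np.array([9.0, 7.0, 4.0, 1.0, 2.0, 0.5, 1.5])

def main_branch(i):
    """Slice covering subhalo i and its main-branch progenitors."""
    return slice(i, main_leaf[i] + 1)

def subtree(i):
    """Slice covering subhalo i and all of its progenitors."""
    return slice(i, last_pro[i] + 1)

branch_snaps = snap[main_branch(0)]       # snapshots along the main branch of the root
branch_mass = m_sub[main_branch(4)]       # mass history of a side branch
n_progenitors = len(m_sub[subtree(1)]) - 1
```

Because both selections are contiguous slices, no tree traversal through f_pro or n_pro is needed for these two common queries.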

Additional Catalogs

Extended subhalos at given redshift: We also provide catalogs of subhalos at given redshifts. For example, the catalog at $z=0$ (snapshot 99) for subbox "15" is obtained by selecting all subhalos with snap = 99 in the file of that subbox and writing all fields of the selected subhalos into a new file named "subhalos.snap99.subbox15.hdf5". Note that in the output file, subhalo_id is no longer contiguous, so looking up the subhalos referenced by, e.g., f_in_grp or n_in_grp has to be performed with a hash table (e.g., a Python dict).
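The hash-table look-up might be sketched as follows. The arrays are made-up stand-ins for the datasets of a snapshot catalog, and the names row_of and central_row are hypothetical. The sketch assumes that the subhalo referenced by f_in_grp is present in the same file, which is expected because members of a FoF group share a snapshot.

```python
import numpy as np

# Made-up fields of a snapshot catalog; subhalo_id is not contiguous here.
subhalo_id = np.array([12, 13, 40, 41, 97])
f_in_grp   = np.array([12, 12, 40, 40, 97])
m_sub      = np.array([50.0, 3.0, 20.0, 1.0, 8.0])

# Hash table from subhalo_id to the row number in this file.
row_of = {int(sid): row for row, sid in enumerate(subhalo_id)}

# Row of the most massive subhalo in each subhalo's FoF group.
central_row = np.array([row_of[int(sid)] for sid in f_in_grp])
m_sub_central = m_sub[central_row]
```

For n_in_grp, entries equal to -1 would need to be skipped before the look-up, since -1 is not a key of the table.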

Assembly histories: For the subhalos at a given redshift, we also provide their history information. This is recorded in a separate file named "history.snap{snap_id}.hdf5". Subhalos in a given subbox are stored in a data group named "Subboxes/{subbox_id}" in that file. The number of subhalos and their storage order are exactly the same as those in the file "subhalos.snap{snap_id}.subbox{subbox_id}.hdf5". The following fields are available for each subhalo:

Dataset Datatype, Shape
Unit
Description
subhalo_id int64, $n_{\rm subhalo,subbox,snap}$
The same as the value in the tree file for a given subhalo. $n_{\rm subhalo,subbox,snap}$ is the total number of subhalos in the subbox at this snapshot.
m_peak float32, $n_{\rm subhalo,subbox,snap}$
$10^{10}\,h^{-1}M_\odot$
The peak value of m_sub in the evolution history (i.e., along the main branch).
v_peak float32, $n_{\rm subhalo,subbox,snap}$
$\rm km/s$
The peak value of v_max in the evolution history (i.e., along the main branch).
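To make the definitions of m_peak and v_peak concrete, the sketch below recomputes them from tree arrays by taking the maximum along each main branch. The tree is synthetic with made-up values, the helper name peak_along_main_branch is hypothetical, and including the subhalo itself in the maximum is an assumption. How extended subhalos enter the released values is not specified here.

```python
import numpy as np

# Synthetic tree in DFS order (same layout as the tree files; values made up).
snap      = np.array([3, 2, 1, 0, 1, 0, 2])
main_leaf = np.array([3, 3, 3, 3, 5, 5, 6])
m_sub     = np.array([9.0, 7.0, 12.0, 1.0, 2.0, 0.5, 1.5])
v_max     = np.array([200.0, 210.0, 190.0, 80.0, 90.0, 60.0, 85.0])

def peak_along_main_branch(values, i):
    """Maximum of `values` over subhalo i and its main-branch progenitors."""
    return values[i:main_leaf[i] + 1].max()

target_snap = 2
selected = np.flatnonzero(snap == target_snap)   # subhalos at that snapshot
m_peak = np.array([peak_along_main_branch(m_sub, i) for i in selected])
v_peak = np.array([peak_along_main_branch(v_max, i) for i in selected])
```

The released history files make this loop unnecessary in practice; the sketch only spells out what the peak is taken over.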

Subhalos matching to hydrodynamical simulations: The extension algorithm matches subhalos in the target, low-resolution simulation (S_targ, ELUCID here, with extension) to those in a reference, high-resolution, dark-matter-only simulation (S_ref_dmo, TNG100-1-Dark here) to get assigned physical quantities and positions. In addition, a hydrodynamical simulation with baryonic physics (S_ref_baryonic, TNG100-1 here) can be run under the same initial conditions as S_ref_dmo and populate subhalos in S_ref_dmo with galaxy properties. The combination of these two procedures lights up each subhalo in S_targ with a galaxy modeled by S_ref_baryonic.

We provide a catalog for all subhalos in S_targ with matched indices in S_ref_dmo and S_ref_baryonic. This relies on the matching of subhalos in S_ref_dmo and S_ref_baryonic (see TNG site, the LHaloTree variant). A small number of low-mass subhalos do not have a baryonic match; their matched indices are set to -1. The file named "tng_100_1_matched.snap{snap_id}.hdf5" has data groups named "Subboxes/{subbox_id}" for each subbox of ELUCID, and under each data group the following datasets are stored:

Dataset Datatype, Shape
Unit
Description
subhalo_id int64, $n_{\rm subhalo,subbox,snap}$
subhalo_id in the extended tree catalog.
raw_chunk_id int32, $n_{\rm subhalo,subbox,snap}$
Chunk number, in the range $[0, N_{\rm chunks})$, in the raw (no extension) ELUCID subhalo catalog. -1 if this is an extended subhalo.
raw_subfind_id int32, $n_{\rm subhalo,subbox,snap}$
Index (i.e., array offset, namely Subfind ID) of this subhalo in the raw (no extension) ELUCID subhalo catalog at this snapshot. -1 if this is an extended subhalo.
matched_baryonic_subfind_id int64, $n_{\rm subhalo,subbox,snap}$
Index (i.e., array offset, namely Subfind ID) of the matched subhalo in the reference baryonic simulation. -1 if no match found.
matched_dmo_subfind_id int64, $n_{\rm subhalo,subbox,snap}$
Index (i.e., array offset, namely Subfind ID) of the matched subhalo in the reference dark-matter-only simulation.
matched_dmo_subhalo_id int64, $n_{\rm subhalo,subbox,snap}$
Subhalo ID (i.e., SubLink ID) of the matched subhalo in the tree catalog of the reference dark-matter-only simulation.
is_central bool, $n_{\rm subhalo,subbox,snap}$
True if this is a central subhalo (subhalo_id == f_in_grp).
is_kept bool, $n_{\rm subhalo,subbox,snap}$
See the extended tree catalog.
is_simulated bool, $n_{\rm subhalo,subbox,snap}$
True if this is a simulated subhalo (i.e., not created by the extension).
m_tophat float32, $n_{\rm subhalo,subbox,snap}$
$10^{10}\,h^{-1}M_\odot$
See the extended tree catalog. Here all subhalos in the same FoF halo share the same value.
v_max float32, $n_{\rm subhalo,subbox,snap}$
$\rm km/s$
See the extended tree catalog.
x float32, $(n_{\rm subhalo,subbox,snap},3)$
See the extended tree catalog.
x_ext float32, $(n_{\rm subhalo,subbox,snap},3)$
See the extended tree catalog.
x_j2k float64, $(n_{\rm subhalo,subbox,snap},3)$
R.A., Dec. and redshift in the J2000 observational system converted from x. For an extended subhalo, set to -1.0e6.
x_j2k_ext float64, $(n_{\rm subhalo,subbox,snap},3)$
R.A., Dec. and redshift in the J2000 observational system converted from x_ext.
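The sentinel values in this table (-1 for a missing match, -1.0e6 for a missing simulated position) are typically filtered out with Boolean masks before use. The sketch below does this on made-up rows standing in for the datasets of one "Subboxes/{subbox_id}" group. The variable names has_galaxy and has_sky_position are hypothetical, and the threshold -1.0e5 is an arbitrary choice that separates the sentinel from any valid angle or redshift.

```python
import numpy as np

# Made-up rows standing in for the datasets of one subbox group.
matched_baryonic_subfind_id = np.array([105, -1, 7, 230])
is_central   = np.array([True, False, True, False])
is_simulated = np.array([True, True, False, True])
x_j2k = np.array([
    (150.2, 12.1, 0.031),
    (150.3, 12.0, 0.032),
    (-1.0e6, -1.0e6, -1.0e6),     # extended subhalo: sentinel value
    (201.7, -3.4, 0.054),
])

has_galaxy = matched_baryonic_subfind_id >= 0       # -1 marks "no match found"
has_sky_position = np.all(x_j2k > -1.0e5, axis=1)   # filters out the sentinel

# Subfind IDs to look up in the reference baryonic catalog, centrals only.
central_galaxy_ids = matched_baryonic_subfind_id[has_galaxy & is_central]
ra, dec, z = x_j2k[has_sky_position].T
```

Using x_j2k_ext instead of x_j2k would avoid the second mask, since extended subhalos also carry meaningful values there.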

Code Samples

See the Jupyter notebook extended_subhalo_samples.ipynb.

Semi-Analytical Models

Specification

Semi-analytical galaxies are available. These galaxies are modeled by following the extended (subhalo) merger trees produced by the DMO runs. Each file named sams.snap{snap_id}.hdf5 records the galaxies in one snapshot.

Available fields for each galaxy are listed below, where $n_{\rm g, snap}$ denotes the total number of galaxies at a snapshot.

Dataset Datatype, Shape
Unit
Description
subbox_id int64, $n_{\rm g,snap}$
The index of the subbox in which the merger tree resides.
in_fast_acc bool, $n_{\rm g,snap}$
True if the subhalo is in fast accretion.
is_sat bool, $n_{\rm g,snap}$
True if the subhalo is a satellite subhalo.
m_s float64, $n_{\rm g,snap}$
$10^{10}\,h^{-1}M_\odot$
Stellar mass.
m_s_b float64, $n_{\rm g,snap}$
$10^{10}\,h^{-1}M_\odot$
Bulge stellar mass.
m_s_d float64, $n_{\rm g,snap}$
$10^{10}\,h^{-1}M_\odot$
Disk stellar mass.
sfr float64, $n_{\rm g,snap}$
$10^{10}\,M_\odot/{\rm Gyr}$
Star formation rate.
m_bh float64, $n_{\rm g,snap}$
$10^{10}\,h^{-1}M_\odot$
Central SMBH mass.
dm_bh float64, $n_{\rm g,snap}$
$10^{10}\,h^{-1}M_\odot$
The change of central SMBH mass since the last snapshot. This can be used to evaluate, e.g. SMBH accretion rate.
m_g float64, $n_{\rm g,snap}$
$10^{10}\,h^{-1}M_\odot$
Total gas mass within the galaxy.

Other fields, such as x_ext, v_ext, m_crit200, m_tophat, v_max, ..., are also dumped, with their values taken from the host subhalos (see the list of fields for the extended merger trees).
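Since the masses and star formation rates use different unit conventions, a short sketch of converting them to common units may be helpful. The galaxies below are made up, and the value of H_LITTLE is an assumption that should be replaced by the dimensionless Hubble parameter of the simulation in use.

```python
import numpy as np

H_LITTLE = 0.72   # assumed dimensionless Hubble parameter; replace as appropriate

# Made-up galaxies in catalog units.
m_s = np.array([5.0, 0.2, 0.0])      # 10^10 Msun/h
sfr = np.array([0.1, 0.05, 0.0])     # 10^10 Msun/Gyr

m_s_msun = m_s * 1e10 / H_LITTLE     # stellar mass in Msun
sfr_msun_yr = sfr * 1e10 / 1e9       # star formation rate in Msun/yr

# Specific SFR in 1/yr; galaxies with zero stellar mass get NaN.
ssfr = np.full_like(sfr_msun_yr, np.nan)
np.divide(sfr_msun_yr, m_s_msun, out=ssfr, where=m_s_msun > 0)
```

The same division by the Hubble parameter applies to the other mass fields listed with the unit $10^{10}\,h^{-1}M_\odot$.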

Code Samples

See the Python script example.py.