This page contains some of the most important overview content from the public data release paper.
Illustris is a suite of large volume, cosmological hydrodynamical simulations run with the moving-mesh code Arepo and including a comprehensive set of physical models critical for following the formation and evolution of galaxies across cosmic time. Each simulates a volume of $(106.5 {\rm Mpc})^3$ and self-consistently evolves five different types of resolution elements from a starting redshift of $z=127$ to the present day, $z=0$. These components are: dark matter particles, gas cells, passive gas tracers, stars and stellar wind particles, and supermassive black holes.
This data release includes the snapshots at all 136 available redshifts, halo and subhalo catalogs at each snapshot, and two distinct merger trees. Six primary realizations of the Illustris volume are released, including the flagship Illustris-1 run. These realizations include three resolution levels with the fiducial "full" baryonic physics model, and a dark matter only analog for each. In addition, there are four distinct, high time resolution, smaller volume "subboxes".
Caption. The most important numerical parameters for the six full volume runs. Gravitational softenings for all particle types other than DM are in comoving kpc (with value equal to that of the DM) until $z=1$, after which they are fixed to their $z=1$ values, such that at $z=0$ their softening length is half that of the DM. $ m_{\rm baryon} $ is the "target gas mass" (i.e. only the mean mass). The number of gas cells equals the $N_{\rm GAS}$ value only in the initial conditions; it then drops as stars and black holes form. Moreover, the total number of baryonic particles (gas cells + star particles + wind particles + black holes) is also not conserved, since gas cells can be refined/de-refined to keep their mass within a factor of 2 of $m_{\rm baryon}$. In contrast, the total numbers of tracers and dark matter particles are both conserved for the duration of the simulation.
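The softening and refinement rules described in the caption can be sketched in a few lines of Python. This is an illustrative reading of the caption, not code from the simulation: the function names are hypothetical, the softening value of 1.0 kpc is a placeholder rather than a value from the table, and "fixed to their $z=1$ values" is assumed to mean fixed in physical units.

```python
def dm_softening_physical(eps_comoving_kpc, z):
    """Physical DM softening: comoving value times the scale factor a = 1/(1+z)."""
    return eps_comoving_kpc / (1.0 + z)


def baryon_softening_physical(eps_comoving_kpc, z):
    """Physical softening of non-DM particles.

    Follows the DM (comoving) value until z=1, then stays frozen at the
    physical value it had at z=1 (assumed interpretation of the caption).
    """
    if z >= 1.0:
        return eps_comoving_kpc / (1.0 + z)
    return eps_comoving_kpc / 2.0  # scale factor a = 0.5 at z=1


def refinement_action(cell_mass, target_mass):
    """Keep gas cell masses within a factor of 2 of the target gas mass."""
    if cell_mass > 2.0 * target_mass:
        return "refine"
    if cell_mass < 0.5 * target_mass:
        return "derefine"
    return "keep"


eps = 1.0  # placeholder comoving softening in kpc, not a table value
ratio_at_z0 = baryon_softening_physical(eps, 0.0) / dm_softening_physical(eps, 0.0)
```

With these definitions `ratio_at_z0` evaluates to 0.5, matching the statement that the non-DM softening at $z=0$ is half that of the DM.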
In the table above we provide an overview of the specifications of the six Illustris runs, including the computational volume, gravitational softening lengths, and masses of the different particle/cell types, which collectively indicate the resolution and dynamic range achieved. To emphasize the variety of galaxy formation and evolution phenomena which can be addressed with the Illustris simulations, in the following figure we give the approximate number of a selection of interesting astrophysical objects that can be found in the simulated box, from dark-matter dominated halos at $z=0$ to luminous active galactic nuclei (AGN) at higher redshifts.
Caption. Overview of the variety of galaxy formation and evolution phenomena accessible in the Illustris simulations. A few classes of interesting objects are listed for each of the four mass components present in the simulation: dark matter, stars, gas, and black holes. These are visualized in the left column, for different volumes and spatial scales, as dark-matter density, stellar light, gas density and gas temperature maps, with black holes denoted as black dots. The approximate number present in the Illustris-1 volume is given (from bottom to top), for a) galaxy clusters at $z=0$ with total mass $M_{200c}> 10^{14} {\rm M}_\odot$; b) Milky Way-like halos at $z=0$ ($ 6 \times 10^{11} < M_{200c}< 2 \times 10^{12} {\rm M}_\odot$); c) gravitationally-bound objects (dark or luminous) resolved with more than a thousand particles at the end of the reionization epoch; d) galaxies at $z=0$ with stellar mass exceeding $10^{10} {\rm M}_\odot$, including both centrals and satellites, from elliptical to disk morphologies; e) satellite galaxies at $z=0$ more massive than the Large Magellanic Cloud (stellar mass $> 1.5 \times 10^9 {\rm M}_\odot$), in any mass host; f) massive, compact galaxies at $z=2$ according to the selection of Barro et al. (2013); g) clusters of galaxies at $z=0$ emitting in the X-rays with luminosity exceeding $10^{42}$ erg/s; h) sources at $z=0$ with neutral hydrogen mass exceeding $5 \times 10^8 {\rm M}_\odot$; i) $10^{12} {\rm M}_\odot$ halos at $z=3$ with at least one damped Lyman-alpha system (HI column density $> 10^{20.3} {\rm cm}^{-2}$) within $50 {\rm kpc}$; j) black holes at $z=0$ more massive than $10^9 {\rm M}_\odot$; k) black-hole merger remnants at $z=0$, i.e. sub-grid black-hole binaries with $M_{\rm BH} > 10^6 {\rm M}_\odot$ for each BH and 1 Gyr delay between the simulation BH merger time and the actual BH merger; l) AGNs at $z=1$ with bolometric luminosity greater than $10^{45}$ erg/s.
A series of analyses based on the Illustris suite have already been performed. These include 1) comparisons to observations and studies of the impact of different feedback models on the distribution and content of gas on large scales, within halos and in the circumgalactic regime; 2) characterizations of the properties of galactic stellar halos, of the satellite populations across host masses, of the star formation histories and of the morphologies and angular-momentum build-up of Illustris galaxies; 3) applications of shock finder algorithms; 4) analyses of the formation of massive, compact galaxies at high redshifts; 5) quantification of the galaxy merger rates; and 6) applications of post-processing radiative transfer algorithms in the study of cosmic reionization. See the up-to-date List of Results for references.
All of the "full physics" Illustris runs contain the following physical components:
For complete details on the behavior, implementation, parameter selection, and validation of these physical models, see Vogelsberger+ (2013) which describes the feedback models, and Torrey+ (2014), which compares the model output with observations from $z=0$ to $z=3$.
The Illustris simulations employ the Arepo code which evolves the equations of continuum hydrodynamics coupled with self-gravity. The spatial discretization of the fluid is provided by an unstructured, moving, Voronoi tessellation. On the volumes defined by individual cells Godunov's method is employed, with a directionally unsplit MUSCL-Hancock scheme and an exact Riemann solver. The Voronoi mesh is generated from a set of control points which move with the local fluid velocity modulo mesh regularization corrections. Gravitational forces are computed using the Tree-PM approach, with long-range forces calculated with a Fourier particle-mesh method, and short-range forces with a hierarchical tree algorithm. The code is second order in space, and with hierarchical adaptive time-stepping, also second order in time. During the simulation we employ a Monte Carlo tracer particle scheme (see Genel+ 2013) to follow the Lagrangian evolution of baryons.
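To make the idea behind the Monte Carlo tracer scheme concrete, the following is a minimal sketch of its core step: when a cell loses a fraction of its mass to a neighbor, each tracer in that cell is moved with a probability equal to that mass fraction. This is a toy illustration with a single outgoing flux and hypothetical function and variable names; it is not the Arepo implementation, and Genel+ 2013 should be consulted for the actual scheme.

```python
import random


def exchange_tracers(tracers_in_cell, cell_mass, outgoing_mass, rng):
    """Split the tracers of one cell into those that stay and those that leave.

    Each tracer leaves independently with probability outgoing_mass / cell_mass,
    so the expected tracer flux follows the mass flux.
    """
    probability = outgoing_mass / cell_mass
    if not 0.0 <= probability <= 1.0:
        raise ValueError("outgoing mass must be between 0 and the cell mass")
    staying, leaving = [], []
    for tracer_id in tracers_in_cell:
        if rng.random() < probability:
            leaving.append(tracer_id)
        else:
            staying.append(tracer_id)
    return staying, leaving


rng = random.Random(42)       # fixed seed for reproducibility
cell_a = list(range(1000))    # toy tracer IDs, all starting in cell A
cell_b = []

# Cell A passes a quarter of its mass to cell B during this step.
cell_a, moved = exchange_tracers(cell_a, cell_mass=1.0, outgoing_mass=0.25, rng=rng)
cell_b.extend(moved)
```

Tracers are only moved between cells, never created or destroyed, which is why their total number is conserved while the number of gas cells is not.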
In terms of both physical models and numerical methods, the Illustris simulations rely on a substantial foundation of previous work. In the following figure we provide an abridged reference tree covering both the physical models and numerical methods. The papers along any given branch are essential for understanding the details and limitations of the data.
Caption. Reference tree for the major components of Illustris, including both numerical methods and physical models. Each paper links to its arXiv or ADS entry (only if viewed at full size). We generally include both models and methods which were directly implemented in Illustris, while entries in the dark subboxes indicate model data inputs.
There are two complementary ways to access the Illustris data products.
These two approaches can be combined. For example, you may be forced to download the full redshift zero group catalog in order to perform a complex search not supported by the API. After locally determining a sample of interesting galaxies, you could then extract their individual merger trees (and/or raw particle data) without needing to download the full simulation merger tree (or a full snapshot).
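As a sketch of the "determine a sample locally" step, the snippet below selects Milky Way-like halos using the mass range quoted in the overview figure caption ($6 \times 10^{11} < M_{200c} < 2 \times 10^{12} {\rm M}_\odot$). The catalog here is a toy list of dictionaries, and the field names `id` and `m200c_msun` are hypothetical; the real group catalogs have their own field names and units, which should be taken from the data documentation.

```python
MW_MASS_MIN_MSUN = 6e11   # lower bound of the Milky Way-like range
MW_MASS_MAX_MSUN = 2e12   # upper bound of the Milky Way-like range


def select_halo_ids(catalog, m_min=MW_MASS_MIN_MSUN, m_max=MW_MASS_MAX_MSUN):
    """Return the IDs of all catalog entries with m_min < M200c < m_max."""
    return [row["id"] for row in catalog if m_min < row["m200c_msun"] < m_max]


# Toy stand-in for a locally downloaded group catalog (values are made up).
toy_catalog = [
    {"id": 0, "m200c_msun": 3.0e14},
    {"id": 1, "m200c_msun": 1.1e12},
    {"id": 2, "m200c_msun": 7.5e11},
    {"id": 3, "m200c_msun": 4.0e10},
]

sample_ids = select_halo_ids(toy_catalog)
```

The resulting list of IDs is the only information that then needs to be passed on when requesting individual merger trees or particle data for the selected objects.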
All of the primary data products for Illustris are released in HDF5 format. This is a portable, self-describing, binary specification suitable for large numerical datasets, for which file access routines are available in all common computing languages. We use only the basic features of the format: groups, attributes, and datasets, with one and two dimensional numeric arrays.
In order to maintain reasonable filesizes, most outputs are split across multiple file "pieces" (or "chunks"). For example, each snapshot of Illustris-1 is split into 512 sequentially numbered files. Individual links to each file chunk are available by selecting a particular simulation on the main data page. Pre-computed sha256 checksums are provided for all files so that their integrity can be verified.
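A downloaded chunk can be checked against its published checksum with a few lines of Python using only the standard library. This is a minimal sketch: the function names are illustrative, and the expected checksum string is assumed to be the hexadecimal sha256 digest copied from the data page.

```python
import hashlib


def sha256_of_file(path, block_size=1 << 20):
    """Compute the sha256 hex digest of a file, reading it in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_chunk(path, expected_hex):
    """Return True if the file's sha256 digest matches the published value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Reading in blocks keeps memory use small even for multi-gigabyte chunks. A chunk for which `verify_chunk` returns `False` should be downloaded again.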
For a getting-started guide and reference see the Example Scripts Documentation (in IDL, Python, and Matlab).
We have implemented a web-based interface (API) which can respond to a variety of user requests and queries. It is a well-defined interface between the user and the Illustris data products, which is expressed in terms of the required input(s) and expected output(s) for each type of request. The provided functionality is independent, as much as possible, from the underlying data structure, heterogeneity, format, and access methods. The API can be used in addition to, or in place of, the download and local analysis of large data files. At a high level, the API allows a user to search, extract, visualize, and analyze. In each case, the goal is to reduce the data response size, either by extracting an unmodified subset, or by calculating a derivative quantity.
By specific example, the following types of requests can be handled through the current API, for any simulation at any snapshot:
For a getting-started guide, cookbook of examples, and API reference see the Web API Documentation (in IDL, Python, and Matlab).
Subhalo Search Form: We provide a simple search form through which users can query the subhalo database. The search capabilities that exist in the API are exposed in a more human-friendly interface, to enable exploration without the need to write code or write URLs by hand. For example, objects can be selected based on total mass, stellar mass, star formation rate, gas metallicity, or size. The output is a familiar spreadsheet type format, which lists properties from the group catalogs. In addition, each subhalo row provides links to a common set of web-based tools for introspection. These include a full listing of all catalog fields, a form for selecting particle types and initiating an extraction of particles from the snapshot, merger tree visualization, and links to pre-rendered images, when available.
Explorer: The Illustris Explorer is an experiment in the visualization, exploration, and dissemination of large data sets -- in particular, those generated by large, astrophysical simulations such as Illustris. It uses the approach of thin-client interaction with derived data products, in this case, pre-computed imagery layered under group catalog information. Rapid search over group properties spatially overlays the results on top of the pre-rendered images. All mass components of the simulation are present: the continuous gas and dark matter fields, stellar light from individual stars, and black holes. We have found the interface particularly useful in exploring the spatial relationships between these four components and the discrete halos and subhalos identified with substructure finding algorithms.
Merger Tree Tool: As a demonstration of the potential of rich client applications built on top of the Illustris API, the above figure shows a snapshot of the currently available interface for interactively exploring the merger trees. A zoomed-in portion of the SubLink tree for the 500th most massive central subhalo of Illustris-1 at z=0 is shown. The tree is vector based, and client side, so each node can be interacted with individually. The informational popup provides a link, back into the API, where the details of the selected progenitor subhalo can be interrogated.
The Illustris Simulations (particularly Illustris-1) have been shown to resolve many details of the small-scale properties of galaxies, as well as the evolution of stars and gas within the cosmic web. Illustris-1 reproduces many observational facts on the demographics and properties of the galaxy populations at various epochs, and on the distribution of gas on large scales. This has been achieved with a comprehensive galaxy formation model which is intended to account for all the primary processes that are believed to be important for the formation and evolution of galaxies.
However, the enormous dynamical range and the variety and complexity of physics phenomena involved in these numerical endeavours necessarily involve some modeling uncertainties. We have identified below the known problems and points of caution in the Illustris simulated output that any user of the public data must be aware of before embarking on the analysis of the released products. These points should be carefully taken into account before advancing scientific conclusions or making comparisons to observational results.
Limitations in the Illustris implementations of the stellar and AGN feedback, and possibly of the adopted star-formation recipe, give rise to a series of issues in the simulated galaxy populations and gas content of halos in comparison to observational constraints. These all point to an inefficient quenching of the star formation in galaxies at different masses and regimes, and in some cases also to qualitatively unrealistic behavior of the feedback models. In particular, the following issues applicable to the highest-resolution realization (Illustris-1) must be noted:
For some items of this list we have intentionally omitted more specific quantifications of the tensions with observations for two reasons: on the one hand, not all observational results agree with each other, making quantitative statements necessarily partial; on the other hand, great care is necessary to properly map simulated variables into observationally-derived quantities.
For example, we notice that the adopted low star-formation density threshold value and the low thermal energy content of galactic winds may be the cause of spurious star formation in the circumgalactic medium around Milky Way-like galaxies, at large distances from the natural, dense sites of star formation activity (i.e. disks, see Marinacci+ 2014a). However, no observational data are available to properly quantify such a phenomenon. Similarly, the impact of the AGN feedback on the dark-matter distribution within Illustris halos might be overestimated, but direct observational constraints are lacking.
Furthermore, while a first analysis of the stellar ages of Illustris galaxies seemed to reveal an overestimation of the predicted stellar ages for $M_\star \lesssim 10^{10.5} {\rm M}_\odot$ galaxies (see Fig. 25 of Vogelsberger+ 2014b), we have now recognized that such a comparison to observations is rather inconclusive, as the shape of the age-mass relation of galaxies strongly depends, in the first place, on whether stellar ages are measured by mass-weighting or light-weighting.
To better inform which features of the simulations should be trusted when making science conclusions, note also the following points more directly related to numerical choices:
To support proper attribution, recognize the effort of individuals involved, and monitor ongoing usage and impact, we request the following:
Any publication making use of data from the Illustris simulations should cite the release paper (Nelson et al. 2015c) as well as the original paper introducing the project (Vogelsberger et al. 2014a). Furthermore, extensive use of the data, or studies of galaxy properties and populations, should cite if appropriate Vogelsberger et al. (2014b) as well as Genel et al. (2014). Any investigation of the black hole population should cite if appropriate Sijacki et al. (2015).
Finally, use of any of the supplementary data products should include the relevant citation. A full and up to date list will be maintained here:
The full snapshots of Illustris-1 are sufficiently large that it will be prohibitive for most users to acquire or store a large number. We note that transferring 1.5 TB (per snapshot) at 10 MB/s will take roughly 42 hours. As a result, projects which require access to the entire snapshot set may benefit from closer interaction with members of the Illustris collaboration. In particular, many team members are open to more direct collaboration, which can include guest access to compute resources which are local to full copies of the data. We welcome ideas for joint projects, so long as they intersect with the interests of collaboration members and do not overlap with existing efforts. In practice, we suggest contacting the author(s) who have already published work using Illustris data on related scientific topics.
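The transfer-time estimate quoted above is simple arithmetic, and the short sketch below reproduces it so that it can be repeated for other file sizes or connection speeds. Decimal units are assumed (1 TB = $10^{12}$ bytes, 1 MB = $10^{6}$ bytes), and the extrapolation to all snapshots assumes, purely for illustration, that every one of the 136 snapshots has the same size.

```python
def transfer_time_hours(size_tb, rate_mb_per_s):
    """Hours needed to transfer size_tb terabytes at rate_mb_per_s megabytes/s."""
    size_bytes = size_tb * 1e12
    rate_bytes_per_s = rate_mb_per_s * 1e6
    return size_bytes / rate_bytes_per_s / 3600.0


per_snapshot_hours = transfer_time_hours(1.5, 10.0)   # about 41.7 hours

# Illustrative extrapolation: all 136 snapshots at the same size and rate.
all_snapshots_days = 136 * per_snapshot_hours / 24.0  # about 236 days
```

At this rate a single snapshot takes just under two days, while the full snapshot set would take most of a year, which is why local access to the data is suggested for such projects.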
We also welcome contributions to the data release. These can take the form of either analysis code, or computed data products:
We anticipate the ongoing release of additional data products, for which further documentation will be provided online:
Rockstar and Consistent-Trees. We plan to release Rockstar group catalogs and the Consistent-Trees merger trees built upon them for the six Illustris boxes in the near future, and will provide further documentation at that time. These group catalogs can include a different subhalo population than identified with the Subfind algorithm, particularly during mergers. The algorithm used to construct the C-Trees also has fundamental differences to both LHaloTree and SubLink. This can provide a powerful comparison and consistency check for any scientific analysis. We also anticipate that some users will simply be more familiar with these outputs, or need them as inputs to other tools.
Additional Supplementary Data Catalogs:
Additional Simulations: Several smaller simulations related to Illustris have been discussed in previous papers, including a series of $25 {\rm Mpc}/h$ boxes with variations on the input feedback parameters. These can be released in the future if there is community interest. Ongoing and future projects, including higher resolution zooms of individual systems, as well as larger volumes, will also be released through this platform in the future.
API Functionality Expansion: There is significant room for the development of additional features in the web-based API, in particular for (i) on-demand visualization tasks, (ii) on-demand analysis tasks, and (iii) client-side, browser-based tools for data exploration and visualization. Examples include (i) requesting an image of projected gas density for a given halo, (ii) requesting a power-law radial slope measurement of a stellar halo or best-fit NFW parameters, and (iii) an interactive 3D representation of the subhalos within a given halo. We welcome community input and direct contributions in any of these directions.