Downloading a Subvolume of Snapshot Data Using the API (or Lab)
Victor Roberto Soares da Silva
6 Feb
Hi Dylan,
I am aware that the web-based API allows users to load only specific fields from a snapshot. For instance, I am currently analyzing the spatial distribution of neutral hydrogen abundance using the TNG50-1 simulation. I adapted task 11 from the API cookbook to download only the necessary data instead of the entire snapshot.
However, even when downloading just a few fields, the high resolution of TNG50-1 makes the file sizes quite large. I am wondering if it is possible to use the API to load only a subsection of the simulation volume (for example, a 10 Mpc sub-box within the full 50 Mpc box). I have not been able to find any documentation or examples addressing this scenario.
Could you please advise if there is any mechanism available to download data of cells (not halos or subhalos) for a smaller region, ideally to reduce the local storage requirements? As JupyterLab does not seem to support high-resolution boxes, I would need to perform this operation locally.
Thank you in advance for your help.
Dylan Nelson
6 Feb
There isn't any API functionality to do this. The reason is that it is a demanding task that requires loading the entire snapshot (at least once).
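However, you could easily do this yourself in the Lab. For example, something along the lines of the following sketch (the sub-box bounds are placeholders):
import glob, os
import h5py
import numpy as np

path = 'sims.TNG/TNG50-1/output/snapdir_099/'
outpath = 'cutout/'
os.makedirs(outpath, exist_ok=True)
lo, hi = np.zeros(3), np.full(3, 10000.0)  # placeholder sub-box bounds, in the units of Coordinates
fields = ['Density', 'NeutralHydrogenAbundance']  # add others of interest

for file in glob.glob(path + '*.hdf5'):  # glob returns the full path of each chunk
    with h5py.File(file, 'r') as infile, h5py.File(outpath + os.path.basename(file), 'w') as outfile:
        if 'PartType0' not in infile:
            continue  # this chunk holds no gas cells
        pos = infile['PartType0/Coordinates'][()]
        w = np.where(np.all((pos >= lo) & (pos < hi), axis=1))[0]  # cells inside the sub-region
        for field in fields:
            outfile['PartType0/' + field] = infile['PartType0/' + field][()][w]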
You can finish this, but it would create a "copy" of the snapshot containing only a few fields of interest, and only within a spatial subregion. This works easily in the Lab: because each file chunk is processed separately, the memory usage is never large.
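One detail the sketch above leaves open is the "inside desired sub-region" test when the sub-box straddles the edge of the periodic simulation volume. The following is a minimal sketch of one way to write that test, not part of the original reply. The function name subbox_mask and its parameters are hypothetical, and the numbers in the example are made up rather than TNG values. It assumes the box size is known in the same units as the coordinates (in the snapshot files it should be readable from the Header group's BoxSize attribute).

```python
import numpy as np

def subbox_mask(pos, center, half_size, box_size):
    """Boolean mask of points inside a cubic sub-box of a periodic volume.

    pos: (N, 3) array of coordinates; center: length-3 sequence;
    half_size and box_size: scalars in the same units as pos.
    (Hypothetical helper, for illustration only.)
    """
    pos = np.asarray(pos, dtype=np.float64)
    center = np.asarray(center, dtype=np.float64)
    # Signed offset from the center, wrapped into [-box_size/2, box_size/2)
    d = (pos - center + 0.5 * box_size) % box_size - 0.5 * box_size
    # Keep points whose offset is within half_size along every axis
    return np.all(np.abs(d) <= half_size, axis=1)

# Toy example with made-up numbers: a sub-box centred on a corner of the volume
pos = np.array([[1.0, 1.0, 1.0], [99.0, 99.0, 99.0], [50.0, 50.0, 50.0]])
mask = subbox_mask(pos, center=[0.0, 0.0, 0.0], half_size=5.0, box_size=100.0)
```

The resulting mask can be passed to np.where, or used directly as a boolean index, in place of the simple lower/upper bound comparison. For a sub-box that lies entirely inside the volume the two approaches select the same cells.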