Simple question about retrieving simulated 'galaxy catalogues'
Karina Caputi
7 May '21
I'm starting to play with your wonderful simulations and, as an observational
astronomer who knows very little about cosmological simulations, I am a bit
overwhelmed. So I have a pretty basic question, and I would very much
appreciate your help.
I simply would like to retrieve 'galaxy catalogues' corresponding to TNG100.
I'm using the Public Data Access tool, which seems very handy. I select the
simulation, snapshot and define my galaxy selection criterion. And then a list
of subhaloes appears, each of which is associated with a galaxy (if not dark),
if I understand correctly. However, only 100 subhaloes appear on the screen
at the same time, so I don't know how to retrieve the info for all the entries
in one go.
So here is my naive question: is there any way of redirecting the output, with
all the subhalo entries, to a file that I can save to my local disk? If possible, I
would like the output in the format that I see on the screen, which is very
simple and just enough for my purposes.
I know that you can download whole snapshots, but I'd rather not, as
they contain much more information than I really need and would
overcomplicate my work unnecessarily.
Any advice you can give me will be very welcome.
Thanks a lot in advance. Best wishes,
Karina Caputi.
Dylan Nelson
10 May '21
Hello Karina,
There are two options right now, and possibly a third:
(1) You can follow the API getting started guide. This shows how to use Python to get a list of subhalos (as you say, galaxies). Look at "Task 2" at the bottom for an example of a simple search. The results here are also paginated, so you need to download each "page" of results, as described.
(2) For more flexibility, you can download the entire group catalog. This isn't the same as downloading the entire snapshot; it is much smaller, being just a catalog of halos and subhalos. You can then search and load the properties you are interested in directly, without using the website anymore. The top "Group Catalogs" section of the Scripts Getting Started guide describes this. Note that if these files are still too big, you can download just individual fields from the group catalogs (if you only need to search or look at a few, this may be the way to go), for example just the "SubhaloSFR" field.
(3) If you can describe in more detail what you mean by "redirecting the output", I may be able to add something like this to the "Search Galaxy/Halo Catalogs" tool.
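As a rough illustration of option (1), the sketch below walks through the paginated results page by page. It assumes the search response is a JSON object with `results` and `next` keys and that the API key is sent in an `api-key` header, as in the API getting started guide. The function names, the `YOUR_API_KEY` placeholder and the example `mass__gt` filter are illustrative and not taken from this thread.

```python
import json
import urllib.parse
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder: replace with your own key


def get_json(url, params=None):
    """Fetch one API URL and return the decoded JSON response."""
    if params:
        url = url + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url, headers={"api-key": API_KEY})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_subhalos(first_url, params=None, fetch=get_json):
    """Yield every entry of a paginated search by following 'next' links."""
    url = first_url
    while url:
        page = fetch(url, params)
        yield from page["results"]
        url = page.get("next")
        params = None  # assumed: the 'next' link already includes the query


# Example use (needs network access and a valid key; the filter is illustrative):
# url = "http://www.tng-project.org/api/TNG100-1/snapshots/99/subhalos/"
# catalogue = list(iter_subhalos(url, {"mass__gt": 10.0}))
```

The `fetch` argument is there only so the loop can be tried offline with a stand-in function before pointing it at the real service.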
Karina Caputi
12 May '21
Many thanks for your suggestions, Dylan.
After a few tries, I'm working with your option (1). I'm overcoming the pagination issue simply by using the 'limit' parameter in the get(url) call.
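One way the 'limit' approach might look is sketched here: ask for a single entry to read the total number of matches, then request them all in one page. This assumes the response carries a `count` field and that the server accepts a `limit` as large as that count, neither of which is confirmed in this thread. `fetch` stands for whatever request helper is already in use (for example the guide's get function), and `fetch_all_at_once` is a hypothetical name.

```python
def fetch_all_at_once(search_url, filters, fetch):
    """Return all matching entries in a single page by setting 'limit'.

    fetch(url, params) must return the decoded JSON response as a dict.
    """
    # Probe with limit=1 just to learn how many entries match the filters.
    probe = fetch(search_url, dict(filters, limit=1))
    total = probe["count"]  # assumed field holding the total number of matches
    if total == 0:
        return []
    # Second request: one page that is large enough to hold every match.
    page = fetch(search_url, dict(filters, limit=total))
    return page["results"]
```

For very long catalogues a single large request is more exposed to a dropped connection than page-by-page access.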
For long catalogues, I found that there is a risk of the connection being interrupted in the middle of the download, which forces one to start again. To avoid this problem, I'm now using two Python scripts: a first one that sequentially downloads all the sub-halo URLs, which is very quick to run, and a second one that gets each URL. If this second run is interrupted, I can easily identify the last URL read and restart my script from there.
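The second of those two scripts could be made to restart automatically along the following lines. This is only a sketch of the idea, not the scripts actually used here: it assumes the list of sub-halo URLs has already been collected, writes one JSON line per sub-halo to a local file, and on restart skips any URL already present in that file. The names `load_done` and `download_all` and the JSON Lines output format are illustrative choices.

```python
import json
import os


def load_done(out_path):
    """Return the set of URLs already saved in the output file."""
    done = set()
    if os.path.exists(out_path):
        with open(out_path) as f:
            for line in f:
                try:
                    done.add(json.loads(line)["url"])
                except (ValueError, KeyError, TypeError):
                    continue  # blank or half-written line from an interrupted run
    return done


def ends_with_newline(path):
    """True if the file is missing, empty, or ends with a newline."""
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        return True
    with open(path, "rb") as f:
        f.seek(-1, os.SEEK_END)
        return f.read(1) == b"\n"


def download_all(urls, out_path, fetch):
    """Fetch each URL not yet saved and append the result as one JSON line."""
    done = load_done(out_path)
    needs_newline = not ends_with_newline(out_path)
    with open(out_path, "a") as f:
        if needs_newline:
            f.write("\n")  # close off a partial line left by a crash
        for url in urls:
            if url in done:
                continue
            record = {"url": url, "data": fetch(url)}
            f.write(json.dumps(record) + "\n")
            f.flush()  # keep the file current in case of interruption
```

Running `download_all` again with the same arguments after an interruption fetches only the URLs that are still missing.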
In the future, if you could add a button to download the basic catalogue seen on the screen in the subhalo query web form, with all its entries (without pagination), it would be very useful for the most naive users like me, who only want to retrieve galaxy properties. By 'redirecting the output' I just meant being able to download and save locally the entire sub-halo catalogue obtained with the conditions specified on the web form.
Thanks again!
Best wishes,
Karina.