When I try to load the TNG100-1 data with your Python script, the program gets killed because my machine does not have enough memory.
For example:
gas_pos = il.snapshot.loadSubset(basePath,snapNum,'gas','Coordinates')
Killed
Is there any way to load the data in batches?
Dylan Nelson
9 Jul '19
Hi,
Yes, the il.snapshot.loadSubset() function takes a subset keyword; I added this mostly without documentation. You can use it in a loop to load e.g. 1% of the particles at a time.
I construct the subset like this, and then pass it to the load function:
import numpy as np
import h5py
import illustris_python as il

indRange = [0, 500000]  # inclusive particle index range

if indRange is not None:
    # load a contiguous chunk by making a subset specification in analogy to the group-ordered loads
    # (sP, ptNum() and snapOffsetList() are helpers from my own code; the latter is shown below)
    subset = { 'offsetType'  : np.zeros(sP.nTypes, dtype='int64'),
               'lenType'     : np.zeros(sP.nTypes, dtype='int64'),
               'snapOffsets' : snapOffsetList(sP) }

    subset['offsetType'][ptNum(partType)] = indRange[0]
    subset['lenType'][ptNum(partType)]    = indRange[1] - indRange[0] + 1

data = il.snapshot.loadSubset(basePath, snapNum, partType, fields, subset=subset)
def snapOffsetList(sP):
    """ Make the offset table (by type) for the snapshot files, to be able to quickly determine
        within which file(s) a given offset+length will exist.
        Note: I cache these results to disk for speed. """
    nChunks = snapNumChunks(sP.simPath, sP.snap, sP.subbox)
    snapOffsets = np.zeros( (sP.nTypes, nChunks), dtype='int64' )

    for i in np.arange(1, nChunks+1):
        f = h5py.File( snapPath(sP.simPath, sP.snap, chunkNum=i-1, subbox=sP.subbox), 'r' )

        if i < nChunks:
            for j in range(sP.nTypes):
                snapOffsets[j,i] = snapOffsets[j,i-1] + f['Header'].attrs['NumPart_ThisFile'][j]

        f.close()

    return snapOffsets
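To actually load in batches, you could wrap the above in a loop over index ranges. Here is a minimal sketch, assuming the snippet and helpers above (sP, ptNum(), snapOffsetList()); the names nPartTotal, nBatches and edges are just illustrative and not part of illustris_python:

# sketch: loop over inclusive index ranges covering all gas cells, ~1% at a time
partType   = 'gas'
nPartTotal = 1820**3        # total number of gas cells in TNG100-1 (assumed known, e.g. from the header)
nBatches   = 100            # ~1% of the particles per iteration
edges      = np.linspace(0, nPartTotal, nBatches+1, dtype='int64')

offsets = snapOffsetList(sP)   # compute the offset table once, reuse it for every batch

for iBatch in range(nBatches):
    indRange = [edges[iBatch], edges[iBatch+1] - 1]   # inclusive particle index range for this batch

    subset = { 'offsetType'  : np.zeros(sP.nTypes, dtype='int64'),
               'lenType'     : np.zeros(sP.nTypes, dtype='int64'),
               'snapOffsets' : offsets }

    subset['offsetType'][ptNum(partType)] = indRange[0]
    subset['lenType'][ptNum(partType)]    = indRange[1] - indRange[0] + 1

    gas_pos = il.snapshot.loadSubset(basePath, snapNum, partType, ['Coordinates'], subset=subset)

    # ... process gas_pos here, so it can be freed before the next batch ...

With nBatches = 100, each iteration holds only ~1.4 GB of coordinates at a time, well below the memory limit mentioned below.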
Note that the default memory limit of the JupyterLab instances is 10GB, so loading the positions of all the gas for TNG100-1 (at once) is too much: this would require 1820^3 * 3 * 8 bytes ≈ 135 GB of memory.
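For reference, the arithmetic behind that estimate, as a quick sanity check (three float64 coordinates per cell):

nGas = 1820**3                   # total gas cells in TNG100-1
bytesNeeded = nGas * 3 * 8       # 3 coordinates per cell, 8 bytes per float64
print(bytesNeeded / 1024.0**3)   # -> ~135, i.e. ~135 GB, far above the 10GB JupyterLab limit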