The merger numbers of the sublink_gal merger trees

Qin PENG
  • 25 Oct

Hi Dylan,

I want to use the sublink_gal merger trees to study the merger histories. I used the il.sublink.numMergers function to calculate the merger numbers. To my surprise, the results are different from the merger history catalog (http://www.tng-project.org/api/TNG100-1/files/merger_history/) which claims to adopt the sublinkgal code (Eisert et al., 2022). For the major mergers (stellar mass ratio > 1/4) of snapshot 099, the results are as below,

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=0,fields=fields,treeName='SubLink_gal')
il.sublink.numMergers(tree,minMassRatio=1/4)
4

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=1,fields=fields,treeName='SubLink_gal')
il.sublink.numMergers(tree,minMassRatio=1/4)
5

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=2,fields=fields,treeName='SubLink_gal')
il.sublink.numMergers(tree,minMassRatio=1/4)
2

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=3,fields=fields,treeName='SubLink_gal')
il.sublink.numMergers(tree,minMassRatio=1/4)
6

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=4,fields=fields,treeName='SubLink_gal')
il.sublink.numMergers(tree,minMassRatio=1/4)
5

However, the results come from the merger history catalog are

f['NumMajorMergersTotal'][:5]
array([7, 8, 3, 7, 5], dtype=uint32)

I noticed that the merger history catalog have added extra constrains (e.g. not double-counting flybys, dealing with numerical artifacts, etc.), so the merger numbers should less than the results come from illustris_python. However, what we see is the oppisite. Could you tell me how the merger numbers are calculated?

I am looking forward to your reply. Thank you.

Greetings.

Lukas Eisert
  • 30 Oct

Hi Qin Peng,

Thank you for highlighting the difference between the merger catalogs and the results from the il.sublink.numMergers function.

It’s worth noting that the code behind the merger catalogs differs from the one used in il.sublink.numMergers. Both methods trace back along the main progenitor branch and count mergers with a mass ratio above a specified threshold. (As you mentioned, the merger catalogs also impose additional constraints.)

After looking into this further, I suspect there may be a difference in how the mass ratio is defined. In the code I used for the merger catalogs in Eisert et al. (2022), the mass ratio is defined at the snapshot where the secondary reaches its maximum mass. For example, if the secondary’s maximum mass occurs at snapshot 76, then the mass ratio is calculated as:

mass ratio = secondary mass at 76 / primary mass at 76

In il.sublink.numMergers (Illustris Python Repository: https://github.com/illustristng/illustris_python/blob/master/illustris_python/sublink.py), it seems that the primary and secondary masses are each measured at their respective maximum mass snapshots. For instance, if the secondary’s maximum mass is at snapshot 76 and the primary’s at snapshot 78, the mass ratio would be calculated as:

mass ratio = secondary mass at 76 / primary mass at 78

If this is indeed the case, the mass ratios calculated by il.sublink.numMergers would generally be smaller than those from the merger catalog code, which could explain the lower count of major mergers. This is my suspicion at this point, I’ll take a closer look to confirm.

I hope this helps for now!

Best,
Lukas

Qin PENG
  • 1
  • 31 Oct

Hi Lukas,

Thanks for your kindly reply.

I am no sure whether your codes are inherited from Rodriguez-Gomez (see Example 7 of https://vrodgom.github.io/merger_trees/examples.html). However, the basic algorithm of this example is not different from il.sublink.numMergers. They both defind the stellar masses of first progenitor and next progenitor as the values at the time of their respective maxima.

Anyway, I have referred the method mentioned in https://www.tng-project.org/data/forum/topic/662/nummerger-in-sublink/ to use the stellar masses of first progenitor and next progenitor at the same time when the mass of the next progenitor is at its maximum (Rodriguez-Gomez et al., 2015). In addition, just like the merger history catalog, I also excluded the first progenitors and next progenitors when they belonged to the same FoF group. My codes are as below,

def RG_maxPastMass(tree, index, partType='stars'):
    """ Get maximum past mass (of the given partType) along the main branch of a subhalo
        specified by index within this tree. """
    ptNum = il.util.partTypeNum(partType)

    branchSize = (tree['MainLeafProgenitorID']).take(index) - (tree['SubhaloID']).take(index) + 1
    masses = tree['SubhaloMassType'][index: index + branchSize, ptNum]

    return np.max(masses), \
           (tree['SnapNum'][index: index + branchSize]).take(np.argmax(masses)), \
           (tree['SubfindID'][index: index + branchSize]).take(np.argmax(masses)), \
           (tree['SubhaloGrNr'][index: index + branchSize]).take(np.argmax(masses))

def RG_massAtSnap(tree, index, snapNum, partType='stars'):
    """ Get mass (of the given partType) along the main branch of a subhalo
    at a specific snap shot. """
    ptNum = il.util.partTypeNum(partType)

    branchSize = (tree['MainLeafProgenitorID']).take(index) - (tree['SubhaloID']).take(index) + 1
    masses = tree['SubhaloMassType'][index: index + branchSize, ptNum]

    snaps = tree['SnapNum'][index: index + branchSize]

    ids = tree['SubfindID'][index: index + branchSize]
    GNs = tree['SubhaloGrNr'][index: index + branchSize]

    if (snapNum in snaps):
        idx = (snaps == snapNum)
        mass_at_snap = masses[idx][0]
        id_at_snap = ids[idx][0]
        GN_at_snap = GNs[idx][0]

    elif (snapNum < np.min(snaps)):
        mass_at_snap = masses[-1]
        # id_at_snap = ids[-1]
        # GN_at_snap = GNs[-1]
        id_at_snap = -1
        GN_at_snap = -1        

    elif (snapNum > np.max(snaps)):
        mass_at_snap = masses[0]
        # id_at_snap = ids[0]
        # GN_at_snap = GNs[0]    
        id_at_snap = -1
        GN_at_snap = -1        

    else:
        ind_inter = np.digitize(snapNum,snaps)
        mass_at_snap = 0.5*(masses.take(ind_inter-1)+masses.take(ind_inter))    
        id_at_snap = -1
        GN_at_snap = -1

    return mass_at_snap, id_at_snap, GN_at_snap

def RG_numMergers(tree, snapNum_0=99, snapNum_i=0, minMassRatio=1e-10, massPartType='stars', index=0, alongFullTree=False):
    """ Calculate the number of mergers, along the main progenitor branch, in this sub-tree
    (optionally above some mass ratio threshold). If alongFullTree, count across the full
    sub-tree and not only along the MPB. """
    # verify the input sub-tree has the required fields
    reqFields = ['SnapNum', 'SubhaloID', 'SubfindID', 'NextProgenitorID', 'MainLeafProgenitorID',
                 'FirstProgenitorID', 'SubhaloMassType', 'SubhaloMassInRadType', 'SubhaloGrNr']

    if not set(reqFields).issubset(tree.keys()):
        raise Exception('Error: Input tree needs to have loaded fields: '+', '.join(reqFields))

    num = 0
    invMassRatio = 1.0 / minMassRatio

    # walk back main progenitor branch
    rootID = (tree['SubhaloID']).take(index)
    fpID   = (tree['FirstProgenitorID']).take(index)

    SnapNum_merger = (tree['SnapNum']).take(index)
    SubfindID_m = (tree['SubfindID']).take(index)


    while fpID != -1:                        
        fpIndex = index + (fpID - rootID)

        if SnapNum_merger <= snapNum_0 and SnapNum_merger > snapNum_i:

            # explore breadth
            npID = (tree['NextProgenitorID']).take(fpIndex)                

            while npID != -1:          

                npIndex = index + (npID - rootID)

                npMass, npSnap, SubfindID_n, SubhaloGrNr_n = RG_maxPastMass(tree, npIndex, massPartType)

                # count if both masses are non-zero, and ratio exceeds threshold

                if npMass > 0.0:

                    fpMass, SubfindID_f, SubhaloGrNr_f = RG_massAtSnap(tree, fpIndex, npSnap, massPartType)

                    if fpMass > 0.0 and SubhaloGrNr_f != SubhaloGrNr_n:

                        ratio = npMass / fpMass

                        if ratio >= minMassRatio and ratio <= invMassRatio:                                        
                            num += 1

                npID = (tree['NextProgenitorID']).take(npIndex)

                # count along full tree instead of just along the MPB? (non-standard)
                if alongFullTree:
                    if (tree['FirstProgenitorID']).take(npIndex) != -1:
                        numSubtree = RG_numMergers(tree, snapNum_0=snapNum_0, snapNum_i=snapNum_i, minMassRatio=minMassRatio, massPartType=massPartType, index=npIndex)
                        num += numSubtree

        fpID = (tree['FirstProgenitorID']).take(fpIndex)

        SnapNum_merger = (tree['SnapNum']).take(fpIndex)
        SubfindID_m = (tree['SubfindID']).take(fpIndex)

    return num

For the mergers of any mass ratio at z=0, the results are as below,

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=0,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
2075

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=1,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
212

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=2,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
251

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=3,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
49

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=4,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
60

What I expected is that these merger numbers should more than the results come from the merger history catalog which are applied more constains. However, the merger numbers come from the merger history catalog are

f['NumMergersTotal'][:10]
array([2269,  257,  326,   67,   80,  132,  231,   54,  138,   95], dtype=uint32)

Could anyone help me check these results and tell me why it is?
Thank you so much.

Greetings.

Lukas Eisert
  • 1
  • 31 Oct

@Qin PENG said:
I am no sure whether your codes are inherited from Rodriguez-Gomez (see Example 7 of https://vrodgom.github.io/merger_trees/examples.html). However, the basic algorithm of this example is not different from il.sublink.numMergers. They both defind the stellar masses of first progenitor and next progenitor as the values at the time of their respective maxima.

As said, the code behind the merger catalogs is different from the one in il.sublink.numMergers and is indeed based on Rodriguez-Gomez+2015.
You can find an older version without the additional constraints here: https://github.com/nelson-group/sublink/blob/master/Examples/merger_history.cpp

In addition, just like the merger history catalog, I also excluded the first progenitors and next progenitors when they belonged to the same FoF group.

Im not sure if i missunderstand you but this is not happening for the merger history catalogs: We also count subhalos which merge within the same FoF group.
Here is the original code of additional constrains we impose:

      //Only proceed if SubhaloFlag is set
      if (!next_prog.data().SubhaloFlag)
        continue;

      //Only proceed if secondary was alive for more than 2 snapshots
      if ((next_prog.snapnum() - next_prog.main_leaf_progenitor().snapnum() <= 1) &&
          (next_prog.snapnum() > 1)){
        continue;   
      }

      //Only proceed if secondary had > 50 stellar particles in at least one previous snapshot (along the main branch)
      if (next_prog.data().SubhaloLenType[4] < min_num_mpb_stellar_particles)
      {

        //Walk along main branch of secondary until a progenitor with > min_num particles is found
        bool too_few_particles = true;
        auto first_next_prog = next_prog.first_progenitor();
        while (first_next_prog.is_valid())
        {

          if(first_next_prog.data().SubhaloLenType[4] >= min_num_mpb_stellar_particles){
            too_few_particles = false;
            break;
            }

          //Iterate to the next progenitor along the scondary branch
          first_next_prog = first_next_prog.first_progenitor();

        }

        if (too_few_particles)
          continue;

      }

I hope this helps
Lukas

Qin PENG
  • 31 Oct

Hi Lukas,

Thanks for your prompt reply.

According to the original code, you really did not classify whether the subhaloes merge within the same FoF. However, what is written in https://www.tng-project.org/data/docs/specifications/ is
`To avoid spurious flyby and re-merger events, mergers are only included when both galaxies can be tracked back to a time when each of them belonged to a different FoF group.'
I am confused whether I properiy understand the meaning. Maybe someone could tell me what is wrong with it.

Thanks.

Greetings.

Lukas Eisert
  • 1
  • 31 Oct

Ah, I understand now what you’re referring to. We only consider mergers where the merging subhalos were in different FoF groups at some point (!) in the past. However, at the time of merging, the two subhalos are allowed to be in the same FoF group.

In the "merger catalog" code (https://github.com/nelson-group/sublink/blob/master/Examples/merger_history.cpp), this is enforced with the following check:

      // Only proceed if infall is well defined
      auto snapnum_infall = infall_pair.second.snapnum();
      if (snapnum_infall == -1)
        continue;

Infall means, that both subhalos become member of the same FoF.

In your code:

npMass, npSnap, SubfindID_n, SubhaloGrNr_n = RG_maxPastMass(tree, npIndex, massPartType)
fpMass, SubfindID_f, SubhaloGrNr_f = RG_massAtSnap(tree, fpIndex, npSnap, massPartType)
if fpMass > 0.0 and SubhaloGrNr_f != SubhaloGrNr_n:

the merger is ignored if the subhalos are in the same FoF group at the time when the secondary reaches its maximum mass. This is a much stronger statement.

Lukas

Qin PENG
  • 1 Nov

Hi Lukas,

Thanks for your help.

I think I have known the constrains you want to impose. However, I do not understand why snapnum_infall == -1 is the mergering subhaloes come from the same FoF. The all elements of tree['SnapNum'] shall in range of [0,99].

Could you explain it? Thank you.

Greeting.

Lukas Eisert

@Qin PENG said:
Hi Lukas,

Thanks for your help.

I think I have known the constrains you want to impose. However, I do not understand why snapnum_infall == -1 is the mergering subhaloes come from the same FoF. The all elements of tree['SnapNum'] shall in range of [0,99].

Could you explain it? Thank you.

Greeting.

Hi!

Infall means that the two subhalos become members of the same FoF halo (and that they have not been members of the same FoF before infall). This infall can happen before, at or after t_max; important is only that an infall has happened in the past.

The line

auto snapnum_infall = infall_pair.second.snapnum();

traces back along the progenitors of the primary/secondary subhalos (i.e. back in time) and returns the snapshot at which the progenitors are not in the same FoF for the first time. This snapshot is the snapshot of the infall and will be in the range of [0,99]. However, if an infall never happened it returns an invalid snapshot of -1.

So

snapnum_infall == -1

means: no infall. And no infall means that primary and secondary have always been in the same FoF.
As said, we do this to exclude false identification of mergers: e.g. if Subfind splits up a subhalo at snapshot n and re-merges it again at snapshot n+1. In this case there would be no infall and we would not count it as a merger.

I hope this clarifies
Lukas

Qin PENG

Hi Lukas,

Thank you very much for your patient explanation.

I have tried my best to reproduce the results of the merger history catalog with illustris_python. My codes are as below,

def RG_maxPastMass(tree, index, partType='stars'):
    """ Get maximum past mass (of the given partType) along the main branch of a subhalo
        specified by index within this tree. """
    ptNum = il.util.partTypeNum(partType)

    branchSize = (tree['MainLeafProgenitorID']).take(index) - (tree['SubhaloID']).take(index) + 1
    masses = tree['SubhaloMassType'][index: index + branchSize, ptNum]

    return np.max(masses), (tree['SnapNum'][index: index + branchSize]).take(np.argmax(masses))

def RG_massAtSnap(tree, index, snapNum, partType='stars'):
    """ Get mass (of the given partType) along the main branch of a subhalo
    at a specific snap shot. """
    ptNum = il.util.partTypeNum(partType)

    branchSize = (tree['MainLeafProgenitorID']).take(index) - (tree['SubhaloID']).take(index) + 1
    masses = tree['SubhaloMassType'][index: index + branchSize, ptNum]

    snaps = tree['SnapNum'][index: index + branchSize]

    if (snapNum in snaps):
        idx = (snaps == snapNum)
        mass_at_snap = masses[idx][0]

    elif (snapNum < np.min(snaps)):
        mass_at_snap = masses[-1]   

    elif (snapNum > np.max(snaps)):
        mass_at_snap = masses[0]   

    else:
        ind_inter = np.digitize(snapNum,snaps)
        mass_at_snap = 0.5*(masses.take(ind_inter-1)+masses.take(ind_inter))    

    return mass_at_snap

def RG_PastSnaps(tree, index):

    branchSize = (tree['MainLeafProgenitorID']).take(index) - (tree['SubhaloID']).take(index) + 1

    snaps = tree['SnapNum'][index: index + branchSize]
    GNs = tree['SubhaloGrNr'][index: index + branchSize]

    return snaps, GNs

def RG_infall(fpSnaps, fpGNs, npSnaps, npGNs):

    fpSnaps_in = fpSnaps[np.isin(fpSnaps,npSnaps)]

    fpGNs_in = fpGNs[np.isin(fpSnaps,npSnaps)]
    npGNs_in = npGNs[np.isin(npSnaps,fpSnaps)]

    if len(fpGNs_in[fpGNs_in == npGNs_in]) == len(fpGNs_in) and len(fpGNs_in) != 0:
        infall_judge = False

    else:
        infall_judge = True

    return infall_judge

def RG_numMergers(tree, snapNum_0=99, snapNum_i=0, minMassRatio=1e-10, massPartType='stars', index=0, alongFullTree=False):
    """ Calculate the number of mergers, along the main progenitor branch, in this sub-tree
    (optionally above some mass ratio threshold). If alongFullTree, count across the full
    sub-tree and not only along the MPB. """
    # verify the input sub-tree has the required fields
    reqFields = ['SnapNum', 'SubhaloID', 'SubfindID', 'NextProgenitorID', 'MainLeafProgenitorID', 'SubhaloLenType',
                 'FirstProgenitorID', 'SubhaloMassType', 'SubhaloMassInRadType', 'SubhaloGrNr']

    if not set(reqFields).issubset(tree.keys()):
        raise Exception('Error: Input tree needs to have loaded fields: '+', '.join(reqFields))

    num = 0
    invMassRatio = 1.0 / minMassRatio

    # walk back main progenitor branch
    rootID = (tree['SubhaloID']).take(index)
    fpID   = (tree['FirstProgenitorID']).take(index)

    SnapNum_merger = (tree['SnapNum']).take(index)
    SubfindID_m = (tree['SubfindID']).take(index)
    SubhaloFlag_m = (il.groupcat.loadSubhalos(basePath,SnapNum_merger,fields='SubhaloFlag')).take(SubfindID_m)

    mlpSnap = tree['SnapNum'][tree['SubhaloID'] == (tree['MainLeafProgenitorID']).take(index)][0]

    sps = tree['SubhaloLenType'][:,4]

    min_num_mpb_stellar_particles = 50

    while fpID != -1 and SubhaloFlag_m:
        fpIndex = index + (fpID - rootID)

        fpSFID = (tree['SubfindID']).take(fpIndex)
        fpSnap = (tree['SnapNum']).take(fpIndex)

        fpSubhaloFlag = (il.groupcat.loadSubhalos(basePath,fpSnap,fields='SubhaloFlag')).take(fpSFID)

        fpSnaps, fpGNs = RG_PastSnaps(tree, fpIndex)

        if SnapNum_merger <= snapNum_0 and SnapNum_merger > snapNum_i and fpSubhaloFlag:

            # explore breadth
            npID = (tree['NextProgenitorID']).take(fpIndex)                

            too_few_particles = True

            np_id = npID

            while np_id != -1:

                npsp = sps[tree['SubhaloID'] == np_id][0]

                if npsp >= min_num_mpb_stellar_particles:
                    too_few_particles = False
                    break

                np_id = (tree['FirstProgenitorID']).take(index + (np_id - rootID))

            if npID != -1 and too_few_particles == False:

                npSFID = (tree['SubfindID']).take(index + (npID - rootID))
                SnapNum_n = (tree['SnapNum']).take(index + (npID - rootID))

                npSubhaloFlag = (il.groupcat.loadSubhalos(basePath,SnapNum_n,fields='SubhaloFlag')).take(npSFID)

                while npID != -1 and ((SnapNum_n-mlpSnap >= 2) or (SnapNum_n <= 1)) and npSubhaloFlag:            

                    npIndex = index + (npID - rootID)

                    npSnaps, npGNs = RG_PastSnaps(tree, npIndex)

                    infall_judge = RG_infall(fpSnaps, fpGNs, npSnaps, npGNs)

                    if infall_judge == True:

                        npMass, npSnap = RG_maxPastMass(tree, npIndex, massPartType)

                        if npMass > 0.0:    

                            fpMass = RG_massAtSnap(tree, fpIndex, npSnap, massPartType)

                            if fpMass > 0.0:

                                ratio = npMass / fpMass

                                if ratio >= minMassRatio and ratio <= invMassRatio:                                        
                                    num += 1

                    npID = (tree['NextProgenitorID']).take(npIndex)

                    # count along full tree instead of just along the MPB? (non-standard)
                    if alongFullTree:
                        if (tree['FirstProgenitorID']).take(npIndex) != -1:
                            numSubtree = RG_numMergers(tree, snapNum_0=snapNum_0, snapNum_i=snapNum_i, minMassRatio=minMassRatio, massPartType=massPartType, index=npIndex)
                            num += numSubtree

        fpID = (tree['FirstProgenitorID']).take(fpIndex)

        SnapNum_merger = (tree['SnapNum']).take(fpIndex)
        SubfindID_m = (tree['SubfindID']).take(fpIndex)
        SubhaloFlag_m = (il.groupcat.loadSubhalos(basePath,SnapNum_merger,fields='SubhaloFlag')).take(SubfindID_m)

        mlpSnap = tree['SnapNum'][tree['SubhaloID'] == (tree['MainLeafProgenitorID']).take(fpIndex)][0]

    return num

However, I still faild.

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=0,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
1294

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=1,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
87

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=2,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
122

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=3,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
9

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=4,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree)
24
tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=0,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree,minMassRatio=1/4)
5

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=1,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree,minMassRatio=1/4)
3

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=2,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree,minMassRatio=1/4)
3

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=3,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree,minMassRatio=1/4)
2

tree = il.sublink.loadTree(basePath=basePath,snapNum=snapNum,id=4,fields=fields,treeName='SubLink_gal')
RG_numMergers(tree,minMassRatio=1/4)
2

Maybe I did not realize some important things or misunderstand your C++ codes. Sorry for bothering you again.

Greetings.

  • Page 1 of 1