Interactively exploring a DST

Learning Objectives

  • Open a DST in an interactive python session

  • Print all nodes in a DST

  • Explore the contents of the TES

  • Inspect a track

  • Inspect a stripping location

Data is stored in files called DSTs, which are processed by DaVinci to make nTuples. However you can also explore them interactively from a Python session.

This is particularly useful if you want to quickly find something out, or the more complex processing in DaVinci is not working as expected.

The file we downloaded from the grid contains simulated data, with stripping and trigger decisions and so on. In this lesson we assume the file you downloaded is called 00070793_00000001_7.AllStreams.dst.

To take a look at the contents of the TES, we need to write a small Python file:

import sys

import GaudiPython as GP
from GaudiConf import IOHelper
from Configurables import DaVinci

dv = DaVinci()
dv.DataType = '2016'
dv.Simulation = True

# Retrieve file path to open as the last command line argument
inputFiles = [sys.argv[-1]]
IOHelper('ROOT').inputFiles(inputFiles)

appMgr = GP.AppMgr()
evt = appMgr.evtsvc()

appMgr.run(1)

We first configure the application using the DaVinci configurable, set up the input data, and then create the application manager and tell it to start running.

Place this into a file called explore.py and run the following command in a new terminal:

$ lb-run --ext=ipython DaVinci/v45r8 ipython -i explore.py 00070793_00000001_7.AllStreams.dst

This will configure the application, initialise it, and put you in a Python prompt. We are now ready to explore the TES, which is accessible via the evt variable. We can start by printing out some of the TES locations which exist for this event.

>>> evt.dump()

You could look at the properties of some tracks in the first event by typing inside the Python session:

>>> tracks = evt['/Event/Rec/Track/Best']
>>> print(tracks[0])

The next question is, how do you know what TES locations that could exist? As we saw evt.dump() prints a few of them, but not all, including /Event/Rec/Track/Best.

We can use the directory-like structure of the TES to walk through the hierachy from the root node.

>>> root = evt['/Event']
>>> evt.leaves(root)
<ROOT.vector<IRegistry*> object at 0x11d1deb0
>>> root_children = evt.leaves()
>>> root_children.size()
23L
>>> root_children[0]
<ROOT.DataSvcHelpers::RegistryEntry object at 0x126576b0>
>>> root_children[0].identifier()
'/Event/Gen'

So from the root node we can find the locations of the children! We can put this into a little recursive function that walks through all children of the root, all the children of each child, and so on.

def nodes(evt, root='/Event'):
    """List all nodes in `evt` starting from `node`."""
    node_list = [root]
    for leaf in evt.leaves(evt[root]):
        node_list += nodes(evt, leaf.identifier())
    return node_list

The easiest way to use this is to add it to your explore.py script and re-run as before. Then, in your IPython session, enter nodes(evt).

This will list a large number of TES locations which are present in the input file. But it still doesn’t show /Event/Rec/Track/Best! Why not?

This is because locations can be created on demand, only created when we first try to access them. Such locations are typically ‘packed’ in the input file, for example: /Event/pRec/Track/Best. The objects inside these packed locations get ‘unpacked’ to some other location, which happens auto-magically when you try to access the unpacked location. Often you can just try removing the small p from the location (/Event/Rec/Track/Best) to trigger the unpacking. Other times you have to make sure the application is correctly configured to enable unpacking.

You can also inspect the particles and vertices built by your stripping line. However not every event will contain a candidate for your line, so the second tool we need is something that will advance us through events until the stripping decision was positive:

def advance(decision):
    """Advance until stripping decision is true, returns
    number of events by which we advanced"""
    n = 0
    while evt['/Event']:
        reports = evt['/Event/Strip/Phys/DecReports']
        if reports.hasDecisionName('Stripping{}Decision'.format(decision)):
            break

        appMgr.run(1)
        n += 1
    return n

Add this to your script and restart ipython as before.

Detecting file ends

To avoid an infinite loop, we check that we still have events to process by seeing if evt['/Event'] exists. This root location is present in every event in the input file; if it doesn’t exist, we’ve run out of events to process.

Using the name of our stripping line we can now advance through the DST until we reach an event which contains a candidate:

>>> line = 'D2hhPromptDst2D2KKLine'
>>> advance(line)

The candidates built for you can now be found at /Event/AllStreams/Phys/D2hhPromptDst2D2KKLine/Particles:

>>> candidates = evt['/Event/AllStreams/Phys/{}/Particles'.format(line)]
>>> print(candidates.size())

This tells you how many candidates there are in this event and you can access the first one with:

>>> print(candidates[0])

Which will print out some information about the Particle. In our case a \(` D^{* +} `\) (particle ID number 413). You can access its decay products with candidates[0].daughtersVector()[0] and candidates[0].daughtersVector()[1], which will be a \(` D^{0} `\) and a \(` \pi^{+} `\).

There is a useful tool for printing out decay trees, which you can pass the top level particle to and it will print out the full decay tree etc:

>>> print_decay = appMgr.toolsvc().create(
...   'PrintDecayTreeTool', interface='IPrintDecayTreeTool'
... )
...
>>> print_decay.printTree(candidates[0])

With our candidates in hand, it would be nice to be able to retrieve and compute the variables we need for an analysis. On to LoKi functors!

Fast DST browsing

While here we have discussed for pedagogical reasons all the configuration options needed in order to browse a DST file, in your daily life as a physicist it is often useful to use the bender application that belongs to the Bender project.

For example, to explore the DST we could have simply done:

lb-run Bender/latest bender -d 2016 00070793_00000001_7.AllStreams.dst

where -d 2016 is the DataType flag. This leaves us in a prompt in which we can proceed as discussed in this lesson, with the advantage that some functions are already provided for us, such as seekStripDecision (which replaces our advance) or ls and get, which allow to list and get TES locations. Other examples of useful functions are listed in the bender starting banner.

Bender also provides a useful command dst-dump, which is a quick way of figuring out what objects are present on a DST and where. Try out:

lb-run Bender/latest dst-dump -d 2016 -f -n 100 00070793_00000001_7.AllStreams.dst

The -f option tells Bender to try and “unpack” the locations such as /Event/AllStreams/pPhys/Particles that we mentioned above, while -n 100 tells it to only process the first 100 events on the DST. Give this a try if you’re ever stuck figuring out where your candidates are hiding!

Configuration debugging

You can run an interactive exploration session using whatever application configuration you like, which can come in handy if your DecayTreeTuple isn’t behaving how you expect.

As long as all your configuration comes before the GP.AppMgr() line, that configuration will be included in the exploration session.