A Multi-Science Data Analysis Platform and the GeneROOT Use Case

Europe/Zurich
31/3-004 - IT Amphitheatre (CERN)

Ignacio Reguero
Description

Openlab life sciences computing project

Webcast
There is a live webcast for this event
    • 10:00 - 10:50
      A Multi-Science Data Analysis Platform and the GeneROOT Use Case 50m

      This talk will cover two areas of current research in the context of knowledge sharing between CERN openlab and the life science communities. The first area is the development and prototyping of a multi-science data analysis platform built around CERN-developed technologies such as Zenodo, REANA and CVMFS. When finished, this platform will support the complete data analysis life cycle: data discovery, data access, data processing and end-user data analysis. The second area is a specific use case in which HEP-specific software such as ROOT is used to store and process genomics sequence data. A number of handcrafted genomics data formats are in use, such as FASTQ, SAM, BAM and CRAM, ranging from pure ASCII to compressed binary. We will compare the features of these formats with the generic capabilities of ROOT’s TTree containers and show performance numbers for typical analysis scenarios.

      Speakers: Fons Rademakers (CERN), Taghi Aliyev (Universiteit Maastricht (NL))
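
      As a rough illustration of the contrast the abstract draws between ASCII formats and columnar containers, the sketch below reads FASTQ text into parallel columns, the kind of layout a column-oriented container such as a TTree stores as branches. It is not the speakers' implementation and does not use ROOT. It assumes simple four-line FASTQ records and Phred+33 quality encoding. The function name and the column names (read_id, sequence, quality) are hypothetical.

```python
import io


def read_fastq_columns(handle):
    """Parse four-line FASTQ records into parallel columns (plain lists)."""
    columns = {"read_id": [], "sequence": [], "quality": []}
    while True:
        header = handle.readline()
        if not header:
            break  # end of input
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        columns["read_id"].append(header[1:])
        columns["sequence"].append(sequence)
        # Assumption: Phred+33 encoding, so score = ASCII code - 33.
        columns["quality"].append([ord(c) - 33 for c in quality])
    return columns


# Two made-up reads, used only to exercise the parser.
SAMPLE = "@read1\nGATTACA\n+\nIIIIIII\n@read2\nACGT\n+\n!!!!\n"

columns = read_fastq_columns(io.StringIO(SAMPLE))

# A per-read analysis touches only the quality column, which is the
# access pattern a columnar layout is meant to make cheap.
mean_quality = [sum(q) / len(q) for q in columns["quality"]]
print(columns["read_id"], mean_quality)
```

      With the data held as columns, an analysis that needs only the quality scores never has to touch the sequence strings. A row-oriented ASCII file has to be scanned record by record.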