BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160726Z
LOCATION:D165
DTSTART;TZID=America/Chicago:20181111T111500
DTEND;TZID=America/Chicago:20181111T114500
UID:submissions.supercomputing.org_SC18_sess147_ws_cafcw112@linklings.com
SUMMARY:The Gen3 Approach to Portability and Repeatability for Cancer Gen
 omics Projects
DESCRIPTION:Workshop\nApplications, Deep Learning, Exascale, Workshop Reg 
 Pass\n\nThe Gen3 Approach to Portability and Repeatability for Cancer Geno
 mics Projects\n\nFlamig, Tang, Grossman\n\nThe Gen3 software stack is an o
 pen-source platform for managing, analyzing, and sharing petabyte-scale re
 search data. In this note, we describe the approach that we have used with
  Gen3 to support portability and repeatability for cancer genomics project
 s. Data in a Gen3 data commons is divided into projects. Project data is o
 f two types: large files, such as BAM files and image files, that are mana
 ged as data objects and stored in one or more private and public clouds, a
 nd all of the other data associated with a project, including all of the c
 linical phenotype data and biospecimen data. We call this other data “core
  data” and have developed a data serialization format for it, which includ
 es versioning and schema information. Data objects are available across 
 multiple data commons, while core data can be exported and imported using 
 the serialization format. In this way, we support portability for data pro
 jects. We support repeatability by representing workflows using the Common
  Workflow Language (CWL) and managing the CWL files as data objects. With 
 this approach, we simply need to manage and version the data objects, core
  data, and CWL files associated with a project.
URL:https://sc18.supercomputing.org/presentation/?id=ws_cafcw112&sess=sess
 147
END:VEVENT
END:VCALENDAR

