e-Research Summer Hackfest - second edition

chaired by Roberto Barbera (University of Catania)
from 18 July 2016 to 29 July 2016 (Europe/Rome)
at Catania (Aula E, 1st floor)
Department of Physics and Astronomy - Via S. Sofia, 64 - 95123 Catania
Description

 

“Bring your science to the web and the web to your science”

 

Overview and objectives

The second edition of the e-Research Summer Hackfest will be held at the Department of Physics and Astronomy of the University of Catania on 18-29 July 2016.

The event is co-sponsored by the Sci-GaIA, INDIGO-DataCloud, and COST ENeL projects.

As in the first edition, the main objective of the event is to integrate scientific use cases through a pervasive adoption of web technologies and standards, and to make them available to their end users through Science Gateways (entities connected to the distributed computing, data and services of interest to the Community of Practice the end users belong to). Promoting and fostering open and reproducible research is the ultimate goal of the hackfest.

Topics

The following topics will be tackled during the e-Research Summer Hackfest:

  • Big Data analytics;

  • Distributed computing services;

  • Distributed storage services;

  • Programmable access to Open Data repositories;

  • Semantic federation of Open Access repositories;

  • User interfaces (web, desktop, mobile, etc.);

  • Workflows.

Tools and technologies

The following tools and technologies will be showcased at the e-Research Summer Hackfest and used to implement the proposed use cases:

  • The FutureGateway framework;

  • The INDIGO PaaS;

  • The gLibrary framework;

  • Open Access Repositories (based on Invenio);

  • The Onedata platform;

  • The Ophidia platform;

  • The Kepler workflow manager.

Contact

For all questions you may have regarding the Sci-GaIA e-Research Summer Hackfest, please contact us by email at summer-school@sci-gaia.eu.

Support Email: summer-school@sci-gaia.eu
  • Monday, 18 July 2016
    • 08:30 - 09:00 Registration and badging of participants 30'
    • 09:00 - 09:30 The Sci-GaIA project and introduction to the hackfest 30'
      Speaker: Roberto Barbera (University of Catania)
      Material: Slides link Video lecture link
    • 09:30 - 10:00 The INDIGO-DataCloud project 30'
      Speaker: Giacinto Donvito (INFN)
      Material: Slides link Video lecture link
    • 10:00 - 11:30 Day 1 - The FutureGateway framework
      This session first introduces the FutureGateway (FG) framework, describes each of its components, and explains how they work together. The presentation also covers security considerations related to Science Gateway membership handling, using the baseline AAI mechanism provided by a standard FG installation as the reference, and shows how to switch to other existing AAI mechanisms. The FutureGateway also provides a complete set of REST APIs to manage distributed computing resources, which will be briefly described.
      Conveners: Riccardo Bruno (INFN), Marco Fargetta (INFN)
      Material: Slides link Video lecture - part 1 link Video lecture - part 2 link Video lecture - part 3 link Video lecture - part 4 link
    • 11:30 - 12:00 Coffee break
    • 12:00 - 13:00 The FutureGateway framework - continued
      This session shows how to set up a FutureGateway instance using the baseline installation scripts, and how to manage, maintain and update the system. The usage of several REST APIs will also be presented through real examples.
      Conveners: Riccardo Bruno (INFN), Marco Fargetta (INFN)
      Material: Slides link Video lecture - part 5 link Video lecture - part 6 link Video lecture - part 7 link
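      The REST APIs mentioned above can be exercised from any HTTP client. As a minimal sketch of what a task submission might look like (the endpoint path and payload field names here are assumptions for illustration, not taken from the FutureGateway documentation):

```python
import json

# Placeholder server; the "/v1.0/tasks" path and the payload field
# names below are assumptions, not taken from the FG documentation.
FG_BASE = "http://fgserver.example.org/v1.0"

def build_task_request(application_id, arguments, description):
    """Build the URL and JSON body for submitting a task to a
    FutureGateway-style REST API."""
    url = FG_BASE + "/tasks"
    body = {
        "application": application_id,
        "arguments": arguments,
        "description": description,
    }
    return url, json.dumps(body)

url, body = build_task_request("1", ["input.txt"], "demo run")
# A real submission would POST this body with an Authorization header
# (e.g. via urllib.request) and then poll the returned task for status.
```

      The actual paths, authentication scheme and payload schema are the subject of the session itself.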
    • 13:00 - 14:00 Lunch break
    • 14:00 - 15:30 The INDIGO PaaS
      In this session the PaaS layer architecture of the INDIGO-DataCloud project will be described, focusing on the technologies and solutions adopted for each component. 
      The interaction among the various components will be shown through the description of the main scenarios: the deployment of “IaaS automated services” and the deployment of a “PaaS service”, i.e. a long-running service (such as a DBMS) or a user application to be run with specific input/output data. 
      Convener: Marica Antonacci (INFN)
      Material: Slides link Video lecture - part 1 link Video lecture - part 2 link Video lecture - part 3 link
    • 15:30 - 16:00 Coffee break
    • 16:00 - 17:00 The INDIGO PaaS - continued
      In this section some practical examples of the INDIGO PaaS usage will be shown.
      The starting point will be the TOSCA template that describes the topology of the services to be deployed.
      Then how to submit the template to the INDIGO Orchestrator will be shown along with the monitoring of the deployment status.
      Different types of TOSCA templates will be demonstrated, e.g. a Galaxy template, a Mesos cluster, the execution of Chronos jobs, the deployment of a Marathon application, etc. 
      Convener: Marica Antonacci (INFN)
      Material: Screencasts link Slides link Video lecture - part 5 link Video tutorial - part 1 link Video tutorial - part 2 link Video tutorial - part 3 link Video tutorial - part 4 link
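      The workflow described above (write a TOSCA template, submit it to the Orchestrator, monitor the deployment) can be sketched as follows. The template is minimal and the endpoint path and payload fields are assumptions for illustration; the Orchestrator's REST API documentation is the authoritative reference.

```python
import json

# A deliberately tiny TOSCA template (one bare compute node).
TOSCA_TEMPLATE = """\
tosca_definitions_version: tosca_simple_yaml_1_0
description: one-node deployment sketch
topology_template:
  node_templates:
    my_server:
      type: tosca.nodes.Compute
"""

def build_deployment_request(orchestrator_url, template, parameters=None):
    """Build the URL and JSON body for an Orchestrator-style deployment
    request. The "/deployments" path and field names are assumptions."""
    url = orchestrator_url.rstrip("/") + "/deployments"
    body = {"template": template, "parameters": parameters or {}}
    return url, json.dumps(body)

dep_url, dep_body = build_deployment_request(
    "https://orchestrator.example.org", TOSCA_TEMPLATE)
# Monitoring would then poll GET <dep_url>/<deployment-id> for status.
```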
    • 17:00 - 18:30 The gLibrary framework 1h30'
      In this presentation we introduce gLibrary 2.0, a platform for creating REST APIs over existing databases or new datasets. It supports both relational and non-relational (i.e. schema-less) datasets, and it also provides data storage services on Grid and Cloud (OpenStack-based) storage servers. After a general overview of the architecture, we will show live how to create a new repository, import data collections from an existing database, create new collections from scratch, run queries, and use replicas/attachments to handle file transfers.
      Speaker: Antonio Calanducci (INFN)
      Material: Slides link Tutorial link Video lecture link Video live demo link
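      A query against a gLibrary-style repository is ultimately just a REST call with a filter. The sketch below assumes a LoopBack-like JSON filter syntax and a hypothetical collection name; the session's tutorial covers the real API.

```python
import json
from urllib.parse import urlencode

def build_query_url(base, collection, where):
    """Build a REST query URL with a JSON "where" filter.
    The filter syntax and collection name are assumptions."""
    filter_obj = {"where": where}
    return base + "/" + collection + "?" + urlencode(
        {"filter": json.dumps(filter_obj)})

query_url = build_query_url("https://glibrary.example.org/v2",
                            "images", {"modality": "MRI"})
```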
  • Tuesday, 19 July 2016
    • 08:30 - 09:00 Registration and badging of participants 30'
    • 09:00 - 10:00 Programmatic interaction with Open Access Repositories
      In this session we will introduce the concept of a Digital Asset Management System and discuss programmatic interaction with Open Access Repositories (based on Invenio). Then, we will show how to submit different types of resources manually through the repository. After that, we will move to programmatic interaction with an Open Access Repository through APIs for data searching, downloading and uploading, with a brief look at the MARCXML tags. Finally, we will see how to interact with the Open Access Repository using the standard OAI-PMH protocol and how to attribute authorship to research products stored in an Open Access Repository.
      Conveners: Roberto Barbera (University of Catania), Carla Carrubba (University of Catania)
      Material: Slides link Video lecture - part 1 link Video lecture - part 2 link Video lecture - part 3 link XML exemplar file to upload contents on the OAR link
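      OAI-PMH itself is a simple protocol: plain HTTP GET requests with a "verb" parameter, returning XML. A minimal sketch (the repository base URL is a placeholder, and the sample response is shortened):

```python
import urllib.parse
import xml.etree.ElementTree as ET

def oai_request_url(base_url, verb="ListRecords", metadata_prefix="oai_dc"):
    """Build an OAI-PMH request URL (verb and metadataPrefix are
    standard OAI-PMH parameters)."""
    params = {"verb": verb, "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)

oai_url = oai_request_url("http://repository.example.org/oai2d")

# Parsing a (shortened) sample ListRecords response for Dublin Core titles:
SAMPLE = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An open research product</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

titles = [t.text for t in ET.fromstring(SAMPLE).iter(
    "{http://purl.org/dc/elements/1.1/}title")]
```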
    • 10:00 - 11:00 The Onedata platform
      In this session we will give an overview of Onedata concepts such as spaces, user groups and providers. We will then discuss the Onedata system's internal architecture, with a focus on scalability, fault tolerance and remote data access. Onedata's implementation of the CDMI protocol will be briefly discussed, along with its features for metadata management. 
      Conveners: Krzysztof Trzepla (CYFRONET), Konrad Zemek (CYFRONET)
      Material: Slides link Video lecture - part 1 link Video lecture - part 2 link Video lecture - part 3 link Video lecture - part 4 link
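      As a rough sketch of what CDMI access looks like, a data object is read with an HTTP GET carrying CDMI headers and comes back as a JSON document. Header names follow the SNIA CDMI specification; the sample response below is shortened and hypothetical.

```python
import json

def cdmi_get_headers(version="1.1.1"):
    """Headers for a CDMI data-object GET, per the SNIA CDMI spec."""
    return {
        "X-CDMI-Specification-Version": version,
        "Accept": "application/cdmi-object",
    }

# A (shortened, hypothetical) sample CDMI object response:
SAMPLE_RESPONSE = json.dumps({
    "objectType": "application/cdmi-object",
    "objectName": "results.txt",
    "metadata": {"experiment": "run-42"},
    "value": "hello",
})

cdmi_obj = json.loads(SAMPLE_RESPONSE)
```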
    • 11:00 - 11:30 Coffee break
    • 11:30 - 13:00 The Onedata platform - continued
      To conclude the session, we will present some of our plans for the near future, focusing on Open Data integration in the system. During the presentation we will hold a live demo of Onedata, followed by a hands-on session for the audience.
      Conveners: Krzysztof Trzepla (CYFRONET), Konrad Zemek (CYFRONET)
    • 13:00 - 14:00 Lunch break
    • 14:00 - 16:00 The Ophidia platform
      The Ophidia project is a research effort on big data analytics addressing scientific data analysis challenges in the climate change domain. Ophidia provides declarative, server-side and parallel data analysis, together with an internal storage model able to deal efficiently with multidimensional data and a hierarchical data organization to manage large data volumes (“datacubes”). The project builds on a strong background in high-performance database management and OLAP systems for large scientific datasets.
      The Ophidia analytics platform provides several data operators to manipulate datacubes and array-based primitives to perform data analysis on large scientific data arrays; metadata management support is also provided. The server front-end exposes several interfaces to address interoperability requirements: WS-I+, GSI/VOMS and OGC-WPS (through PyWPS). From a programmatic point of view, a Python module (PyOphidia) makes the integration of Ophidia into Python-based environments and applications (e.g. IPython) straightforward. The system also offers a bash-like CLI with a complete set of commands.
      A key point of the talk will be the workflow capabilities offered by Ophidia. The framework stack includes an internal workflow management system that coordinates, orchestrates and optimises the execution of multiple scientific data analytics and visualization tasks. Specific macros are available to implement loops, or to parallelize them in case of data independence, and real-time monitoring of workflow execution is supported through a graphical user interface. Some real workflows implemented at CMCC and related to different EU projects will also be presented.
      Convener: Alessandro D'Anca (CMCC)
      Material: Slides link Video lecture - part 1 link Video lecture - part 2 link Video lecture - part 3 link
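      Ophidia operators are invoked through submission strings of the form "operator key=value;key=value;...". The sketch below composes such a string; the operator and argument names are assumptions for illustration (the Ophidia operator reference documents the real ones), and the PyOphidia snippet in the comment shows where such a string would be submitted.

```python
def ophidia_command(operator, **args):
    """Compose an Ophidia-style submission string:
    'operator key=value;key=value;...'. Operator and argument
    names here are illustrative assumptions."""
    return operator + " " + "".join(
        "{}={};".format(k, v) for k, v in args.items())

cmd = ophidia_command("oph_subset",
                      cube="http://server/1/1",
                      subset_dims="time",
                      subset_filter="1:12")

# With PyOphidia (not imported here), submission would look roughly like:
#   from PyOphidia import client
#   c = client.Client(username, password, host, port)
#   c.submit(cmd)
```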
    • 16:00 - 16:30 Coffee break
    • 16:30 - 18:00 The Kepler workflow manager
      This session examines Kepler as a tool for building scientific workflows. The emphasis is on developing the basic skills attendees need to become familiar with the process of building workflows. We will cover simple tasks and show how to express typical programming constructs in a workflow-based environment: simple workflows, composite actors, ways of switching data flow, building loops, and calling Python code directly from a workflow. This session will give students a general "feel" of the Kepler workflow management system (https://kepler-project.org).
      
      Organization: this is a tutorial-based, hands-on session. Students are required to take an active part during the training. All materials will be available, and each task will be explained at the required level of detail. Students who face issues while following the tasks are encouraged to raise their doubts.
      
      Session objectives:
      1. describe the basics of the Kepler workflow management system;
      2. introduce students to workflow-based computations;
      3. introduce students to the process of building workflows using Kepler;
      4. introduce students to more complex topics: loops, Python execution.
      Convener: Michal Owsiak (PSNC)
      Material: Slides link Tutorial link Video lecture link Video live demo link
  • Wednesday, 20 July 2016
    • 10:00 - 12:00 Day 3 - Presentation of use cases and their implementation strategies
      • 10:00 WIMEA–ICT: Science Gateway for Weather Information Management in East Africa to interact with ICT Tool WRF 30'
        Accessing and interacting with applications and tools running on remote High Performance Computing (HPC) facilities is a challenge for most researchers, scientists and students, particularly in East Africa, when no graphical user interface (GUI) is available. Many users are not familiar with Linux command-line environments and instead prefer a GUI on a Windows machine, on which most scientific applications cannot be deployed as easily as on Linux. Weather Research and Forecasting (WRF) is the tool selected by the project to improve Weather Information Management in East Africa through ICT tools. This use case targets the implementation of a Science Gateway (web portal) to ease remote interaction with the WRF tool on HPC resources. The development will build on integrated open-source tools: the FutureGateway (FG) framework and its Application Programming Interfaces (APIs). 
        Speaker: Damas Makweba (Dar es Salaam Institute of Technology - Tanzania)
        Material: Slides powerpoint file
      • 10:30 Public Health Gateway in Kenya 30'
        A gateway is urgently needed for public health experts to publish content addressing the prevalence of fatalities arising from motorcycle accidents in Kenya. These fatalities are numerous and, frankly, avoidable; the persistence of this problem is worrying, a drain on the economy, and a real burden on families. 
        There are other severe concerns, such as data analysis on immunisation and the effects of children not being immunised. All these societal concerns involve a wealth of information that could be disseminated for public consumption, with the potential to change things for the better, yet this is far from achieved in Kenya. The gateway will use a virtual storage model in which the hypervisor provides emulated hardware for each virtual machine, including compute, memory and storage.
        Speakers: Dennis Muoki Kimego (Egerton University - Kenya), Charles Muiruri Njaramba (Egerton University - Kenya)
        Material: Slides powerpoint file
      • 11:00 Brainstorming on the implementation of the use cases 1h0'
    • 12:00 - 19:00 Day 3 - Code development for use cases implementation
  • Thursday, 21 July 2016
    • 09:00 - 19:00 Day 4 - Code development for use cases implementation
  • Friday, 22 July 2016
    • 09:00 - 19:00 Day 5 - Code development for use cases implementation
  • Saturday, 23 July 2016
    • 09:00 - 19:00 Day 6 - Code development for use cases implementation
  • Sunday, 24 July 2016
    • 09:00 - 19:00 Day 7 - Free day
      This day is free. Excursions will be organised.
  • Monday, 25 July 2016
    • 09:00 - 19:00 Day 8 - Code development for use cases implementation
  • Tuesday, 26 July 2016
    • 09:00 - 19:00 Day 9 - Code development for use cases implementation
  • Wednesday, 27 July 2016
    • 09:00 - 19:00 Day 10 - Code development for use cases implementation
  • Thursday, 28 July 2016
    • 09:00 - 19:00 Day 11 - Code development for use cases implementation
  • Friday, 29 July 2016
    • 09:00 - 15:00 Day 12 - Code development for use cases implementation
    • 15:00 - 18:00 Day 12 - Use cases final presentations and wrap-up
      • 15:00 WIMEA–ICT: Science Gateway for Weather Information Management in East Africa to interact with ICT Tool WRF - Final report 30'
        Speaker: Damas Makweba (TERNET - Tanzania)
        Material: Slides pdf file
      • 15:30 Public Health Gateway in Kenya - Final report 30'
        Speakers: Dennis Muoki Kimego (Egerton University - Kenya), Charles Muiruri Njaramba (Egerton University - Kenya)
        Material: Slides powerpoint file
      • 16:00 Wrap-up and closure 30'