International Workshop on Advanced Data Systems Management, Engineering, and Analytics (MegaData)

Tartu, Estonia, August 24, 2021

The International Workshop on Advanced Data Systems Management, Engineering, and Analytics (MegaData) is held in conjunction with the 25th European Conference on Advances in Databases and Information Systems (ADBIS’2021). MegaData gives academic researchers and industry practitioners an opportunity to share their research experiences, original research achievements, and practical development products on Big Data management and engineering.

MegaData Overview

With the exponential growth of Big Data (BD) from different sources, managing, engineering, and designing data systems, and gaining meaningful insights from them, is a significant challenge. The current generation of data engineers and architects works tirelessly to satisfy the accelerating demand for data-driven innovations. Questions such as “How can data thrive as the foundation for advanced databases and information systems?” and “What will the next generation of data systems look like?” will lead the discussion on the latest trends in modern data systems. The MegaData workshop aims to report on the advances and trends in BD deployment models and environments at both the infrastructure and application levels. Papers presenting recent results, research issues, practical applications, case studies, and industrial implementations are welcome. Moreover, the submission of ongoing research, position, visionary, and student papers is encouraged to fuel the discussion.

MegaData Aims

Data is growing explosively, and several systems have emerged to store, process, and analyze such large-scale amounts of data. These “Big Data systems” are evolving fast to meet the demands of practitioners in both industry and academia. Examples include NoSQL systems, the Hadoop stack, Apache Spark, data analytics platforms, search and indexing platforms, and deployment infrastructures. These systems address needs for structured and unstructured data across a wide spectrum of domains and applications, ranging from NoSQL and batch processing to micro-batch processing and stream data processing frameworks.

The MegaData workshop’s objective is to bring together researchers, practitioners, system administrators, system programmers, and others interested in sharing and presenting their perspectives on the effective management of big data systems. The focus of the workshop is on novel, practical, systems-oriented work. MegaData offers an opportunity to showcase the latest advances in this area and to discuss and identify future directions and challenges in the management and engineering of big data systems.

Topics

Papers are solicited on all aspects of big data management. Specific topics of interest include, but are not limited to, the following:

Resource management and scheduling mechanisms
System tuning/auto-tuning and configuration management
Auto-scaling and elastic scaling challenges and opportunities
Unified management of ‘data in motion’ and ‘data at rest’
Dealing with both structured and unstructured data
Holistic management across hardware and software
Emerging hardware/software technologies such as shared memory, hyperthreading, and
Domain-specific challenges in the cloud, sensor networks, streaming analytics, cyber-physical systems
Emerging deployment models in IoT, IoT-to-Cloud, Edge/fog deployment, HPC
Scalable architectures for data storage, archival, and virtualization
Performance benchmarking and workload studies
Advances in data storage models, including object stores and key-value stores
Techniques for data integrity, availability, reliability, and fault tolerance
Productivity tools for data-intensive computing, data mining, and knowledge discovery
Application of emerging big data frameworks towards scientific computing and analysis
Enabling cloud and container-based models for scientific data analysis
Tools and techniques for managing data movement among computing and data-intensive components

Program Committee Members (to be completed)

Pablo Rodríguez-Mier, INRAE, France
Victor M. Muñoz, Universitat Oberta de Catalunya, Spain
Manisha Sirsat, INESC, Portugal
Arturo Gonzalez-Escribano, Universidad de Valladolid, Spain
James Benson, University of Texas at San Antonio, USA
Rosa Filgueira, EPCC, The University of Edinburgh, UK
Imed Romdhani, Edinburgh Napier University, UK
Sattam Almatarneh, Middle East University, Jordan
Said Alawadi, Uppsala University, Sweden
Syed Attique Shah, University of Tartu, Estonia
Pablo Caderno, University of Santiago de Compostela, Spain
Ahmad Aburomman, Universidade da Coruña, Spain
Maanak Gupta, Assistant Professor, Tennessee Technological University, USA
Mehdi Gheisari, Guangzhou University, China
Mohamed Ragab, University of Tartu, Estonia
Houshyar Honar Pajooh, Massey University, New Zealand
Xoan C. Pardo, Universidade da Coruña, Spain
Jose R.R. Viqueira, Universidade de Santiago de Compostela, Spain

Publication

Workshop papers will be published in the Springer Communications in Computer and Information Science (CCIS) series. The best workshop papers will be invited to a special issue of the journal Computer Science and Information Systems (ComSIS).

Important dates

Paper submission deadline: April 20, 2021 (extended from April 9, 2021)
Notification of acceptance: May 14, 2021
Camera-ready due: June 11, 2021
Workshop day: August 24, 2021

Paper Submission

Authors are invited to submit short papers (up to 10 pages) and regular papers (up to 14 pages). Papers must be submitted via EasyChair.

Select “make a new submission”, then select “Advances in Data Systems Management, Engineering, and Analytics” from the track list.

Templates, sample files, and useful links can be found in the LaTeX and Word files for frontmatter (zip) section.

Authors should consult Springer’s authors’ guidelines and use their proceedings templates, either for LaTeX or for Word, to prepare their papers. Springer encourages authors to include their ORCIDs in their papers. The corresponding author of each paper, acting on behalf of all of the authors of that paper, must complete and sign a Consent-to-Publish form.

The corresponding author signing the copyright form should match the corresponding author marked on the paper. Once the files have been sent to Springer, changes relating to the papers’ authorship cannot be made.

Invited Talk on Machine Learning Pipelines

Dr Srijith Rajamohan, Senior Data Science Developer Advocate at Databricks

Databricks is an enterprise software company founded by the original creators of Apache Spark. The company has also created Delta Lake, MLflow, and Koalas, popular open-source projects that span data engineering, data science, and machine learning. The talk will focus mainly on new advancements in ML pipelines at Databricks, including challenges and opportunities.

Workshop Co-chairs:

Yaser Jararweh, Associate Professor of Computer Science, Duquesne University, USA
Tomás F. Pena, Associate Professor, University of Santiago de Compostela, CiTIUS research center, Spain
Feras M. Awaysheh, Assistant Professor of Big Data Systems, Delta research center, University of Tartu, Estonia