Presentation by Sina Sheikholeslami on "Parallel Ablation Studies for Deep Learning"

Mahdi Esmailoghli
PhD Student at Technische Universität Berlin

Explanation of Air Pollution Using External Data Sources: A Case Study

Event organizers: Scientific Association of the Polytechnic Computer Engineering Faculty

Data scientists spend 80% of their time on pre-processing tasks, such as feature extraction, and just 20% on training the machine learning (ML) model.

While there is already a body of research on how to extract features from a given dataset, a rather neglected problem is the availability of useful features in the first place. Data scientists often have to enrich their datasets with features from other sources to obtain reasonable prediction models.

In this presentation, we will talk about a very important use case of the dataset enrichment problem: "Explanation of Air Pollution."

High concentrations of fine-grained particles in the air can adversely affect human health. To control them, the European Union has adopted several strategies, such as thresholds on the particle concentration allowed in populated areas and limitations on vehicle access. However, many cities in Germany are unable to comply with this legislation and control particle emissions, because it is hard to attribute the pollution to a clear source. It is therefore important to understand the dynamic process of fine-grained particle distribution and the reasons emissions occur. In this talk, we aim to present the design of a system that provides the human analyst with explanations about polluted areas within a city and their potential causes.

Sina Sheikholeslami
PhD Student at KTH Royal Institute of Technology

Parallel Ablation Studies for Deep Learning

Ablation studies have become best practice in machine learning research, as they provide insights into the relative contribution of the different architectural and regularization components to the performance of models. However, as deep learning architectures become ever deeper and data sizes keep growing, the number of architecture combinations that need to be evaluated to understand their relative performance explodes. In this talk, we introduce a new PySpark-based framework for the design and parallel execution of ablation studies on any Python-based machine learning framework. We introduce a declarative way of defining an ablation study and then generate the model architecture construction and data ingestion logic from the study specification. An ablation study, consisting of many trials, can then be executed in parallel on multiple workers and GPUs. Our ablation study framework builds on Maggy, an open-source framework for asynchronous parallel execution of trials on PySpark.
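To make the idea of a declarative study specification and parallel trials more concrete, here is a minimal sketch in plain Python. It is not Maggy's API: the `AblationStudy` class, the `run_trial` function, and the toy scoring rule are all hypothetical, and a local thread pool stands in for PySpark workers. It assumes a simple leave-one-component-out scheme, where each trial removes one layer or one input feature from the baseline.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class AblationStudy:
    """Hypothetical declarative spec: the baseline plus components to leave out."""
    base_layers: list
    base_features: list
    ablate_layers: list = field(default_factory=list)
    ablate_features: list = field(default_factory=list)

    def trials(self):
        """Yield the baseline trial, then one trial per ablated component."""
        yield {"name": "baseline",
               "layers": list(self.base_layers),
               "features": list(self.base_features)}
        for layer in self.ablate_layers:
            yield {"name": f"no_layer:{layer}",
                   "layers": [l for l in self.base_layers if l != layer],
                   "features": list(self.base_features)}
        for feature in self.ablate_features:
            yield {"name": f"no_feature:{feature}",
                   "layers": list(self.base_layers),
                   "features": [f for f in self.base_features if f != feature]}


def run_trial(trial):
    """Placeholder for model training: returns a deterministic toy score.

    A real trial would build the model from trial["layers"], read only
    trial["features"] from the dataset, train, and report a metric.
    """
    score = len(trial["layers"]) + 0.5 * len(trial["features"])
    return trial["name"], score


def run_study(study, workers=4):
    """Run all trials of the study concurrently and collect name -> score."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_trial, study.trials()))
```

Comparing each trial's score with the baseline score then indicates how much the removed component contributed. In a distributed setting the thread pool would be replaced by executors on a cluster, which is the role the talk assigns to Maggy on PySpark.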

Event date

14 Dey 1398, 13:30

Event location

Amphitheater of the Computer Engineering Faculty

Event price

Free
