BASIC INFORMATION
Short name: DDLP
Long name: Distributed Deep Learning Platform
Company: MaLe Labs sp. z o.o.
Country: Poland
Call: F4Fp-01-S (see call details)
Proposal number: F4Fp-01-08-S
SUMMARY REMARKS & TESTBEDS
Recent advances in machine learning have led to a significant increase in the number of practical applications of such methods across business areas such as financial analysis, text and speech recognition, object classification and detection, automated image colouring, medical image analysis and drug discovery. In many practical applications, especially those using deep learning techniques, the data scientists responsible for developing machine learning algorithms have to process large data sets in order to train their models effectively. As a result, the learning process can take days, weeks or even months, often longer than the development of the algorithm itself. Moreover, learning requires multiple training iterations, because machine learning algorithms have to be tested and adjusted in order to reach the required level of accuracy while preventing overfitting.
The main aim of this project is to evaluate and validate our newly developed Docker-based tools for the automation and on-demand deployment of distributed deep learning algorithms on virtualized infrastructures. This solution is a result of our recent R&D work and aims to shorten deep learning experiment deployment time and to make the deployment and execution of distributed deep learning algorithms easier for data scientists, who are mostly not experts in systems administration or distributed computing. During this project we will use distributed versions of our deep learning algorithms supporting drug discovery, and we will measure their performance and the reduction in training time achieved through distributed computing.
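One common way to report the training-time reduction mentioned above is scaling efficiency: the measured speedup over single-node training, divided by the number of workers, where 1.0 corresponds to perfect linear scaling. A minimal sketch of this metric (the function name and interface are illustrative, not part of the platform described in this proposal):

```python
def scaling_efficiency(t_single: float, t_distributed: float, n_workers: int) -> float:
    """Speedup over single-node training, normalized by worker count.

    1.0 means perfect linear scaling; values below 1.0 indicate
    communication or synchronization overhead.
    """
    speedup = t_single / t_distributed
    return speedup / n_workers

# e.g. 96 h of training on one node vs 30 h on 4 workers:
# speedup = 3.2, so efficiency = 0.8
print(scaling_efficiency(96.0, 30.0, 4))
```

Reporting efficiency rather than raw speedup makes results for different worker counts directly comparable.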
Also, during the experiment we plan to perform in-depth measurements of computing and communication network resource usage, in order to determine how large-scale distributed deep learning algorithms should be designed and implemented to minimize computing and network bottlenecks.
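As a first-order complement to such measurements, the per-iteration network load of synchronous data-parallel training can be estimated from the gradient volume exchanged during synchronization: with a ring all-reduce, each worker transmits roughly 2(N-1)/N times the gradient size per iteration. A sketch under that assumption (the function name is illustrative; the actual synchronization scheme used by the platform is not specified here):

```python
def allreduce_bytes_per_worker(n_params: int, n_workers: int,
                               bytes_per_param: int = 4) -> float:
    """Approximate bytes each worker transmits per training iteration
    when gradients are synchronized with a ring all-reduce.

    Each worker sends 2 * (N - 1) / N times the gradient size,
    where N is the number of workers.
    """
    grad_bytes = n_params * bytes_per_param
    return 2 * (n_workers - 1) / n_workers * grad_bytes

# e.g. a 25M-parameter model (float32 gradients) on 8 workers:
print(allreduce_bytes_per_worker(25_000_000, 8) / 1e6, "MB per iteration")
# → 175.0 MB per iteration
```

Comparing such an estimate against measured link throughput gives a quick indication of whether a given model and worker count will be communication-bound.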