ModelHub: Lifecycle Management for Deep Learning

Abstract

Deep learning has improved state-of-the-art results in many important fields and has been the subject of much research in recent years, leading to the development of several systems for facilitating deep learning. Current systems, however, mainly focus on the model building and training phases, while the issues of data management, model sharing, and lifecycle management are largely ignored. The deep learning modeling lifecycle generates a rich set of artifacts, such as learned parameters and training logs, and involves frequently conducted tasks, e.g., understanding model behaviors and trying out new models. Dealing with such artifacts and tasks is cumbersome and left to the users. To address these issues in a comprehensive manner, we propose ModelHub, which includes a novel model versioning system (dlv); a domain-specific language for searching through model space (DQL); and a hosted service (ModelHub) to store developed models, explore existing models, enumerate new models, and share models with others.

Introduction

[Figure: the deep learning modeling lifecycle]

The model lifecycle shown above illustrates the following challenges:

  • It is difficult to keep track of the many models developed and/or understand the differences amongst them.
  • The development lifecycle itself has time-consuming repetitive sub-steps, such as adding a layer at different places to adjust a model, searching through a set of hyper-parameters for the different variations, reusing learned weights to train models, etc.
  • The storage footprint of deep learning models tends to be very large.
  • Sharing and reusing models is not easy.

To address these challenges, ModelHub provides:

  • a model versioning system (DLV) to store and query models and their versions;
  • a domain-specific language (DQL) for model enumeration and hyper-parameter tuning;
  • a hosted deep learning model sharing system (ModelHub) to publish, discover, and reuse models from others.

System Architecture

[Figure: ModelHub system architecture]

Data Model

ModelHub operates on two levels of data models: the conceptual DNN model, and the data model for model versions in the DLV repository.

  • DNN Model: a directed acyclic graph (DAG) of layers carrying the learned weights and biases.
  • VCS Data Model: a model version consists of a network definition, its weights, extracted metadata, and the files used together with the model instance.
    • In the implementation, model versions can be viewed as a relation M(name, id, N, W, M, F), where N is the network definition, W the weights, M the extracted metadata, and F the associated files.
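
As a minimal sketch, this version relation can be written down as a record type; the class and field names below are illustrative assumptions, not ModelHub's actual implementation.

from dataclasses import dataclass, field

@dataclass
class ModelVersion:
    """One tuple of the version relation M(name, id, N, W, M, F)."""
    name: str       # human-readable model name
    id: str         # version identifier
    network: str    # N: the network definition (e.g., a prototxt file)
    weights: dict   # W: learned parameters, layer name -> tensor
    metadata: dict  # M: extracted metadata (accuracy, loss, timestamps, ...)
    files: list = field(default_factory=list)  # F: other files used with the model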

Query Facilities

Model Exploration Queries

  • Users issue exploration queries to understand a particular model, query the lineage of models, and compare several models; a minimal sketch of what such a query computes follows.
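
Below is a minimal sketch of lineage traversal and version comparison over a toy in-memory repository; the representation and helper names are illustrative assumptions, not DLV's API.

# Hypothetical in-memory repository: version id -> parent pointer and metadata.
repo = {
    "v1": {"parent": None, "metadata": {"base_lr": 0.01, "accuracy": 0.55}},
    "v2": {"parent": "v1", "metadata": {"base_lr": 0.001, "accuracy": 0.57}},
    "v3": {"parent": "v2", "metadata": {"base_lr": 0.001, "accuracy": 0.58}},
}

def lineage(version_id):
    """Walk parent pointers to recover the derivation chain of a version."""
    chain = []
    while version_id is not None:
        chain.append(version_id)
        version_id = repo[version_id]["parent"]
    return chain

def compare(a, b):
    """Return the metadata fields on which two versions differ."""
    ma, mb = repo[a]["metadata"], repo[b]["metadata"]
    return {k: (ma.get(k), mb.get(k)) for k in ma.keys() | mb.keys() if ma.get(k) != mb.get(k)}

print(lineage("v3"))        # ['v3', 'v2', 'v1']
print(compare("v1", "v2"))  # {'base_lr': (0.01, 0.001), 'accuracy': (0.55, 0.57)} (order may vary)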

Model Enumeration Queries

  • Enumeration queries explore variations of currently available models in a repository; Queries 1-4 below illustrate the four operations:
    • Select models to improve
    • Slice particular models
    • Construct new models
    • Try the new models on different hyper-parameters
Query 1: DQL select query to pick the models.

select m1
where
  m1.name like "alexnet_%" and
  m1.creation_time > "2015-11-22" and
  m1["conv[1,3,5]"].next has POOL("MAX")
Query 2: DQL slice query to get a sub-network.

slice m2 from m1
where m1.name like "alexnet-origin%"
mutate m2.input = m1["conv1"] and
  m2.output = m1["fc7"]
Query 3: DQL construct query to derive new models from existing ones.

construct m2 from m1
where
  m1.name like "alexnet-avgv1%" and
  m1["conv*($1)"].next has POOL("AVG")
mutate m1["conv*($1)"].insert = RELU("relu$1")
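
To see the rewrite Query 3 expresses, here is a minimal Python sketch over a flat layer list (a simplification of the real layer DAG); the representation and names are illustrative assumptions.

# Toy network: each layer is a (name, kind) pair.
layers = [("conv1", "CONV"), ("pool1", "POOL_AVG"),
          ("conv2", "CONV"), ("pool2", "POOL_MAX"),
          ("fc7", "FC")]

# For every conv layer whose next layer is an AVG pool ($1 binds the conv's
# index), insert a RELU named relu$1 right after that conv.
mutated = []
for i, (name, kind) in enumerate(layers):
    mutated.append((name, kind))
    nxt = layers[i + 1] if i + 1 < len(layers) else None
    if kind == "CONV" and nxt is not None and nxt[1] == "POOL_AVG":
        mutated.append(("relu" + name[len("conv"):], "RELU"))

print(mutated)
# [('conv1', 'CONV'), ('relu1', 'RELU'), ('pool1', 'POOL_AVG'),
#  ('conv2', 'CONV'), ('pool2', 'POOL_MAX'), ('fc7', 'FC')]
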
Query 4: DQL evaluate query to enumerate models with different network definitions, search hyper-parameters, and eliminate models.

evaluate m
from "query3"
with config = "path to config"
vary config.base_lr in [0.1, 0.01, 0.001] and
  config.net["conv*"].lr auto and
  config.input_data in ["path1", "path2"]
keep top(5, m["loss"], 100)
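
To make the vary/keep semantics concrete, here is a minimal Python sketch that enumerates the cross-product of the varied hyper-parameters and keeps only the top five models by loss; the stubbed training function and helper names are illustrative assumptions (the auto clause for per-layer learning rates is omitted).

import itertools
import random

# Search space mirroring Query 4's vary clause.
search_space = {
    "base_lr": [0.1, 0.01, 0.001],
    "input_data": ["path1", "path2"],
}

def train_and_eval(config):
    """Stub standing in for actually training a model; returns its loss."""
    random.seed(str(sorted(config.items())))
    return random.random()

# Enumerate the cross-product of all varied hyper-parameter choices.
keys = list(search_space)
candidates = [dict(zip(keys, values))
              for values in itertools.product(*(search_space[k] for k in keys))]

# keep top(5, m["loss"], ...): retain the five candidates with the lowest loss.
ranked = sorted(((train_and_eval(c), c) for c in candidates), key=lambda r: r[0])
for loss, config in ranked[:5]:
    print(f"loss={loss:.4f} config={config}")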