Deep Neural Networks (DNNs) are increasingly deployed in highly resource-constrained environments such as autonomous drones and wearable devices, which have specific energy budgets and/or real-time requirements. While much recent work has studied empirical techniques to reduce the energy consumption and latency of DNNs, our research is the first to propose an end-to-end DNN training framework that provides **quantitative resource guarantees**.

Without loss of generality, we focus on energy constraints. Specifically, our learning algorithm directly trains a DNN model that meets a given energy budget while maximizing model accuracy, without incremental hyper-parameter tuning. Our training algorithm uses network sparsity as the knob to control network energy consumption, but in principle supports other techniques such as quantization.

The **key idea** is to formulate the DNN training process as a constrained optimization problem in which the energy budget imposes a previously unconsidered optimization constraint. Formally, the optimization problem is formulated as follows:
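
A minimal sketch of such a formulation, using notation introduced here for illustration rather than taken from the original: $W$ denotes the network weights, $\ell(W)$ the training loss, $E(W)$ the inference energy of the network, and $E_{\text{budget}}$ the given energy budget.

$$
\min_{W} \; \ell(W) \quad \text{subject to} \quad E(W) \le E_{\text{budget}}
$$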

Crucially, our technique works both for platforms whose hardware architecture details are known and for platforms whose hardware architectures are closed and have to be treated as black boxes.

**Our ICLR'19 work** addresses the former case, in which the energy consumption of a DNN inference can be analytically modeled as a function of the network sparsity. Given the energy model, we propose an optimization algorithm to approximately solve the optimization problem. A key step in the optimization is the projection onto the energy constraint. We prove that this projection can be cast as a 0/1 knapsack problem and show that it can be solved very efficiently.
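
To make the knapsack view concrete, here is a minimal sketch, not the algorithm from the paper. It assumes each weight `z[i]` has a fixed positive energy cost `cost[i]` when it is nonzero (a simplification made for this example). Keeping weight `i` then earns a "profit" of `z[i] ** 2`, the squared distance saved by not zeroing it, and the projection becomes a 0/1 knapsack over the weights. The function name, the cost values, and the greedy density heuristic are all illustrative assumptions; the greedy rule is only an approximation to the exact knapsack solution.

```python
def project_onto_energy_budget(z, cost, budget):
    """Greedy sketch of projecting z onto {w : energy(w) <= budget}.

    Keeping weight i costs cost[i] (assumed > 0) and saves z[i]**2 of
    squared distance, so we keep weights in order of profit density
    z[i]**2 / cost[i] while the budget allows, and zero the rest.
    """
    order = sorted(range(len(z)), key=lambda i: z[i] ** 2 / cost[i], reverse=True)
    w = [0.0] * len(z)
    spent = 0.0
    for i in order:
        if spent + cost[i] <= budget:
            w[i] = z[i]          # keep this weight unchanged
            spent += cost[i]
    return w, spent


# Hypothetical weights and per-weight energy costs, for illustration only.
z = [0.9, -0.1, 0.5, -0.7, 0.05]
cost = [2.0, 1.0, 1.0, 3.0, 1.0]
w, spent = project_onto_energy_budget(z, cost, budget=4.0)
print(w, spent)
```

In this toy run the weight `-0.7` is dropped even though it is large, because its cost of 3.0 does not fit in the remaining budget; an exact knapsack solver could make a different trade-off on other inputs.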

Under the same energy budget, our approach achieves noticeably higher accuracy than the previous best platform-aware learning algorithms, such as EAP.

**Our preprint on arXiv** proposes a framework called ECC that addresses the latter case. ECC has two phases: an offline energy modeling phase and an online training/compression phase. Given a particular network to compress, the offline component profiles the network on a particular target platform and constructs an energy estimation model without requiring any knowledge of the underlying hardware platform. The online component leverages the energy model to solve the constrained optimization problem, followed by an optional fine-tuning phase before generating a compressed model.
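
The offline phase can be pictured as fitting a model to black-box measurements. The sketch below is an illustration under strong assumptions, not ECC's actual energy model: it assumes energy is roughly linear in the number of nonzero weights and fits that line by ordinary least squares. The function name and the measurement values are hypothetical, and the data are synthetic, chosen only to exercise the code.

```python
def fit_linear_energy_model(nonzeros, energy):
    """Least-squares fit of energy ~ intercept + slope * nonzeros."""
    n = len(nonzeros)
    mean_x = sum(nonzeros) / n
    mean_y = sum(energy) / n
    sxx = sum((x - mean_x) ** 2 for x in nonzeros)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(nonzeros, energy))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic "profiling" data (not real measurements): each pair is the
# nonzero-weight count of a network variant and a made-up energy reading.
nonzeros = [1000, 2000, 4000, 8000]
energy = [7.0, 9.0, 13.0, 21.0]

intercept, slope = fit_linear_energy_model(nonzeros, energy)


def predict(n):
    """Estimate energy for a network with n nonzero weights."""
    return intercept + slope * n


print(intercept, slope, predict(3000))
```

Once fitted, `predict` can stand in for the hardware during training, which is the role the energy estimation model plays in the online phase.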

The optimization problem, however, has nontrivial constraints. Therefore, existing deep learning solvers do not apply directly. We propose an optimization algorithm that combines the essence of the Alternating Direction Method of Multipliers (ADMM) framework with gradient-based learning algorithms.
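
The general pattern of pairing ADMM with gradient steps can be sketched on a toy problem. This is not the algorithm from the preprint: a quadratic loss stands in for the DNN training loss, and a keep-the-top-`k` sparsity projection stands in for the energy constraint. All names and hyper-parameters (`rho`, `lr`, iteration counts) are illustrative assumptions.

```python
def project_top_k(v, k):
    """Keep the k largest-magnitude entries of v and zero the rest."""
    keep = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k]
    out = [0.0] * len(v)
    for i in keep:
        out[i] = v[i]
    return out


def admm_sparse_fit(target, k, rho=1.0, lr=0.1, outer=50, inner=20):
    """Minimize 0.5 * ||w - target||^2 subject to w having at most k nonzeros."""
    n = len(target)
    w = [0.0] * n  # primal variable, updated by gradient descent
    z = [0.0] * n  # auxiliary copy that always satisfies the constraint
    u = [0.0] * n  # scaled dual variable
    for _ in range(outer):
        # w-step: gradient descent on loss + (rho / 2) * ||w - z + u||^2
        for _ in range(inner):
            w = [
                w[i] - lr * ((w[i] - target[i]) + rho * (w[i] - z[i] + u[i]))
                for i in range(n)
            ]
        # z-step: project onto the constraint set
        z = project_top_k([w[i] + u[i] for i in range(n)], k)
        # dual step: accumulate the remaining disagreement between w and z
        u = [u[i] + w[i] - z[i] for i in range(n)]
    return z


z = admm_sparse_fit([3.0, -0.2, 1.5, 0.1], k=2)
print(z)
```

The point of the split is that the loss only ever sees plain gradient steps, while the awkward constraint is handled entirely by the projection; in a real setting the inner loop would be SGD on the network loss.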

Under the same energy budget, our approach achieves **17.5%** and **43.4%** higher accuracy than NetAdapt and AMC, respectively, the two previous best platform-agnostic learning algorithms.