Abstract
During early optimization passes, compilers must make predictions about machine-dependent characteristics such as execution unit utilization, number of register spills, latency, and throughput in order to generate better code. Often a hand-written static/analytical hardware cost model is built into the compiler. However, the need for more sophisticated and varied predictions has become more pronounced with the advent of deep learning compilers, which need to optimize dataflow graphs. Such compilers usually employ a much higher-level MLIR form as an IR representation before lowering to traditional LLVM-IR. A static/analytical cost model in this scenario is cumbersome and error-prone, as the opcodes represent very high-level algebraic/arithmetic operations. Hence, we develop a machine-learning-based cost model for high-level MLIR that can predict different target variables of interest, such as CPU/GPU/xPU utilization, instructions executed, and register usage. By treating the incoming MLIR as text input, à la NLP models, we can apply well-known techniques from modern NLP research to predict hardware characteristics more accurately. We expect such precise ML-driven hardware cost models to guide our deep learning compiler in graph-level optimizations such as operator fusion, local memory allocation, and kernel scheduling, as well as in many kernel-level optimizations such as loop interchange, LICM, and unrolling. We report early work-in-progress results of developing such models on high-level MLIR representing dataflow graphs emitted by PyTorch/TensorFlow-like frameworks, as well as on lower-level dialects like affine. We show that these models can provide reasonably good estimates with low error bounds for various hardware characteristics of interest and can be a go-to mechanism for hardware cost modelling in the future.
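To make the "MLIR as text" framing concrete, the sketch below shows one way such a cost model could be set up: serialize an MLIR kernel as plain text, tokenize it with an off-the-shelf NLP tokenizer, and fine-tune a transformer encoder with a single regression head to predict one hardware metric (e.g. instructions executed). This is a minimal illustration under stated assumptions, not the authors' implementation; the choice of `bert-base-uncased`, the example MLIR fragment, and the dummy label are all hypothetical.

```python
# Minimal sketch: predict a hardware metric from MLIR text with an NLP-style
# regression model. Assumptions: a generic pretrained tokenizer/encoder
# ("bert-base-uncased") and an illustrative MLIR snippet and target value.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=1,                  # single regression target, e.g. instructions executed
    problem_type="regression",     # use MSE loss when labels are provided
)

# A high-level MLIR fragment serialized as plain text (illustrative only).
mlir_text = """
func.func @add(%a: tensor<128xf32>, %b: tensor<128xf32>) -> tensor<128xf32> {
  %0 = arith.addf %a, %b : tensor<128xf32>
  return %0 : tensor<128xf32>
}
"""
target = torch.tensor([[12345.0]])  # measured hardware metric for this kernel (dummy value)

inputs = tokenizer(mlir_text, return_tensors="pt", truncation=True, max_length=512)
outputs = model(**inputs, labels=target)  # forward pass computes MSE regression loss
outputs.loss.backward()                   # gradients for one fine-tuning step (optimizer omitted)
print(outputs.logits)                     # predicted metric for this kernel
```

In practice one would train on many (MLIR kernel, measured metric) pairs and could attach multiple regression heads, one per target variable (utilization, spills, latency, etc.), but the overall text-in, metric-out structure stays the same.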