Model Based Reinforcement Learning

Let me start with a quick introduction of myself. I am a combination of an Electrical/Mechanical/Automotive/Control Engineer and an applied mathematician. I know a thing or two about numerical algorithms and convex optimization. I started working with MATLAB in 1994. I was an Assistant Professor for three years. I used LEGO Mindstorms to teach Mechatronics to undergraduate and graduate students. I worked as a Senior Research Engineer at Maplesoft. I therefore became quite familiar with symbolic methods, automatic differentiation, acausal modeling, algorithm optimization and automatic code generation. During my time at Maplesoft, I helped Japanese car manufacturers, including Toyota, use model based methods like model predictive control (MPC) to automatically generate C code for automotive embedded control systems.

The workflow that I usually follow for real-world problems is as follows:

  • use first principles to make a mathematical model for the target system (white box modeling)
  • use generic functions like polynomials, radial basis functions or neural networks to create models for the unknown components of the system (black box modeling)
  • use measured data to estimate the parameters of the grey box model (combination of white and black box models)
  • define an optimization problem (a cost/loss function and constraints) to describe the goals and the way they should be reached
  • develop an algorithm to solve the optimization problem

This workflow is very general. It can describe MPC, or model based reinforcement learning as it is called in the AI/ML literature. It can also describe a wide range of AI/ML methods, such as regression or classification using traditional ML methods or deep learning.
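
As a minimal sketch of the workflow above, consider a toy one-dimensional vehicle (an assumption for illustration, not taken from any real project). Newton's second law and the mass are treated as known (white box), the drag force is modeled with a polynomial (black box), its coefficients are estimated from noise-free synthetic data by least squares, and a constant input is then chosen by a simple grid search to reach a target speed. All names and numbers (DT, MASS, true_drag, V_REF) are hypothetical.

```python
import numpy as np

DT, MASS = 0.1, 1.0  # assumed sample time [s] and known mass [kg]


def step(v, u, drag):
    """White box part: Newton's second law, discretized with forward Euler."""
    return v + DT * (u - drag(v)) / MASS


def true_drag(v):
    """Assumed 'real' drag law, used only to generate synthetic data."""
    return 0.5 * v + 0.1 * v**2


# Measured data: random force inputs applied to the "real" system.
rng = np.random.default_rng(0)
u_data = rng.uniform(0.0, 3.0, size=200)
v_data = np.zeros(u_data.size + 1)
for k, u in enumerate(u_data):
    v_data[k + 1] = step(v_data[k], u, true_drag)

# Black box part: drag(v) ~ c1*v + c2*v^2, which is linear in (c1, c2).
# Rearranging the model gives: u - MASS * (v_next - v) / DT = drag(v).
v_k, v_next = v_data[:-1], v_data[1:]
drag_samples = u_data - MASS * (v_next - v_k) / DT
features = np.column_stack([v_k, v_k**2])
theta, *_ = np.linalg.lstsq(features, drag_samples, rcond=None)


def grey_drag(v):
    """Estimated drag, combined with the known dynamics in step()."""
    return theta[0] * v + theta[1] * v**2


def final_speed(u, n_steps=200):
    """Simulate the grey box model from rest under a constant input u."""
    v = 0.0
    for _ in range(n_steps):
        v = step(v, u, grey_drag)
    return v


# Optimization problem: minimize (final speed - V_REF)^2 subject to 0 <= u <= 3.
V_REF = 2.0
candidates = np.linspace(0.0, 3.0, 301)
costs = np.array([(final_speed(u) - V_REF) ** 2 for u in candidates])
best_u = candidates[np.argmin(costs)]

print("estimated drag coefficients:", theta)
print("best constant input:", best_u)
```

The grid search only stands in for a real solver. An MPC implementation would optimize a whole input sequence at every time step, usually with a dedicated optimizer, but the structure (model, data, cost, constraints, algorithm) stays the same.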

One of the most interesting aspects of the AI/ML ecosystem is that the computations are mainly done in the cloud. Traditional tools for high-level algorithm design like MATLAB and Maple were designed more than 30 years ago. At the time, most tools were meant for personal computers. TensorFlow can be seen as the MATLAB/Maple of our time. TensorFlow has been optimized to perform linear algebra operations on CPUs or GPUs in the cloud. Currently, the main application of TensorFlow is to create and train deep learning models. I think TensorFlow can also be employed to solve a wide range of real-world problems for which the existing tools run on desktop computers.

I am very interested in developing solutions to real-world problems using grey box modeling, model based methods and cloud computing. To be clear, when I say model based methods, I don’t mean neural networks. Neural networks, like polynomials, splines and radial basis functions, are black box models in the sense that we assume we know nothing about the real system. We then try to identify/estimate the parameters of the model so that it mimics the behaviour of the real system. On the opposite side of the spectrum are white box models based on first principles, like the dynamic equations of a mobile robot or robotic manipulator. There, we assume we know the exact equations that govern the system. Everywhere in between, we have grey box models, in which we include what we know about the system and build the rest of the model from generic functions or dynamical systems such as splines, radial basis functions or different kinds of neural networks. With grey box modeling, I think there is a higher chance of finding a solution to many industrial automation problems, or of finding solutions that are much faster and cheaper than black box approaches.

In search of applications of model based methods, cloud computing and the Internet of Things, I am looking into the following potential target markets:

  • automotive industry: autonomous driving, automotive control systems, predictive maintenance, prognostics, digital twins
  • industrial automation: digital twins, predictive maintenance, data analytics
  • building automation: building management systems, digital twins, minimizing electric power, air conditioning and lighting costs
  • robotics: digital twins (virtual reality), autonomous robots, path planning, SLAM, vision

In addition, I am interested in learning more about deep learning, reinforcement learning, generative adversarial networks and other modern methods. Finding out how these methods are related to well-known model based control design methods and convex optimization is a very interesting field of research.

If you have a problem in this context or are interested in talking about model based approaches, drop me a line on LinkedIn.

Behzad Samadi
Director of Innovation and Engagement

My research interests include connected autonomous vehicles, edge computing and convex optimization.