
Master Thesis - Risk-Sensitive Decision-Making for Autonomous Driving


For sequential decision-making tasks such as autonomous driving, it is imperative that the decision-maker takes into account the uncertainty in performance that arises from imperfect learning and/or noisy outcomes. Risk-sensitive decision-making not only considers the full distribution of outcomes during learning, but also allows a more apt specification of the agent's goal than simply maximizing expected performance.

In recent years, estimating the whole distribution of outcomes in reinforcement learning (RL) has seen a revival, for example in [1], where the return distribution is estimated using a categorical distribution. Given the distribution, one can choose to optimize for a different target, such as the expected return with a penalty on the variance of the return. Historically, studies have taken the distribution of returns into account through risk measures such as the entropic risk measure [2], focusing on the uncertainty that arises from the inherent stochasticity of the model and the policy. This is termed aleatory uncertainty. More recent work [3] considers not only the aleatory uncertainty but also the parametric uncertainty that arises because the model is unknown. This is commonly termed epistemic uncertainty and is always present unless the model is fully known.
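As a rough illustration of the two objectives mentioned above, the sketch below scores a categorical return distribution with a variance-penalized mean and with the entropic risk measure. The atoms, probabilities, action names, and parameter values are made-up toy numbers, and the function names are hypothetical; none of this is taken from [1] or [2].

```python
import numpy as np


def mean_variance_score(atoms, probs, penalty):
    """Expected return minus `penalty` times the variance of the return."""
    mean = np.sum(probs * atoms)
    variance = np.sum(probs * (atoms - mean) ** 2)
    return mean - penalty * variance


def entropic_risk(atoms, probs, beta):
    """Entropic risk (1 / beta) * log E[exp(beta * Z)]; beta < 0 is risk-averse."""
    scaled = beta * atoms
    shift = np.max(scaled)  # log-sum-exp shift for numerical stability
    return (shift + np.log(np.sum(probs * np.exp(scaled - shift)))) / beta


def best_action(return_dists, atoms, score_fn):
    """Pick the action whose return distribution has the highest score."""
    return max(return_dists, key=lambda a: score_fn(atoms, return_dists[a]))


# Toy example (assumed numbers): three return atoms shared by two actions.
atoms = np.array([-10.0, 0.0, 10.0])
return_dists = {
    "cautious": np.array([0.0, 1.0, 0.0]),  # always returns 0
    "risky": np.array([0.4, 0.0, 0.6]),     # mean 2, variance 96
}

risk_neutral = best_action(
    return_dists, atoms, lambda z, p: mean_variance_score(z, p, penalty=0.0))
risk_averse = best_action(
    return_dists, atoms, lambda z, p: mean_variance_score(z, p, penalty=0.1))
entropic_choice = best_action(
    return_dists, atoms, lambda z, p: entropic_risk(z, p, beta=-0.5))
print(risk_neutral, risk_averse, entropic_choice)
```

With no penalty the agent prefers the action with the higher mean; with a variance penalty or a risk-averse entropic objective it prefers the action with the narrower return distribution.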

Conformal predictors [4] can be used to predict not only point estimates but also range estimates, which is useful in tasks concerned with robustness or safety. The modularity of the conformal predictor framework allows it to be used not only with simple models but also with large-parameter models such as neural networks, as demonstrated in [5]. In this work, conformal predictors will be used to design risk-sensitive agents for autonomous driving by taking aleatory and epistemic uncertainty into account during training and control.
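To make the idea of range estimates concrete, here is a minimal sketch of split conformal prediction with absolute residuals as the nonconformity score. It is one simple variant, not the conformalized quantile regression of [5]. The `predict` function stands in for any pretrained point predictor, and the synthetic data and all names are assumptions made for illustration.

```python
import numpy as np


def split_conformal_interval(predict, x_calib, y_calib, x_new, alpha=0.1):
    """Return (lower, upper) bounds with roughly 1 - alpha marginal coverage.

    `predict` is any point predictor mapping an array of inputs to outputs.
    """
    residuals = np.abs(y_calib - predict(x_calib))
    n = len(residuals)
    # Rank of the finite-sample-corrected (1 - alpha) quantile.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        raise ValueError("calibration set too small for this alpha")
    q = np.sort(residuals)[k - 1]
    centre = predict(x_new)
    return centre - q, centre + q


# Synthetic example (assumed data): y = 2x + Gaussian noise.
rng = np.random.default_rng(0)


def predict(x):
    """Hypothetical pretrained model that happens to know the true slope."""
    return 2.0 * x


x_calib = rng.uniform(0.0, 1.0, size=500)
y_calib = 2.0 * x_calib + rng.normal(0.0, 0.1, size=500)
x_test = rng.uniform(0.0, 1.0, size=2000)
y_test = 2.0 * x_test + rng.normal(0.0, 0.1, size=2000)

lower, upper = split_conformal_interval(predict, x_calib, y_calib, x_test)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage: {coverage:.3f}")
```

The predictor is treated as a black box, which is what makes the approach modular: the same calibration step applies whether `predict` is a linear model or a neural network.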

[1] – Marc G. Bellemare, Will Dabney, and Rémi Munos. A distributional perspective on reinforcement learning. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 449–458. JMLR.org, 2017.

[2] – O. Mihatsch and R. Neuneier. Risk-sensitive reinforcement learning. Machine learning, 49 (2-3):267–290, 2002.

[3] – Hannes Eriksson and Christos Dimitrakakis. Epistemic risk-sensitive reinforcement learning. In ESANN, pages 339–344, 2020.

[4] – Glenn Shafer and Vladimir Vovk. A tutorial on conformal prediction. Journal of Machine Learning Research, 9(3), 2008.

[5] – Yaniv Romano, Evan Patterson, and Emmanuel Candes. Conformalized quantile regression. Advances in Neural Information Processing Systems 32, pages 3543–3553, 2019.

Project Description

In this master thesis project, you will focus on:

  • Agents for control in decision-making problems using visual or object data for autonomous driving (such as TORCS, highway, CARLA)
  • Devising and quantifying risk-measures for RL agents
  • Evaluating the risk-sensitive agents on real-world data (such as internal Zenseact data or highD)


We are looking for one or two students with a background in machine learning and decision-making and some level of mathematical maturity. Courses in reinforcement learning, deep learning, and probability theory are helpful.

Further information

Please submit individual applications, each with a CV, motivational letter, and grade transcripts.

Planned start: January 2022, with some flexibility.

Final application date: 15 November 2021

Duration: 30 ECTS

For questions regarding the project, please contact: mina.alibeigi@zenseact.com, or hannes.eriksson@zenseact.com

Additional information

  • Remote status: Fully remote


Gothenburg, Sweden

Lindholmspiren 2
417 56 Göteborg

Making safe and intelligent mobility real.

At Zenseact, we lead the global movement of crafting tomorrow's mobility with the software platform of choice. Our mission is to “Make safe and intelligent mobility real, for everyone, everywhere”. This statement marks our conviction and dedication to bring autonomous driving out on the streets for real and is at the center of everything we do.

We could not dream of achieving this without our great teams of very talented people. We are on this journey together and our agile way of working is reflected throughout our entire organization; it is part of our culture and how we work, develop and grow together.

