Valentin Thomas

I recently completed my PhD at Mila, where I worked on reinforcement learning and deep learning. Note: this site is still under construction; my linked CV may be more up to date.

Email  /  CV  /  Bio  /  Google Scholar  /  Twitter  /  Github

profile photo
Research

I'm interested in reinforcement learning, deep learning and optimization. I have worked on unsupervised RL, planning, generalization in deep learning and optimization aspects of reinforcement learning. Representative papers are highlighted. Stars * indicate first authorship.

On the role of overparameterization in off-policy Temporal Difference learning with linear function approximation
Valentin Thomas*
NeurIPS 2022

We study the role of overparameterization in Temporal Difference (TD) learning and how it affects optimization. To do so, we analyze the spectrum of the Temporal Difference operator when using random features, under some assumptions on the Markov transition kernel.
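
To make the setting concrete, here is a minimal sketch of off-policy TD(0) with linear function approximation on top of fixed random features; the dimensions, step size, and feature map below are illustrative choices, not the ones analyzed in the paper.

import numpy as np

rng = np.random.default_rng(0)
d_state, n_features, gamma, lr = 10, 256, 0.99, 0.01

# Fixed random feature map: only the linear head theta is learned.
W = rng.normal(size=(d_state, n_features)) / np.sqrt(d_state)
def phi(s):
    return np.tanh(s @ W)

theta = np.zeros(n_features)

def td0_step(theta, s, r, s_next):
    # Semi-gradient TD(0): bootstrap target r + gamma * V(s'), with V(s) = phi(s) @ theta.
    td_error = r + gamma * phi(s_next) @ theta - phi(s) @ theta
    return theta + lr * td_error * phi(s)

# Example transition (s, r, s') drawn arbitrarily for illustration.
s, s_next = rng.normal(size=d_state), rng.normal(size=d_state)
theta = td0_step(theta, s, 1.0, s_next)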

The Role of Baselines in Policy Optimization
Jincheng Mei*, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvari, Dale Schuurmans
NeurIPS 2022

Using value-function baselines in on-policy stochastic natural policy gradient helps achieve convergence toward a globally optimal policy by reducing the aggressiveness of the updates rather than their variance.

Bridging the Gap Between Target Networks and Functional Regularization
Alexandre Piché*, Valentin Thomas*, Joseph Marino, Rafael Pardinas, Gian Maria Marconi, Christopher Pal, Mohammad Emtiyaz Khan
TMLR 2023
NeurIPS 2021 DeepRL workshop

We analyze the implicit regularization performed by Target Networks and show that, surprisingly, it can destabilize TD. We propose a theoretically grounded alternative, Functional Regularization, which alleviates these issues and performs well empirically.
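
A rough contrast between the two ideas, written as per-transition losses. Here V, w_target, w_prev and kappa are placeholder names, and this is only a sketch of the general idea, not the exact objective from the paper.

def target_network_loss(V, w, w_target, s, r, s_next, gamma):
    # Classic approach: bootstrap from a frozen, periodically updated copy w_target.
    target = r + gamma * V(w_target, s_next)
    return (V(w, s) - target) ** 2

def functional_regularization_loss(V, w, w_prev, s, r, s_next, gamma, kappa):
    # Sketch of the alternative: bootstrap from the online network, and instead add
    # an explicit penalty keeping V(w, .) close to a previous network's predictions
    # in function space (kappa controls the regularization strength).
    target = r + gamma * V(w, s_next)
    return (V(w, s) - target) ** 2 + kappa * (V(w, s) - V(w_prev, s)) ** 2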

Beyond variance reduction: Understanding the true impact of baselines on policy optimization
Valentin Thomas*, Wesley Chung*, Marlos C. Machado, Nicolas Le Roux
ICML 2021
blog post/ICML talk

We show empirically and theoretically that, despite common wisdom, baselines in policy gradient optimization have an effect beyond variance reduction and can impact convergence.
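
For context, here is a minimal REINFORCE-style update with a baseline: subtracting a (state-independent) baseline b does not change the expected gradient, only the distribution of the sampled updates, which is precisely the effect studied here. The interface below is purely illustrative.

import numpy as np

def pg_step(theta, grad_log_pis, returns, baseline, lr=0.05):
    # One REINFORCE step: sum_t grad log pi(a_t | s_t) * (G_t - b).
    # A constant baseline b leaves the expectation of this estimator unchanged,
    # but alters its higher moments, and hence the optimization trajectory.
    grad = np.zeros_like(theta)
    for glp, G in zip(grad_log_pis, returns):
        grad += glp * (G - baseline)
    return theta + lr * grad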

On the interplay between noise and curvature and its effect on optimization and generalization
Valentin Thomas*, Fabian Pedregosa, Bart van Merriënboer, Pierre-Antoine Manzagol, Yoshua Bengio, Nicolas Le Roux
AISTATS 2020
Oral talk at the 2020 Workshop on Theory of Deep Learning at the Institute for Advanced Study, Princeton
AISTATS talk

We show how the interplay between the local curvature of the loss (the Hessian) and the local gradient noise (the uncentered gradient covariance) can impact optimization and generalization in neural networks.
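
One standard way to see this interplay (a textbook second-order expansion, not the paper's exact statement): for an SGD step with learning rate $\eta$ and stochastic gradient $g = \nabla L(\theta) + \epsilon$, where $\mathbb{E}[\epsilon] = 0$ and $\mathrm{Cov}(\epsilon) = C$,

$$\mathbb{E}\big[L(\theta - \eta g)\big] \approx L(\theta) - \eta \|\nabla L(\theta)\|^2 + \frac{\eta^2}{2}\Big(\nabla L(\theta)^\top H \nabla L(\theta) + \operatorname{tr}(H C)\Big),$$

so the expected one-step progress depends on the Hessian $H$ both through the curvature along the gradient and through the noise term $\operatorname{tr}(HC)$.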

Planning with Latent Simulated Trajectories
Alexandre Piché, Valentin Thomas, Cyril Ibrahim, Yoshua Bengio, Julien Cornebise and Chris Pal
ICLR 2019 Workshop on Structure and Priors in Reinforcement Learning

An extension of our work "Probabilistic Planning with Sequential Monte Carlo methods" that treats the trajectory as a latent variable and uses an EM algorithm.

Probabilistic Planning with Sequential Monte Carlo methods
Valentin Thomas*, Alexandre Piché*, Cyril Ibrahim, Yoshua Bengio and Chris Pal
ICLR 2019
Contributed talk at NeurIPS 2018 workshop Infer to Control

Leveraging control as inference and Sequential Monte Carlo methods, we propose a probabilistic planning algorithm.
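
A highly simplified sketch of the control-as-inference flavour of SMC planning, where particles are candidate rollouts reweighted by an exponentiated-reward "optimality" likelihood; propose_action, step_model, and reward are placeholder callables, and this is not the paper's exact algorithm.

import numpy as np

def smc_plan(s0, propose_action, step_model, reward, horizon, n_particles, rng):
    # Each particle is a candidate rollout; weights follow exp(reward),
    # playing the role of the optimality likelihood in control as inference.
    states = np.repeat(s0[None], n_particles, axis=0)
    first_actions = None
    for t in range(horizon):
        actions = propose_action(states, rng)            # proposal policy
        next_states = step_model(states, actions, rng)   # (learned) dynamics model
        log_w = reward(states, actions, next_states)     # log optimality weights
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        idx = rng.choice(n_particles, size=n_particles, p=w)  # resample
        states = next_states[idx]
        first_actions = actions[idx] if t == 0 else first_actions[idx]
    # Act with, for instance, the average first action of the surviving particles.
    return first_actions.mean(axis=0)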

Disentangling the independently controllable factors of variation by interacting with the world
Valentin Thomas*, Emmanuel Bengio*, William Fedus*, Jules Pondard, Philippe Beaudoin, Hugo Larochelle, Joelle Pineau, Doina Precup and Yoshua Bengio
Oral at NeurIPS 2017 workshop on Learning Disentangled Representations: from Perception to Control

We draw a connection between mutual information and the intrinsic reward function (through the Donsker-Varadhan representation of the Kullback-Leibler divergence) used for jointly learning options/factors and latent representations in Independently Controllable Factors.
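
For reference, the Donsker-Varadhan representation expresses the KL divergence as a supremum over test functions, which is what makes it usable as a learnable, reward-like objective:

$$\mathrm{KL}(P \,\|\, Q) = \sup_{T} \; \mathbb{E}_{P}\big[T(X)\big] - \log \mathbb{E}_{Q}\big[e^{T(X)}\big],$$

where the supremum is over functions $T$ for which both expectations are finite; mutual information is recovered by taking $P$ to be the joint distribution and $Q$ the product of its marginals.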

Independently Controllable Factors
Valentin Thomas*, Jules Pondard*, Emmanuel Bengio*, Marc Sarfati, Philippe Beaudoin, Marie-Jean Meurs, Joelle Pineau, Doina Precup, Yoshua Bengio
Presented at the Montreal AI Symposium 2017

This work is a finalized version of Independently Controllable Features where the policies and factors are now embedded in a continuous space. We demonstrate how the learnt features can be used.

Independently Controllable Features
Emmanuel Bengio*, Valentin Thomas, Joelle Pineau, Doina Precup, Yoshua Bengio
RLDM 2017

We propose a way to jointly learn a set of discrete policies, each affecting one component of the latent state representation, for unsupervised reinforcement learning. We hypothesize that this process discovers controllable factors of variation in the world as well as how to control them.

Decoupling Backpropagation using Constrained Optimization Methods
Valentin Thomas*, Akhilesh Gotmare*, Johanni Brea and Martin Jaggi
ICML 2018 Workshop on Efficient Credit Assignment

We propose BlockProp, which lets one train deep neural networks in a model-parallel fashion, where parts of the model may reside on different devices (GPUs).

Preserving the entanglement of two qubits with feedback control
Valentin Thomas*, Pierre Rouchon
Report for a research semester in 2014 (in French)

This research project was about designing a feedback control loop using an electromagnetic field to preserve the entanglement of two qubits. This is necessary because, due to quantum decoherence, the entanglement tends to vanish, which is a major issue in developing quantum computer hardware. We proposed a simple Lyapunov-based feedback control loop.

Experience

Here you can find the internships I did during my MSc at Mines Paris and at École Normale Supérieure Paris-Saclay, and then during my PhD at the Université de Montréal.


Website template from Jon Barron (source code).