This course is an ideal entry point into the exciting field of reinforcement learning, where artificial intelligence agents learn to make sequential decisions through trial and error. Specifically, the course focuses on multi-armed bandit problems and the practical implementation of various algorithmic strategies for balancing exploration and exploitation. Any time you want to systematically make the best choice from a limited set of options over time, you are facing a multi-armed bandit problem, and this course teaches you everything you need to know to build practical AI agents that manage such situations.
Through concise explanations, this course teaches you how to confidently and painlessly translate seemingly intimidating math formulas into Python code. Not everyone is comfortable with mathematics, which is why this course deliberately keeps the math to a minimum. Even where mathematics is necessary, it is presented so that anyone with basic algebra skills can understand it, easily translate it into code, and build useful intuitions along the way.
The algorithmic strategies taught in this course include Epsilon-Greedy, Softmax Exploration, Optimistic Initialization, Upper Confidence Bounds (UCB), and Thompson Sampling. With these tools under your belt, you will be well equipped to build and deploy AI agents that can manage critical business operations under conditions of uncertainty.
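To give a flavor of what these strategies look like in code, here is a minimal sketch of the Epsilon-Greedy strategy on a simulated three-armed Bernoulli bandit. The reward probabilities, the epsilon value, and the variable names are illustrative assumptions, not code taken from the course.

```python
import numpy as np

# Minimal Epsilon-Greedy sketch on a simulated 3-armed Bernoulli bandit.
# The reward probabilities, epsilon, and number of steps are illustrative
# assumptions, not values taken from the course.

rng = np.random.default_rng(seed=0)
true_probs = [0.2, 0.5, 0.75]        # hidden success probability of each arm
n_arms = len(true_probs)
epsilon = 0.1                        # chance of exploring a random arm
q_values = np.zeros(n_arms)          # estimated value of each arm
counts = np.zeros(n_arms)            # number of pulls per arm

for step in range(1000):
    if rng.random() < epsilon:
        arm = int(rng.integers(n_arms))      # explore: random arm
    else:
        arm = int(np.argmax(q_values))       # exploit: best-known arm
    reward = float(rng.random() < true_probs[arm])   # Bernoulli reward
    counts[arm] += 1
    # incremental sample-average update of the action value
    q_values[arm] += (reward - q_values[arm]) / counts[arm]

print("Estimated action values:", q_values)
```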
To bridge the gap between theory and application, I have updated this course with a section showing how to apply MAB algorithms in robotics using LEGO Mindstorms EV3. I will soon be uploading a section showing how to apply the algorithms taught in this course to ad optimization.
What you will learn
- Understand and be able to identify multi-armed bandit problems.
- Model real business problems as MABs and implement digital AI agents to automate them.
- Understand the exploration-exploitation dilemma in RL.
- Practical implementation of different algorithmic strategies for balancing exploration and exploitation.
- Python implementation of the Epsilon-greedy strategy.
- Python implementation of the Softmax Exploration strategy.
- Python implementation of the optimistic initialization strategy.
- Python implementation of the Upper Confidence Bounds (UCB) strategy.
- Understand the challenges of RL in terms of reward function design and sample efficiency.
- Estimation of action values by incremental sampling (see the sketch after this list).
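To make the last point concrete, here is a minimal sketch of the incremental sample-average update for a single action's value. The function name and the numbers in the usage example are assumptions for illustration, not the course's own code.

```python
# Incremental sample-average estimate of an action's value:
#   Q_new = Q_old + (reward - Q_old) / n_pulls
# This keeps a running mean without storing the full reward history.

def update_action_value(q_estimate: float, reward: float, n_pulls: int) -> float:
    return q_estimate + (reward - q_estimate) / n_pulls

# Usage: the 3rd pull of an arm returns a reward of 1.0
q = update_action_value(q_estimate=0.5, reward=1.0, n_pulls=3)
print(q)  # ~0.667
```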
Who is this course for?
- Anyone with basic Python skills who wants to get started with reinforcement learning.
- Experienced AI engineers, ML engineers, data scientists, and software engineers interested in applying reinforcement learning to real-world business problems.
- Professionals interested in learning how reinforcement learning can help automate adaptive decision-making processes.
Practical Multi-Armed Bandit Algorithms in Python course specifications
- Publisher: Udemy
- Instructor: Edward Pie
- Language: English
- Subtitle: English
- Level: All levels
- Number of lessons: 13
- Duration: 3 hours and 45 minutes
Course content (as of 2022/2)
Requirements
- Be able to understand basic OOP programs in Python.
- Have basic knowledge of Numpy and Matplotlib.
- Basic algebra skills. If you know how to add, subtract, multiply, and divide numbers, you’re good to go.
Practical Multi-Armed Bandit Algorithms in Python course images
Example video
Installation guide
Extract the files and watch them with your favorite player
Subtitle: English
Quality: 720p
Download link
Password: www.downloadly.ir
File size
1.16 GB