# ANZ Reinforcement machine learning "crash" course

Welcome to the second ANZ Summer of Tech Bootcamp for 2020!

# Introduction -A

To get started, I want to introduce the [AWS DeepRacer](https://www.deepracerleague.com) league. DeepRacer is a reinforcement machine learning based global autonomous car racing league. Anyone can participate in the league online, and in-person events are held regularly as well.
# How this session will run -A

- Everyone has an opportunity to participate - This is intended to be a hands-on session. We will go over setting up your DeepRacer environment and walk you through setting up your first model.

- Asking questions - The aim of this session is to be as interactive as possible, so please don't hold questions until the end; I'm happy to answer them as we go :)

- Resources - All materials from this workshop will be publicly available on GitHub, and a link will be provided in chat at the end of our session.

- Technology - Today we'll be using AWS to get hands-on; registering for an AWS account was a pre-requisite. If you don't have one already, please register now as we start our walkthrough.

# Section 1: Training a model together

## Step 1: Let's log in to the AWS DeepRacer service and create resources -C

To get underway we'll complete this first walkthrough together as a group, just to give you an idea of how DeepRacer works. Once we've completed the walkthrough you'll have time to work on a model as a team or individually.
You should now be back in the Garage and see your vehicle.

Please expand the left-hand nav bar and select **Models**.
## Step 3: Model List Page -C

The **Models** page shows a list of all the models you have created and the status of each model. If you want to create a model, this is where you start the process. Similarly, from this page you can download, clone, and delete models. If this is the first time you are using the service and have just created your resources, you should see a few sample models in your account.



To create your model select **Create model**.
## Step 4: Create model -A

This page gives you the ability to create an RL model for AWS DeepRacer and start training the model. We are going to create a model that can be used by the AWS DeepRacer car to autonomously drive around a virtual race track. We need to select the specific race track we want to train on, specify the reward function that will be used to incentivize our desired driving behavior during training, configure hyperparameters, and specify our stopping conditions.


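To give you a concrete picture of what a reward function looks like before we fill one in, here is a minimal centerline-following sketch in the shape the DeepRacer console expects: a Python `reward_function(params)` that returns a float. The band widths below are arbitrary example values, not a tuned solution.

```python
def reward_function(params):
    """Minimal centerline-following reward sketch.

    params is the dict DeepRacer passes to the function;
    'track_width' and 'distance_from_center' are two of its keys.
    """
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    # Three bands around the centerline: the closer the car stays
    # to the center, the larger the reward (band widths are example values).
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely off track

    return float(reward)
```

You paste a function like this into the reward function editor on this page; we'll talk through how to shape it for your own driving behaviour.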
| Hyperparameter | Description |
| --- | --- |
| Batch size | The number of recent vehicle experiences sampled at random from an experience buffer and used for updating the underlying deep-learning neural network weights. If you have 5120 experiences in the buffer, and specify a batch size of 512, then ignoring random sampling, you will get 10 batches of experience. Each batch will be used, in turn, to update your neural network weights during training. Use a larger batch size to promote more stable and smooth updates to the neural network weights, but be aware of the possibility that the training may be slower. |
| Number of epochs | An epoch represents one pass through all batches, where the neural network weights are updated after each batch is processed, before proceeding to the next batch. 10 epochs implies you will update the neural network weights, using all batches one at a time, but repeat this process 10 times. Use a larger number of epochs to promote more stable updates, but expect slower training. When the batch size is small, you can use a smaller number of epochs. |
| Learning rate | The learning rate controls how big the updates to the neural network weights are. Simply put, when you need to change the weights of your policy to get to the maximum cumulative reward, how much should you shift your policy. A larger learning rate will lead to faster training, but it may struggle to converge. Smaller learning rates lead to stable convergence, but can take a long time to train. |
| Exploration | This refers to the method used to determine the trade-off between exploration and exploitation. In other words, what method should we use to determine when we should stop exploring (randomly choosing actions) and when we should exploit the experience we have built up. Since we will be using a discrete action space, you should always select CategoricalParameters. |
| Entropy | A degree of uncertainty, or randomness, added to the probability distribution of the action space. This helps promote the selection of random actions to explore the state/action space more broadly. |
| Discount factor | A factor that specifies how much the future rewards contribute to the expected cumulative reward. The larger the discount factor, the farther out the model looks to determine expected cumulative reward, and the slower the training. With a discount factor of 0.9, the vehicle includes rewards from an order of 10 future steps to make a move. With a discount factor of 0.999, the vehicle considers rewards from an order of 1000 future steps to make a move. The recommended discount factor values are 0.99, 0.999 and 0.9999. |
| Loss type | The loss type specifies the type of the objective function (cost function) used to update the network weights. The Huber and Mean squared error loss types behave similarly for small updates. But as the updates become larger, the Huber loss takes smaller increments compared to the Mean squared error loss. When you have convergence problems, use the Huber loss type. When convergence is good and you want to train faster, use the Mean squared error loss type. |

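To make the batch/epoch and discount-factor arithmetic in the table concrete, this small sketch (illustrative numbers only) computes the batch count from the table's 5120/512 example and the rough planning horizon, about 1/(1 - gamma) steps, implied by each discount factor mentioned above:

```python
# Batch count from the table's example: 5120 experiences, batch size 512.
experiences, batch_size = 5120, 512
num_batches = experiences // batch_size  # 10 batches per epoch
num_epochs = 10                          # each epoch replays all 10 batches

def horizon(gamma):
    """Rough planning horizon: rewards are effectively counted
    for about 1 / (1 - gamma) future steps."""
    return round(1.0 / (1.0 - gamma))

print(num_batches)    # 10
print(horizon(0.9))   # ~10 future steps
print(horizon(0.999)) # ~1000 future steps
```

This is why a larger discount factor slows training: the model has to account for rewards much farther into the future on every update.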
Note: 25 to 35 minutes of lab time should have elapsed by this point.
**Important to note**: once you have started training a model using a particular agent (car), the settings of the agent remain with the model, even if you go and change the agent in the Garage. Thus, changes to your agents will not affect your existing models, but will only affect future models that you start training.
# Section 2: Competing in the ANZ DeepRacer Race -A

Now that you have a model created and have done some training, complete your training of these models over the next few days. Feel free to search online for what works and what doesn't, and don't be scared to experiment with it.

We will have some prizes on the day for the winning teams, so bring your A game!

Bring your models along on race day and we can transfer them over to the car to test them out.