Unit 10: Reinforcement Learning 1 - Blocked vs. Motion (Elementary Level)

Objective

This experiment introduces reinforcement learning, an advanced type of AI learning method.  With the Reward set up properly, the robot can learn by itself.

The objective of the experiment is to train the robot to move smoothly around the arena.

Teaching - Reinforcement Learning

The principle of reinforcement learning (self-learning by the robot) is to encourage the robot to take certain actions by giving it rewards and punishments.  The robot then modifies its behaviour in order to maximize its Level, which is the average of its recent rewards.

The faster the robot goes, the more reward it gets, but if it stops or goes backward, it is punished.

To maximize its Level, the robot should go straight as often as possible and stop or go backward as little as possible.

Activity - experiment with Reinforcement Learning

Material:

An arena with 4 red walls.

AI Parameters

Use the demo parameters "Blocked vs. Motion" as shown below.

Start the experiment

Switch on the robot and connect it to your PC.  Press the <Connection> button.

With the AI parameters selected correctly, the following neural network will be shown.

Reward and Level

- Reward Motion (moving):

+100 for Forward,

+30 for Forward Left Turn,

+30 for Forward Right Turn

- Penalty:

-50 if the speed is zero (blocked) or the robot goes backward

- Level is the average of its rewards over the last 2 minutes
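
The reward rules above can be written out as a short Python sketch. This is only an illustration, not the robot software's actual code: the action names, the `reward` function, the `Level` class, and the choice to give 0 to any action not listed are all assumptions made for this example. Only the numbers (+100, +30, -50, 2 minutes) come from the list above.

```python
from collections import deque

# Reward values from the list above; the action names are hypothetical labels.
MOTION_REWARDS = {
    "forward": 100,
    "forward_left": 30,
    "forward_right": 30,
}
PENALTY = -50


def reward(action, speed):
    """Reward for one step. `speed` is the measured speed (0 means blocked)."""
    if speed == 0 or action == "backward":
        return PENALTY
    # Assumption: actions not listed above earn nothing.
    return MOTION_REWARDS.get(action, 0)


class Level:
    """Average of the rewards received during the last `window` seconds."""

    def __init__(self, window=120.0):  # 120 s = 2 minutes
        self.window = window
        self.samples = deque()  # (time in seconds, reward) pairs

    def add(self, t, r):
        self.samples.append((t, r))
        # Forget rewards that are older than the window.
        while self.samples and self.samples[0][0] < t - self.window:
            self.samples.popleft()

    def value(self):
        if not self.samples:
            return 0.0
        return sum(r for _, r in self.samples) / len(self.samples)
```

For example, one step forward (+100) followed by one blocked step (-50) gives a Level of 25, which shows how a few blocked steps quickly pull the Level down.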

 

Label the Output

Self-explanatory.

(A) Experiment 1 - Self Drive without Learning

- click the <reset learning> button once

Refer to the diagram below to set the buttons:

- turn off the <learning> button

- turn off the <exploration> button

- click the <self drive> button

- watch how the robot behaves

Discussion

- describe how the robot behaves

- what happens to the Reward and Level?

- does learning happen?

 

(B) Experiment 2 - Learning without Exploration

- click the <reset learning> button once

Refer to the diagram below to set the buttons:

- click the <learning> button

- turn off the <exploration> button

- click the <self drive> button

- watch how the robot behaves

Discussion

- describe how the robot behaves

- what happens to the Reward and Level?

- does learning happen?

- is the robot satisfied with the Reward and the Level?

- are you satisfied with the Reward and the Level?

 

(C) Experiment 3 - Learning with Exploration

- click the <reset learning> button once

Refer to the diagram below to set the buttons:

- click the <learning> button

- click the <exploration> button

- click the <self drive> button

- watch how the robot behaves

- can the robot get a high reward, for example +100?

- when the arrow turns blue, the robot is exploring.  Have you observed any exploration?

- when the Level reaches 50, turn off (stop) the <learning> button
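
The difference between Experiment 2 and Experiment 3 can be sketched in a few lines of Python. The worksheet does not say which exploration rule the robot software uses, so this sketch assumes a common one (often called epsilon-greedy): most of the time the robot picks the action it currently believes is best, and occasionally it tries a random action instead. The names `ACTIONS`, `choose_action`, `estimates`, and `epsilon` are hypothetical.

```python
import random

# Hypothetical action labels, used only for this illustration.
ACTIONS = ["forward", "forward_left", "forward_right", "backward"]


def choose_action(estimates, exploration, epsilon=0.1, rng=random):
    """Return (action, explored).

    `estimates` maps each action to the reward the robot currently expects.
    With exploration on, a random action is tried with probability `epsilon`
    (an assumed rule, not necessarily the one used by the robot software).
    """
    if exploration and rng.random() < epsilon:
        return rng.choice(ACTIONS), True
    # Otherwise take the action with the highest expected reward.
    best = max(ACTIONS, key=lambda a: estimates[a])
    return best, False


# The robot has tried turning left (+30) but has never tried going forward.
estimates = {"forward": 0.0, "forward_left": 30.0,
             "forward_right": 0.0, "backward": -50.0}
print(choose_action(estimates, exploration=False))
```

With exploration off, this sketch always chooses the left turn, because +30 is the best reward it knows about, and it never finds out that going forward earns +100. With exploration on, an occasional random try gives it the chance to discover the better action.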

Testing

- Turn off the <learning> button.  Click the <self drive> button.  The robot will use the learned intelligence to move around.

- What is the highest Level the robot can achieve?

- Does the learned intelligence work well and keep the robot moving smoothly?

 

Discussion

Can the robot move around smoothly?

Has learning been achieved?  Has the robot gained intelligence after the self learning?

Discuss the concept of reinforcement learning.

Do you think exploration is important for self-learning?

What is important in order to make reinforcement learning successful?

 

**Students who want to learn more about reinforcement learning and neural networks can continue to the Intermediate Level.