Coding and STEM

Unit 4: Computer Vision 2 - Photo Recognition of Dogs & Cats

Objective

In this unit we will apply computer vision to classify and recognize photographs of dogs and cats. Although the task is simple, similar AI applications are now very common in everyday life.

Teaching (Basic Concept of Computer Vision)

In the AlphAI camera, we call the zones "pixels". You can set the camera to different resolutions: 2 x 1, 4 x 3, 8 x 6, etc. The camera below is set to 4 x 3 pixels.

Note that these pixel values become the inputs to the neural network for further processing.

We have explained the basic concept of computer vision. What the machine sees is a digital image captured by the camera and divided into pixels (or zones). The image below has 12 pixels (or zones): it was captured by the AlphAI camera at 4 x 3 pixels.

Each pixel box contains a number, for example 249, 156 and 248 in column 1.

These numbers represent the intensity of light captured in each pixel. They range from 0 to 255.

The bigger the number, the brighter the pixel. For example, 249 and 250 are bright.

The smaller the number, the darker the pixel (0 is black). For example, 156 and 90 are darker.

** Note that these numbers, which range from 0 to 255, are usually converted to the range 0 to 1.00 before further processing by the machine.
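The conversion from 0-255 to 0-1 can be sketched in a few lines of Python. This is only an illustration, not AlphAI's own code: the grid layout and the `normalize` helper are assumptions, and only the values in column 1 (249, 156, 248) come from the text above.

```python
# Hypothetical 4 x 3 grid of light intensities (3 rows, 4 columns).
# Column 1 uses the values quoted in the text (249, 156, 248);
# all other values are made up for illustration.
grid = [
    [249, 250, 120, 90],
    [156, 200, 60, 30],
    [248, 180, 100, 10],
]


def normalize(grid):
    """Scale every intensity from the 0-255 range to the 0-1 range."""
    return [[value / 255 for value in row] for row in grid]


scaled = normalize(grid)
for row in scaled:
    # Round only for display; the stored values keep full precision.
    print([round(value, 2) for value in row])
```

Dividing by 255 maps the brightest possible pixel to 1 and a black pixel to 0, so bright pixels end up close to 1 and dark pixels close to 0.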

====

Photo recognition requires more pixels (for example, 32 x 24 or an even higher resolution).

If the photos are in colour, it is better to choose the 3-colour setting for the camera.

Activity 1 - train the robot to classify photos of dogs and cats

Material:

We use 4 (or more) large photos (larger than 10 cm x 10 cm) as inputs for the AI training. Students can be asked to collect these photos before the lesson. Some examples are given below.

To ensure the success of this activity, please pay attention to the points below:

- Place the photos at the same distance and angle from the robot camera each time. A stable photo stand is recommended, and a flat, hard card helps the photo stand upright.
- The main object (e.g. the head of the cat or dog) should occupy most (e.g. 80%) of the image.
- Keep the background simple and unchanged. Changing the background between photos will reduce the accuracy of training.
- Train each photograph 4 to 5 times, moving it slightly to the left, right, up and down each time. This makes matching easier when you test or use the AI later.
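The last point, training the same photograph several times with small movements, has a software counterpart: making shifted copies of one image. The sketch below is an analogy only, since in this activity the physical photo is moved in front of the camera. The `shift` helper and the tiny `photo` grid are hypothetical.

```python
def shift(image, dx, dy, fill=0):
    """Return a copy of `image` moved dx columns right and dy rows down.

    `image` is a list of rows. Negative dx or dy moves left or up.
    Pixels that slide in from outside the image get the value `fill`.
    """
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x, src_y = x - dx, y - dy
            if 0 <= src_x < width and 0 <= src_y < height:
                shifted[y][x] = image[src_y][src_x]
    return shifted


# A made-up 4 x 3 image with a bright object in the middle.
photo = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]

# The original plus four slightly moved copies: left, right, up, down.
moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
variants = [photo] + [shift(photo, dx, dy) for dx, dy in moves]
```

Training on all five variants exposes the network to the same object at slightly different positions, which is the reason the text gives for repeating each photo.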

AI Parameter Setup

Download the parameter file from...

U04a Classification (Dogs and Cats).json

Load the parameter file "U04a Classification (Dogs and Cats).json"

OR (for advanced students) follow the instructions below to set up the parameters.

Sensors

camera 32 x 24 pixels, 3-colour

Actions

2 customized outputs
- "Dogs "
- "Cats "

A.I.

Learning method - supervised learning
Algorithm - neural network
Hidden Layer - none
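With no hidden layer, every input connects directly to every output. A minimal sketch of such a network, assuming each output is a weighted sum of the inputs plus a bias and the largest output wins, might look like the following. The function names, weights and input values are made up for illustration; AlphAI's internal implementation may differ.

```python
def forward(inputs, weights, biases):
    """Compute one value per output: weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def classify(inputs, weights, biases, labels):
    """Return the label whose output value is the largest."""
    outputs = forward(inputs, weights, biases)
    return labels[outputs.index(max(outputs))]


labels = ["Dogs", "Cats"]
inputs = [0.9, 0.1, 0.4]  # three made-up normalized pixel values
weights = [
    [1.0, -1.0, 0.0],  # connections into the "Dogs" output
    [-1.0, 1.0, 0.0],  # connections into the "Cats" output
]
biases = [0.0, 0.0]

print(forward(inputs, weights, biases))
print(classify(inputs, weights, biases, labels))
```

In the real set-up there would be one weight per camera input for each of the two outputs, rather than the three inputs used here.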

Visualization

Animation
Connection activity

Start the experiment

Switch on and connect the robot to your PC.

With the AI parameters set up correctly, the following will be shown.

Training

- click the <reset learning> button once

- click the <learning> button

- put the photo on the stand.

Look at the camera image on the computer screen.
If the photo is shown correctly, click the output for the classification that you want to teach the robot.
Wait until thick yellow connection(s) are pointing to the desired output.
The neuron of the desired output should also turn yellow, and its value should increase until it is the largest.
Move the same photo slightly to the left (or right, up or down) and click the output for the correct classification again. Repeat this step 4 times.

- repeat the above training for all photos.
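The training steps above, where each click makes the desired output grow until it is the largest, can be imitated with a simple supervised update rule. The sketch below uses the classic delta rule as a stand-in. It is an assumption for illustration and not necessarily the rule AlphAI uses. The photo values, learning rate and function names are hypothetical.

```python
def outputs_for(inputs, weights):
    """Weighted sum of the inputs for each output (no bias, no hidden layer)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def train_step(inputs, target, weights, rate=0.5):
    """One 'click': push the target output toward 1 and the others toward 0."""
    for k, row in enumerate(weights):
        output = sum(w * x for w, x in zip(row, inputs))
        error = (1.0 if k == target else 0.0) - output
        for i, x in enumerate(inputs):
            row[i] += rate * error * x


dog_photo = [0.9, 0.2, 0.1]  # made-up normalized pixel values
cat_photo = [0.1, 0.3, 0.8]
weights = [[0.0] * 3 for _ in range(2)]  # row 0 = "Dogs", row 1 = "Cats"

# Repeat the training for every photo, as in the steps above.
for _ in range(50):
    train_step(dog_photo, 0, weights)
    train_step(cat_photo, 1, weights)

print(outputs_for(dog_photo, weights))
print(outputs_for(cat_photo, weights))
```

After training, the "Dogs" output is the largest for the dog photo and the "Cats" output is the largest for the cat photo, which mirrors the growing output value described in the steps.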

Testing

- Switch off the <learning> button.

- Click the <self drive> button. The robot will use the learned intelligence to recognize the photos.

- test with all the photos.

- Are the results correct?

- If the results are not 100% correct, you may need to re-train some of the photos.
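One way to keep track of the test results is to write down what the robot answered for each photo and compute the share of correct answers. The sketch below assumes the results are recorded by hand. The photo names and answers are invented examples, not measured results.

```python
def accuracy(predictions, expected):
    """Return the fraction of predictions that match the expected labels."""
    correct = sum(1 for p, e in zip(predictions, expected) if p == e)
    return correct / len(expected)


def photos_to_retrain(names, predictions, expected):
    """Return the names of the photos that were classified wrongly."""
    return [n for n, p, e in zip(names, predictions, expected) if p != e]


# Invented example: four photos, one of them misclassified.
names = ["dog 1", "dog 2", "cat 1", "cat 2"]
expected = ["Dogs", "Dogs", "Cats", "Cats"]
predictions = ["Dogs", "Cats", "Cats", "Cats"]

print(accuracy(predictions, expected))                  # 0.75
print(photos_to_retrain(names, predictions, expected))  # ['dog 2']
```

Any photo in the second list is a candidate for re-training.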

Discussion

How many input pixels are there?

Do you think a larger number of pixels will help the classification result?

Do you think the 3-colour setting will help the classification result?

Answer

How many input pixels are there? 32 x 24 = 768 pixels. With 3 colours per pixel, the neural network receives 768 x 3 = 2,304 input values.

Do you think a larger number of pixels will help the classification result? Yes.

Do you think the 3-colour setting will help the classification result? Yes.
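The arithmetic behind the first answer can be checked with a short calculation. The sketch assumes that "3-colour" means three values per pixel. The `input_count` helper is hypothetical.

```python
def input_count(width, height, colours=1):
    """Number of values sent to the neural network for one camera image."""
    return width * height * colours


print(input_count(4, 3))       # 12
print(input_count(32, 24))     # 768
print(input_count(32, 24, 3))  # 2304

# With no hidden layer and 2 outputs, every input connects to both outputs.
connections = input_count(32, 24, 3) * 2
print(connections)             # 4608
```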

(Extended) Activity 2 - train the robot to recognize the photos of dogs and cats

Can AlphAI be trained to recognize the photos of dogs and cats?

Material

- use the same 4 photos of dogs and cats
- you can extend the activity to more photos (e.g. 6 or 8)

AI Parameter Setup

Download the parameter file from ...

U04b Recognition (Dogs and Cats).json

Load the parameter file "U04b Recognition (Dogs and Cats).json"