Unit 2: Computer Vision 1

Objective

This unit introduces the basic concepts of computer vision. Computer vision is an important source of input for AI applications. Understanding it will let you carry out many AI experiments easily and in an engaging way.

With the knowledge of computer vision, we will conduct 2 simple AI activities. 

Teaching (Basic Concept of Computer Vision)

Computer vision enables computers and systems to derive meaningful information from digital images and to take actions based on that information. If AI enables computers to think, computer vision enables them to see.

Computer vision works much the same as human vision, but human sight has the advantage of lifetimes of context in which to learn how to tell objects apart, how far away they are, whether they are moving, and whether there is something wrong in an image.

Computer vision trains machines to perform these functions, but it has to do so in much less time, using cameras, data and algorithms rather than retinas, optic nerves and a visual cortex.

The following shows how a computer reads digital images. The setting is grayscale (black and white). The computer divides the image into zones. How many zones are there in the setup below?

In AlphAI, we call the zones "pixels". You can set the camera to different resolutions: 2 x 1, 4 x 3, 8 x 6, etc. The camera above is set to 4 x 3 pixels.
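To make the idea of zones concrete, here is a minimal sketch that averages the brightness inside each zone of a small grayscale image to produce a 4 x 3 grid. The function name `downsample` and the sample image are made up for this example, and this is not AlphAI's own code.

```python
def downsample(image, cols, rows):
    """Average a grayscale image (list of rows, values 0-255) into cols x rows zones."""
    height = len(image)
    width = len(image[0])
    zone_h = height // rows   # source pixels per zone, vertically
    zone_w = width // cols    # source pixels per zone, horizontally
    zones = []
    for r in range(rows):
        zone_row = []
        for c in range(cols):
            total = 0
            for y in range(r * zone_h, (r + 1) * zone_h):
                for x in range(c * zone_w, (c + 1) * zone_w):
                    total += image[y][x]
            # Integer average brightness of this zone
            zone_row.append(total // (zone_h * zone_w))
        zones.append(zone_row)
    return zones

# Hypothetical 8 x 6 image: dark background (0) with a bright vertical bar (255).
image = [[255 if 2 <= x <= 3 else 0 for x in range(8)] for y in range(6)]
zones = downsample(image, cols=4, rows=3)
for row in zones:
    print(row)
```

Each printed row is one row of zones. The bright bar ends up in the second zone of every row.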

Look at the examples below.

The higher the resolution (the more pixels), the more accurately the machine can recognize the shape and details of the image. However, a higher resolution requires more computer storage and processing power.
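The trade-off can be checked with simple arithmetic. The short sketch below counts how many values the computer has to handle for each resolution listed earlier, assuming one value per pixel in grayscale and three values per pixel in color. The helper name `input_count` is made up for this example.

```python
def input_count(cols, rows, channels=1):
    """Number of values in one camera frame (channels=1 grayscale, 3 color)."""
    return cols * rows * channels

for cols, rows in [(2, 1), (4, 3), (8, 6)]:
    gray = input_count(cols, rows)
    color = input_count(cols, rows, channels=3)
    print(f"{cols} x {rows}: {gray} grayscale values, {color} color values")
```

Doubling the width and the height multiplies the number of values by four.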

The last example is a color image. AlphAI converts the RGB values into another 3-channel color standard that is closer to human perception.
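This unit does not say which color standard AlphAI uses. As an illustration only, the sketch below uses Python's standard `colorsys` module to convert RGB into YIQ, one well-known encoding that separates brightness (Y) from color information (I and Q). The choice of YIQ and the helper name `to_luma_chroma` are assumptions for this example.

```python
import colorsys

def to_luma_chroma(r, g, b):
    """Convert 8-bit RGB to YIQ: Y is brightness, I and Q carry the color."""
    # colorsys expects values between 0.0 and 1.0
    return colorsys.rgb_to_yiq(r / 255, g / 255, b / 255)

samples = [("white", (255, 255, 255)), ("red", (255, 0, 0)), ("black", (0, 0, 0))]
for name, rgb in samples:
    y, i, q = to_luma_chroma(*rgb)
    print(f"{name}: Y={y:.2f} I={i:.2f} Q={q:.2f}")
```

White and black differ only in the Y channel, which is why a brightness channel alone is enough for a grayscale camera setting.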

Note that these pixel values become the inputs to the neural network for further processing.
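A minimal sketch of this step: the 4 x 3 grid of pixel values is flattened into one list of 12 numbers, one per network input. Scaling the values to the range 0 to 1 is an assumption made for this example, as are the function name `to_inputs` and the sample grid.

```python
def to_inputs(zones):
    """Flatten a grid of 0-255 pixel values into a list of inputs from 0.0 to 1.0."""
    return [value / 255 for row in zones for value in row]

# Hypothetical 4 x 3 grayscale frame: a bright vertical bar on a dark background.
zones = [
    [0, 255, 0, 0],
    [0, 255, 0, 0],
    [0, 255, 0, 0],
]
inputs = to_inputs(zones)
print(len(inputs))
print(inputs)
```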

Activity 1 - Train the robot to recognize the numbers 1, 4, and 9

Material:

3 cards (about 10 cm x 10 cm), one for each of the numbers 1, 4, and 9

Setup:

(1) The robot is placed on top of small solid blocks so that the wheels can spin freely in the air without moving the robot.

(2) The robot (the camera) is placed 4 to 5 cm in front of a stand (or a wall).

AI Setting

Sensors

  • camera 4 x 3 pixels, grayscale

Actions

  • 3 outputs
  • 1 go forward, and 2 forward turns (left and right)

A.I.

  • Learning method - supervised learning
  • Algorithm - neural network
  • Hidden Layer - none
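With these settings, every one of the 12 camera pixels is connected directly to every one of the 3 outputs. The sketch below shows that structure in plain Python. It is only an illustration under stated assumptions: the weight value is made up, bias terms are left out for brevity, and AlphAI's internal implementation may differ.

```python
N_INPUTS = 4 * 3   # one input per camera pixel
ACTIONS = ["go forward", "right forward turn", "left turn"]

# One weight per connection from each input pixel to each output action.
weights = [[0.0] * N_INPUTS for _ in ACTIONS]
n_connections = sum(len(row) for row in weights)
print(n_connections)  # 12 inputs x 3 outputs = 36 connections

def output_scores(inputs, weights):
    """Weighted sum of the inputs for each output (no hidden layer)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

# Made-up weight: a bright pixel at position 1 votes for "go forward".
weights[0][1] = 1.0
inputs = [0.0] * N_INPUTS
inputs[1] = 1.0

scores = output_scores(inputs, weights)
best = ACTIONS[scores.index(max(scores))]
print(best)
```

The output with the highest score is the action chosen. Training consists of adjusting the 36 weights.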

Visualization

  • Animation
  • Connection activity

 

Labeled Outputs

Start the experiment

Switch on and connect the robot to your PC.

With the AI parameters set up correctly, the following will be shown.

 

Training

- Click the <reset learning> button once.

- Click the <learning> button.

- Put the card "4" on the stand. Look at the camera view on the computer screen. If the card is detected correctly, click the output according to the label chart (right forward turn).

- Put the card "1" on the stand. Look at the camera view on the computer screen. If the card is detected correctly, click the output according to the label chart (go forward).

- Put the card "9" on the stand. Look at the camera view on the computer screen. If the card is detected correctly, click the output according to the label chart (left turn).

- Repeat the above training 4 to 5 times for each card, with the card moved a little to each side horizontally and tilted slightly.
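The training steps above can be sketched in code. Each click gives the network one labeled example (a camera frame plus the correct output), and the weights are nudged toward that answer. The sketch below is a simplified stand-in, not AlphAI's actual algorithm: the 4 x 3 patterns are made up, the softmax output and learning rate are assumptions, bias terms are omitted, and the small shifts and tilts of the card are not simulated.

```python
import math

ACTIONS = ["go forward", "right forward turn", "left turn"]

# Made-up 4 x 3 brightness patterns standing in for the camera view of each card.
CARDS = {
    "1": [0, 1, 0, 0,
          0, 1, 0, 0,
          0, 1, 0, 0],
    "4": [1, 0, 1, 0,
          1, 1, 1, 0,
          0, 0, 1, 0],
    "9": [0, 1, 1, 1,
          0, 1, 1, 1,
          0, 0, 0, 1],
}
LABELS = {"1": 0, "4": 1, "9": 2}  # index into ACTIONS, following the label chart

def predict(x, weights):
    """Return one probability per action (softmax of the weighted sums)."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(x, label, weights, rate=0.5):
    """One supervised update: move the weights toward the labeled action."""
    probs = predict(x, weights)
    for k, row in enumerate(weights):
        error = probs[k] - (1.0 if k == label else 0.0)
        for j, xj in enumerate(x):
            row[j] -= rate * error * xj

weights = [[0.0] * 12 for _ in ACTIONS]
for _ in range(5):                 # repeat the training 5 times per card
    for card in CARDS:
        train_step(CARDS[card], LABELS[card], weights)

for card, x in CARDS.items():
    probs = predict(x, weights)
    print(card, "->", ACTIONS[probs.index(max(probs))])
```

Showing the card in slightly different positions, as in the last training step, gives the network more varied examples so that it does not depend on the card sitting in exactly one place.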

 

Testing

- Switch off the <learning> button.

- Click the <self drive> button. The robot will use what it has learned to recognize the card.

- Test with the 3 cards.

- Are the results correct?

- If the results are not 100% correct, you may need to re-train some of the cards.

 

Discussion

How many input pixels are there?

How do the input pixels help the robot to recognize the numbers?

 

Ideas for Extended Activities

(1) Recognize the numbers 1 to 9

(2) Recognize letters of the alphabet (e.g. A, B, C, D, E)