Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #131 from innovationOUtside/wip
Update week 8
- Loading branch information
Showing
13 changed files
with
2,324 additions
and
4,095 deletions.
There are no files selected for viewing
277 changes: 45 additions & 232 deletions
277
...i-agent systems/.md/08.1 Introducing remote services and multi-agent systems.md
Large diffs are not rendered by default.
Oops, something went wrong.
259 changes: 259 additions & 0 deletions
259
...nt systems/.md/08.2 Collecting digit image and class data from the simulator.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,259 @@ | ||
--- | ||
jupyter: | ||
jupytext: | ||
formats: ipynb,.md//md | ||
text_representation: | ||
extension: .md | ||
format_name: markdown | ||
format_version: '1.2' | ||
jupytext_version: 1.6.0 | ||
kernelspec: | ||
display_name: Python 3 | ||
language: python | ||
name: python3 | ||
--- | ||
|
||
## 2. Collecting digit image and class data from the simulator | ||
|
||
If we wanted to collect image data from the background and then train a network using those images, we would need to generate the training label somehow. We could do this manually, looking at each image and then by observation recording the digit value, associating it with the image location co-ordinates. But could we also encode the digit value explicitly somehow? | ||
|
||
If you look carefully at the *MNIST_Digits* background in the simulator, you will see that alongside each digit is a solid coloured area. This area is a greyscale value that represents the value of the digit represented by the image. That is, it represents a training label for the digit. | ||
|
||
<!-- #region tags=["alert-success"] --> | ||
*The greyscale encoding is quite a crude encoding method that is perhaps subject to noise. Another approach might be to use a simple QR code to encode the digit value.* | ||
<!-- #endregion --> | ||
|
||
As usual, load in the simulator in the normal way: | ||
|
||
```python | ||
from nbev3devsim.load_nbev3devwidget import roboSim, eds | ||
|
||
%load_ext nbev3devsim | ||
``` | ||
|
||
Clear the datalog just to ensure we have a clean datalog to work with: | ||
|
||
```python | ||
%sim_data --clear | ||
``` | ||
|
||
The solid greyscale areas are arranged so that when the left light sensor is over the image, the right sensor is over the training label area. | ||
|
||
```python | ||
%%sim_magic_preloaded -b MNIST_Digits -O -R -AH -x 400 -y 50 | ||
|
||
#Sample the light sensor reading | ||
sensor_value = colorLeft.reflected_light_intensity | ||
|
||
# This is essentially a command invocation | ||
# not just a print statement! | ||
print("image_data both") | ||
``` | ||
|
||
We can retrieve the last pair of images from the `roboSim.image_data()` dataframe using the `get_sensor_image_pair()` function: | ||
|
||
```python | ||
from nn_tools.sensor_data import zoom_img | ||
from nn_tools.sensor_data import get_sensor_image_pair | ||
|
||
# The sample pair we want from the logged image data | ||
pair_index = -1 | ||
|
||
left_img, right_img = get_sensor_image_pair(roboSim.image_data(), | ||
pair_index) | ||
|
||
zoom_img(left_img), zoom_img(right_img) | ||
|
||
``` | ||
|
||
<!-- #region tags=["alert-success"] --> | ||
The image labels are encoded as follows: | ||
|
||
`greyscale_value = 25 * digit_value` | ||
<!-- #endregion --> | ||
|
||
One way of decoding the label is as follows: | ||
|
||
- divide each of the greyscale pixel values collected from the right hand sensor array by 25; | ||
- take the median of these values and round to the nearest integer; *in a noise free environment, using the median should give a reasonable estimate of the dominant pixel value in the frame.* | ||
- ensure we have an integer by casting the result to an integer. | ||
|
||
The *pandas* package has some operators that can help us with that if we put all the data into a *pandas* *Series* (essentially, a single column dataframe): | ||
|
||
```python | ||
import pandas as pd | ||
|
||
def get_training_label_from_sensor(img): | ||
"""Return a training class label from a sensor image.""" | ||
# Get the pixels data as a pandas series | ||
# (similar to a single column dataframe) | ||
image_pixels = pd.Series(list(img.getdata())) | ||
|
||
# Divide each value in the first column (name: 0) by 25 | ||
image_pixels = image_pixels / 25 | ||
|
||
# Find the median value | ||
pixels_median = image_pixels.median() | ||
|
||
# Find the nearest integer and return it | ||
return int( pixels_median.round(0)) | ||
|
||
# Try it out | ||
get_training_label_from_sensor(right_img) | ||
``` | ||
|
||
The following function will grab the right and left images from the data log, decode the label from the right hand image, and return the handwritten digit from the left light sensor along with the training label: | ||
|
||
```python | ||
def get_training_data(raw_df, pair_index): | ||
"""Get training image and label from raw data frame.""" | ||
|
||
# Get the left and right images | ||
# at specified pair index | ||
left_img, right_img = get_sensor_image_pair(raw_df, | ||
pair_index) | ||
|
||
# Find the training label value as the median | ||
# value of the right habd image. | ||
# Really, we should properly try to check that | ||
# we do have a proper training image, for example | ||
# by encoding a recognisable pattern | ||
# such as a QR code | ||
training_label = get_training_label_from_sensor(right_img) | ||
return training_label, left_img | ||
|
||
|
||
# Try it out | ||
label, img = get_training_data(roboSim.image_data(), | ||
pair_index) | ||
print(f'Label: {label}') | ||
zoom_img(img) | ||
``` | ||
|
||
<!-- #region tags=["alert-danger"] --> | ||
We're actually taking quite a lot on trust in extracting the data from the dataframe in this way. Ideally, we would have a unique identifiers that reliably associate the left and right images as having been sampled from the same location. As it is, we assume the left and right image datasets appear in that order, one after the other, so we can count back up the dataframe to collect different pairs of data. | ||
<!-- #endregion --> | ||
|
||
Load in our previously trained MLP classifier: | ||
|
||
```python | ||
# Load model | ||
from joblib import load | ||
|
||
MLP = load('mlp_mnist14x14.joblib') | ||
``` | ||
|
||
We can now test that image against the classifier: | ||
|
||
```python | ||
from nn_tools.network_views import image_class_predictor | ||
|
||
image_class_predictor(MLP, img) | ||
``` | ||
|
||
<!-- #region activity=true --> | ||
### 2.3.1 Activity — Testing the ability to recognise images slight off-center in the image array | ||
|
||
Write a simple program to collect sample data at a particular location and then display the digit image and the decoded label value. | ||
|
||
Modify the x or y co-ordinates used to locate the robot by by a few pixel values away from the sampling point origins and test the ability of the network to recognise digits that are lightly off-center in the image array. | ||
|
||
How well does the network perform? | ||
|
||
*Hint: when you have run your program to collect the data in the simulator, run the `get_training_data()` with the `roboSim.image_data()` to generate the test image and retrieve its decoded training label.* | ||
|
||
*Hint: use the `image_class_predictor()` function with the test image to see if the classifier can recognise the image.* | ||
|
||
*Hint: if you seem to have more data in the dataframe than you thought you had collected, did you remember to clear the datalog before collecting your data?* | ||
<!-- #endregion --> | ||
|
||
```python | ||
# Your code here | ||
``` | ||
|
||
<!-- #region student=true --> | ||
*Record your observations here.* | ||
<!-- #endregion --> | ||
|
||
<!-- #region activity=true --> | ||
### 2.3.2 Activity — Collecting image sample data from the *MNIST_Digits* background (optional) | ||
|
||
In this activity, you will need to collect a complete set of sample data from the simulator to test the ability of the network to correctly identify the handwritten digit images. | ||
|
||
Recall that the sampling positions are arranged along rows 100 pixels apart, starting at x=100 and ending at x=2000; | ||
along columns 100 pixels apart, starting at y=50 and ending at y=1050. | ||
|
||
Write a program to automate the collection of data at each of these locations. | ||
|
||
How would you then retrieve the hand written digit image and it's decoded training label? | ||
|
||
*Hint: import the `time` package and use the `time.sleep` function to provide a short delay between each sample collection. You may also find it convenient to import the `trange` function to provide a progress bar indicator when iterating through the list of collection locations: `from tqdm.notebook import trange`.* | ||
<!-- #endregion --> | ||
|
||
<!-- #region student=true --> | ||
*Your program design notes here.* | ||
<!-- #endregion --> | ||
|
||
```python student=true | ||
# Your program code | ||
``` | ||
|
||
<!-- #region student=true --> | ||
*Describe here how you would retrieve the hand written digit image and it's decoded training label.* | ||
<!-- #endregion --> | ||
|
||
<!-- #region activity=true heading_collapsed=true --> | ||
#### Example solution | ||
|
||
*Click on the arrow in the sidebar or run this cell to reveal an example solution.* | ||
<!-- #endregion --> | ||
|
||
<!-- #region activity=true hidden=true --> | ||
To collect the data, I use two `range()` commands, one inside the other, to iterate through the *x* and *y* coordinate values. The outer loop generates the *x* values and the inner loop generates the *y* values: | ||
<!-- #endregion --> | ||
|
||
```python activity=true hidden=true | ||
# Make use of the progress bar indicated range | ||
from tqdm.notebook import trange | ||
import time | ||
|
||
# Clear the datalog so we know it's empty | ||
%sim_data --clear | ||
|
||
|
||
# Generate a list of integers with desired range and gap | ||
min_value = 50 | ||
max_value = 1050 | ||
step = 100 | ||
|
||
for _x in trange(100, 501, 100): | ||
for _y in range(min_value, max_value+1, step): | ||
|
||
%sim_magic -R -x $_x -y $_y | ||
# Give the data time to synchronise | ||
time.sleep(1) | ||
``` | ||
|
||
<!-- #region activity=true hidden=true --> | ||
We can now grab and view the data we have collected: | ||
<!-- #endregion --> | ||
|
||
```python activity=true hidden=true | ||
training_df = roboSim.image_data() | ||
training_df | ||
``` | ||
|
||
<!-- #region activity=true hidden=true --> | ||
The `get_training_data()` function provides a convenient way of retrieving the handwritten digit image and the decoded training label. | ||
<!-- #endregion --> | ||
|
||
```python activity=true hidden=true | ||
label, img = get_training_data(training_df, pair_index) | ||
zoom_img(img), label | ||
``` | ||
|
||
## 2.4 Summary | ||
|
||
In this notebook, you have automated the collection of hand-written digit and encoded label image data from the simulator ad seen how this can be used to generate training data made up of scanned handwritten digit and image label pairs. In principle, we could use the image and test label data collected in this way as a training data set for an MLP or convolutional neural network. | ||
|
||
The next notebook in the series is optional and demonstrates the performance of a CNN on the MNIST dataset. The required content continues with a look at how we can start to collect image data using the simulated robot whilst it is on the move. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.