Next frame prediction¶
The project consists of predicting the next frame in a short movie sequence.
For this project, we shall use the Moving MNIST dataset, composed of 10,000 video sequences, each consisting of 20 frames. In each video sequence, two digits move independently around the frame, which has a spatial resolution of 64×64 pixels. The digits frequently intersect with each other and bounce off the edges of the frame.
While each sequence has a length of 20 frames, you are supposed to use only 3 consecutive frames as input and predict the next one.
The metric used to evaluate the quality of the predicted frame is the Mean Squared Error (MSE).
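For reference, here is a minimal sketch of how the per-frame MSE could be computed; predicted_frame and true_frame are placeholder names for two 64×64 arrays of pixel values in [0, 1].
import numpy as np

#mean squared error between a predicted frame and its ground truth,
#averaged over all pixels (placeholder variable names)
mse = np.mean((predicted_frame - true_frame) ** 2)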
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow import keras
from keras import layers
import math
import matplotlib.pyplot as plt
import numpy as np
ds = tfds.as_numpy(tfds.load(
'moving_mnist',
split='test',
batch_size=-1
))
sequences = ds['image_sequence']
WARNING:absl:You use TensorFlow DType <dtype: 'uint8'> in tfds.features This will soon be deprecated in favor of NumPy DTypes. In the meantime it was converted to uint8.
WARNING:absl:`FeatureConnector.dtype` is deprecated. Please change your code to use NumPy with the field `FeatureConnector.np_dtype` or use TensorFlow with the field `FeatureConnector.tf_dtype`.
The dataset is composed of 10,000 sequences of 20 frames each. Each (grayscale) frame has dimension 64×64.
sequences = np.squeeze(np.swapaxes(sequences, 1, 4),axis=1)/255.
print(sequences.shape)
print(np.min(sequences),np.max(sequences))
(10000, 64, 64, 20) 0.0 1.0
Let us split the dataset into training, validation, and test sets. You are supposed to evaluate the performance of your model using the MSE over the full test set.
trainset = sequences[:8000]
valset = sequences[8000:9000]
testset = sequences[9000:]
Here is a simple generator creating the input sequences of 3 frames and the expected output, namely the next frame.
def image_generator(dataset, batchsize=16, seqlen=4):
    while True:
        batch_x = np.zeros((batchsize, 64, 64, seqlen-1))
        batch_y = np.zeros((batchsize, 64, 64, 1))
        ran = np.random.randint(dataset.shape[0], size=batchsize)
        minibatch = dataset[ran]
        #these sequences have length 20; we reduce them to seqlen
        for i in range(batchsize):
            random_start = np.random.randint(0, 20-seqlen)
            random_end = random_start + seqlen - 1
            batch_x[i] = minibatch[i, :, :, random_start:random_end]
            batch_y[i] = minibatch[i, :, :, random_end:random_end+1]
        #print(batch_x.shape, batch_y.shape)
        #print(batch_x.min(), batch_x.max())
        yield (batch_x, batch_y)
prova_gen = image_generator(testset,batchsize=1,seqlen=4)
sample_x, sample_y = next(prova_gen)
print(sample_x.shape, sample_y.shape)
(1, 64, 64, 3) (1, 64, 64, 1)
print(type(sample_x))
<class 'numpy.ndarray'>
def show_list(images): #takes a list of images as input and plots them
    size = len(images)
    plt.figure(figsize=(10, 10 * size))
    for i in range(size):
        plt.subplot(1, size, i + 1)
        plt.imshow(images[i], cmap='gray')
        plt.axis("off")
    plt.tight_layout()
    plt.show()
    plt.close()
frames = [sample_x[0,:,:,i] for i in range(3)] + [sample_y[0,:,:,0]]
show_list(frames)
train_gen = image_generator(trainset)
val_gen = image_generator(valset)
test_gen = image_generator(testset)
sample_x, sample_y = next(train_gen)
print(sample_x.shape, sample_y.shape)
(16, 64, 64, 3) (16, 64, 64, 1)
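To give an idea of how the pieces fit together, here is a minimal baseline sketch (an illustrative convolutional model with placeholder hyperparameters, not the required architecture): it maps the 3 stacked input frames to a single predicted frame in [0, 1] and is trained with MSE loss from the generators defined above.
def build_baseline():
    inputs = keras.Input(shape=(64, 64, 3))
    x = layers.Conv2D(32, 3, padding='same', activation='relu')(inputs)
    x = layers.Conv2D(64, 3, padding='same', activation='relu')(x)
    x = layers.Conv2D(64, 3, padding='same', activation='relu')(x)
    outputs = layers.Conv2D(1, 3, padding='same', activation='sigmoid')(x)
    return keras.Model(inputs, outputs)

model = build_baseline()
model.compile(optimizer='adam', loss='mse')

#steps_per_epoch and epochs are placeholders to be tuned
model.fit(train_gen,
          steps_per_epoch=500,
          epochs=10,
          validation_data=val_gen,
          validation_steps=100)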
What to deliver¶
As usual, you are supposed to deliver a single, sufficiently documented notebook. Do not erase the output cells of your notebook after the last execution. In particular, leave a sufficiently verbose trace of training.
As already stated, the model must be evaluated on the full test set.
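As an illustration of what a full-test-set evaluation could look like, here is a sketch that assumes, purely as an example protocol, that each test sequence contributes a single window with frames 0-2 as input and frame 3 as target; adapt the windowing to whatever protocol you adopt, as long as the whole test set is covered. It reuses the (hypothetical) model built above.
#build inputs (3 frames) and targets (next frame) from every test sequence
test_x = testset[:, :, :, 0:3]
test_y = testset[:, :, :, 3:4]
pred_y = model.predict(test_x, batch_size=64)
test_mse = np.mean((pred_y - test_y) ** 2)
print("MSE on the full test set:", test_mse)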