
(Pre-processing for a Qiskit qGAN, though the use case is somewhat irrelevant.) I'm a bit lost trying to figure out how to preprocess an image dataset before passing it to a GAN. Below is all the relevant code up to my error. The code is derived from https://github.com/Qiskit/qiskit-tutorials/blob/master/legacy_tutorials/aqua/machine_learning/qgans_for_loading_random_distributions.ipynb and I have tried to adapt it to a different input dataset. (The original uses generated dummy data of much simpler dimensions.)

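import torch
import torchvision.datasets as dset
import torchvision.transforms as transforms
from qiskit.aqua.algorithms import QGAN

# Root directory for dataset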
dataroot = "./data/land"

# Number of workers for dataloader
workers = 2

# Batch size during training
batch_size = 128

# Image size (images are resized and center-cropped to image_size x image_size)
image_size = 64

dataset = dset.ImageFolder(root=dataroot,
                           transform=transforms.Compose([
                               transforms.Resize(image_size),
                               transforms.CenterCrop(image_size),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                           ]))
# Create the dataloader
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True, num_workers=workers)

# Take one batch and convert each tensor in it to a NumPy array
real_batch = next(iter(dataloader))
real_batch_arr = [t.numpy() for t in real_batch]


# Set number of qubits per data dimension as list of k qubit values[#q_0,...,#q_k-1]
num_qubits = [4]
k = len(num_qubits)


num_epochs = 100

# Initialize qGAN
qgan = QGAN(real_batch_arr,bounds=bounds, num_qubits = num_qubits,batch_size = 128, num_epochs=num_epochs, snapshot_dir=None)

This gives me the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-42-8cba9a74f024> in <module>
      5 
      6  # Initialize qGAN
----> 7 qgan = QGAN(real_batch_arr,bounds=bounds, num_qubits = num_qubits,batch_size = 128, num_epochs=num_epochs, snapshot_dir=None)
      8 qgan.seed = 1
      9 # Set quantum instance to run the quantum generator

~\Anaconda3\lib\site-packages\qiskit\aqua\algorithms\distribution_learners\qgan.py in __init__(self, data, bounds, num_qubits, batch_size, num_epochs, seed, discriminator, generator, tol_rel_ent, snapshot_dir, quantum_instance)
     99         if data is None:
    100             raise AquaError('Training data not given.')
--> 101         self._data = np.array(data)
    102         if bounds is None:
    103             bounds_min = np.percentile(self._data, 5, axis=0)

ValueError: could not broadcast input array from shape (128,3,64,64) into shape (128)

I understand that the Qiskit class (QGAN) at some point attempts to convert real_batch_arr, which is a list when passed to QGAN, into an array. That array is expected to have shape (128). On top of that, judging from the original code linked above, an array should be passed to QGAN, not a list.

My question is how I can transform my list into the array that I need. There could also be something fundamental that I am simply missing. I would truly appreciate any advice or comments.

jan

1 Answer


The current implementation of the qGAN algorithm does not support data sets given as a tensor. The data must be given either as a flat array or as an array of k-dimensional data points, i.e., the shape of the data should be num_data_samples x dim_data_samples.
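
To make the shape requirement concrete, here is a minimal NumPy sketch of two ways to get from a (128, 3, 64, 64) image batch to an array of that form. It rests on a few assumptions: the random `images` array is only a stand-in for `real_batch[0].numpy()` (a batch from a DataLoader over ImageFolder is an (images, labels) pair, so converting the whole pair mixes shapes (128, 3, 64, 64) and (128,), which matches the error message), and the mean-intensity reduction is one illustrative choice, not something the qGAN implementation prescribes. The names `images`, `flat` and `per_image_mean` are hypothetical.

```python
import numpy as np

# Stand-in for real_batch[0].numpy(): 128 RGB images of 64x64 pixels.
# Random values are used purely for illustration.
rng = np.random.default_rng(0)
images = rng.uniform(-1.0, 1.0, size=(128, 3, 64, 64)).astype(np.float32)

# Option A: treat every pixel as one data dimension.
# Result has shape (num_data_samples, dim_data_samples) = (128, 3*64*64).
flat = images.reshape(images.shape[0], -1)

# Option B (assumed reduction): collapse each image to a single scalar,
# giving a flat array with one value per sample.
per_image_mean = images.mean(axis=(1, 2, 3))

print(flat.shape)
print(per_image_mean.shape)
```

Since the comment in the question describes num_qubits as holding one entry per data dimension, num_qubits = [4] corresponds to one-dimensional data, which is closer to Option B. Option A would need one entry per pixel and is unlikely to be practical, so some form of dimensionality reduction is probably needed before training.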