Error when training starts. ZeroDivisionError: integer division or modulo by zero #9

mpwsh opened this issue Nov 23, 2017 · 1 comment


mpwsh commented Nov 23, 2017

I know you've pretty much abandoned this project, but I'm trying to make it work with TensorFlow 0.12.1, and I get the error below when training "starts" (it actually freezes at 0% and then throws the error):

Model creation...
WARNING:tensorflow:From /media/sata/MusicGenerator/deepmusic/model.py:246 in _build_network.: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: NVIDIA Tegra X1
major: 5 minor: 3 memoryClockRate (GHz) 0.9984
pciBusID 0000:00:00.0
Total memory: 3.89GiB
Free memory: 2.85GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0)
E tensorflow/core/common_runtime/gpu/gpu_device.cc:586] Could not identify NUMA node of /job:localhost/replica:0/task:0/gpu:0, defaulting to 0.  Your kernel may not have been built with NUMA support.
Initialize variables...
WARNING: No previous model found, but some files/folders found at /media/sata/MusicGenerator/save/model. Cleaning...
Removing /media/sata/MusicGenerator/save/model/train/events.out.tfevents.1511396622.tegra-ubuntu
Start training (press Ctrl+C to save and exit)...

------- Epoch 1 (lr=0.0001) -------
Subsampling the songs (train)...
Shuffling the dataset...
Generating batches...
Subsampling the songs (test)...
Shuffling the dataset...
Generating batches...
Training:   0%|                                                                                                | 0/2 [00:00<?, ?it/s]Traceback (most recent call last):
  File "main.py", line 29, in <module>
    composer.main()
  File "/media/sata/MusicGenerator/deepmusic/composer.py", line 197, in main
    self._main_train()
  File "/media/sata/MusicGenerator/deepmusic/composer.py", line 255, in _main_train
    next_batch_test = batches_test[self.glob_step % len(batches_test)]  # Generate test batches in a cycling way (test set smaller than train set)
ZeroDivisionError: integer division or modulo by zero

Any ideas?
Does this relate to the tags warning?
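For what it's worth, the crash comes from the modulo on composer.py line 255: when batches_test is empty, len(batches_test) is 0 and the modulo raises. A minimal sketch of a guard that would surface the problem with a clearer message (the names are taken from the traceback; the surrounding code is assumed):

# Hypothetical guard just before composer.py line 255 (names from the traceback above).
# If the test split produced no batches, fail early with a readable message
# instead of a ZeroDivisionError inside the training loop.
if not batches_test:
    raise RuntimeError(
        'No test batches were generated; the dataset is probably too small '
        'for the configured batch size.'
    )
next_batch_test = batches_test[self.glob_step % len(batches_test)]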

@BLACKMogus

File "/media/sata/MusicGenerator/deepmusic/composer.py", line 255, in _main_train
You should open this script composer.py and find this line :
next_batch_test = batches_test[self.glob_step % len(batches_test)]
the parameter len(batches_test) here is zero because the batches_test is empty.
the batches_test is got by this code,in the same script:
batches_train, batches_test = self.music_data.get_batches()
Now is the solution:
Open the batchbuilder.py script and find :
def get_list(self, dataset, name):
""" See parent class for more details
Args:
dataset (list[Song]): the training/testing set
name (str): indicate the dataset type
Return:
list[Batch]: the batches to process
"""
focus on the code
for i in range((nb_samples//self.args.batch_size)):
yield extracts[i*self.args.batch_size:(i+1)*self.args.batch_size]
the main reason is that nb_samples is too small to divided by self.args.batch_size(equal 0)
so this code don't work,you should input some long songs
What I input are the Bach Piano Songs ,which I download from Internet
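
To make the failure mode concrete, here is a small self-contained sketch of the same batching logic (make_batches, extracts and batch_size are stand-in names; the real code lives in batchbuilder.py and reads the batch size from self.args.batch_size):

# Stand-alone illustration of the batching loop quoted above.
# With fewer extracts than one batch, nb_samples // batch_size is 0,
# the loop runs zero times, and an empty list comes back, which is
# exactly what later triggers the modulo-by-zero in composer.py.
def make_batches(extracts, batch_size):
    nb_samples = len(extracts)
    batches = []
    for i in range(nb_samples // batch_size):
        batches.append(extracts[i * batch_size:(i + 1) * batch_size])
    return batches

print(len(make_batches(list(range(3)), batch_size=64)))    # 0 -> empty test set
print(len(make_batches(list(range(200)), batch_size=64)))  # 3 -> training can proceed

So the fix is either to give it enough (long) songs that the test split ends up with at least batch_size extracts, or to lower the configured batch size.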
