OverflowError: unsigned byte integer is greater than maximum #12

Open
MacwinWin opened this issue Jul 3, 2017 · 9 comments

Comments

@MacwinWin

MacwinWin commented Jul 3, 2017

When my dataset goes to 77000+ images, there is an error:

Traceback (most recent call last):
  File "convert-images-to-mnist-format.py", line 46, in <module>
    header.append(long('0x'+hexval[2:][2:],16))
OverflowError: unsigned byte integer is greater than maximum

Can anybody help me?
I think it is because there are too many images. How can I fix it?
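
For context, here is a minimal sketch of how the overflow can arise. It assumes the script formats the file count with a width of 6 (the value mentioned later in this thread) and splits the result into two header bytes; the variable names `num_files`, `high`, and `low` are illustrative, not taken from the script.

```python
from array import array

num_files = 77000  # dataset size reported above

# Width 6 leaves room for "0x" plus four hex digits, i.e. counts up to 65535.
hexval = "{0:#0{1}x}".format(num_files, 6)  # five hex digits for 77000

high = int(hexval[2:][:2], 16)  # first two hex digits
low = int(hexval[2:][2:], 16)   # everything after them: three digits here

header = array('B')  # unsigned bytes, each value must be 0..255
header.append(high)
try:
    header.append(low)
    overflowed = False
except OverflowError:
    overflowed = True  # same error type as in the traceback
```

Under these assumptions, any count above 65535 leaves more than two hex digits in the second slice, so the value no longer fits in one unsigned byte.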

@MacwinWin
Author

https://github.com/gskielian/JPG-PNG-to-MNIST-NN-Format/issues/3
This solved my problem!

@richardimms

Still facing this issue and the above link does not work.

@KumarLamic

Has this error been solved yet? I am getting the same problem.

@richardimms

Hey,

So my solution to this was to do the following.

```python
hexval = "{0:#0{1}x}".format(len(FileList), 10)  # number of files in HEX

# header for label array
header = array('B')
header.extend([0, 0, 8, 1])
header.append(int('0x' + hexval[2:][:2], 16))
header.append(int('0x' + hexval[4:][:2], 16))
header.append(int('0x' + hexval[6:][:2], 16))
header.append(int('0x' + hexval[8:][:2], 16))

data_label = header + data_label

hexval = "{0:#0{1}x}".format(width, 10)  # width in HEX
header.append(int('0x' + hexval[2:][:2], 16))
header.append(int('0x' + hexval[4:][:2], 16))
header.append(int('0x' + hexval[6:][:2], 16))
header.append(int('0x' + hexval[8:][:2], 16))

hexval = "{0:#0{1}x}".format(height, 10)  # height in HEX
header.append(int('0x' + hexval[2:][:2], 16))
header.append(int('0x' + hexval[4:][:2], 16))
header.append(int('0x' + hexval[6:][:2], 16))
header.append(int('0x' + hexval[8:][:2], 16))
```
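
The same header bytes can also be produced without hex-string slicing. The sketch below is one possible alternative, not the approach used in the repository: it relies on `struct.pack` with big-endian 32-bit fields, which is how the IDX format lays out its header. The helper names `idx_label_header` and `idx_image_header` are hypothetical, and the sizes are example values only. Note that IDX lists the row count (height) before the column count (width), which only matters for non-square images.

```python
import struct
from array import array


def idx_label_header(count):
    """Return the 8-byte IDX1 header: magic 0x00000801, then the item count."""
    return struct.pack('>II', 0x00000801, count)


def idx_image_header(count, rows, cols):
    """Return the 16-byte IDX3 header: magic 0x00000803, count, rows, columns."""
    return struct.pack('>IIII', 0x00000803, count, rows, cols)


# Example values: 77000 images, each 500 rows high and 800 columns wide.
label_header = array('B', idx_label_header(77000))
image_header = array('B', idx_image_header(77000, 500, 800))
```

Because each field is packed as a full 32-bit integer, counts and dimensions up to 2**32 - 1 fit without any per-byte arithmetic.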

@ghulammustufa31

I am having the same issue. The code you provided above did not help in my case. I have an image size of 800x500.

@enjalparajuli

@ghulammustufa31 Did you figure it out? I am stuck on this

@KumarLamic

This worked for me:

import os
from array import array
from random import shuffle

from PIL import Image

# WARNING: resize images first

# Load from and save to
Names = [['./test-imgs-rot', 'test']]

for name in Names:
    data_image = array('B')
    data_label = array('B')

    FileList = []
    for dirname in os.listdir(name[0]):
        path = os.path.join(name[0], dirname)
        for filename in os.listdir(path):
            if filename.endswith(".png"):
                FileList.append(os.path.join(name[0], dirname, filename))

    shuffle(FileList)  # Useful for further segmenting the validation set

    for filename in FileList:
        # The label is the name of the sub-directory holding the image
        label = int(filename.split('/')[2])

        Im = Image.open(filename)
        pixel = Im.load()
        width, height = Im.size

        # IDX stores pixels row by row; PIL indexes pixels as [x, y].
        # Assumes single-channel images, so each pixel is one integer.
        for y in range(height):
            for x in range(width):
                data_image.append(pixel[x, y])

        data_label.append(label)  # labels start (one unsigned byte each)

    hexval = "{0:#0{1}x}".format(len(FileList), 10)  # number of files in HEX

    # header for label array: 4-byte magic number, then 4-byte item count
    header = array('B')
    header.extend([0, 0, 8, 1])
    header.append(int('0x' + hexval[2:][:2], 16))
    header.append(int('0x' + hexval[4:][:2], 16))
    header.append(int('0x' + hexval[6:][:2], 16))
    header.append(int('0x' + hexval[8:][:2], 16))

    data_label = header + data_label

    # additional header for images array: rows (height), then columns (width)
    hexval = "{0:#0{1}x}".format(height, 10)  # height in HEX
    header.append(int('0x' + hexval[2:][:2], 16))
    header.append(int('0x' + hexval[4:][:2], 16))
    header.append(int('0x' + hexval[6:][:2], 16))
    header.append(int('0x' + hexval[8:][:2], 16))

    hexval = "{0:#0{1}x}".format(width, 10)  # width in HEX
    header.append(int('0x' + hexval[2:][:2], 16))
    header.append(int('0x' + hexval[4:][:2], 16))
    header.append(int('0x' + hexval[6:][:2], 16))
    header.append(int('0x' + hexval[8:][:2], 16))

    header[3] = 3  # Changing MSB for image data (0x00000803)

    data_image = header + data_image

    with open(name[1] + '-images-rot-idx3-ubyte', 'wb') as output_file:
        data_image.tofile(output_file)

    with open(name[1] + '-labels-rot-idx1-ubyte', 'wb') as output_file:
        data_label.tofile(output_file)

# gzip resulting files
for name in Names:
    os.system('gzip ' + name[1] + '-images-rot-idx3-ubyte')
    os.system('gzip ' + name[1] + '-labels-rot-idx1-ubyte')
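
The last step shells out to the `gzip` command, which may not be available on every system. A possible alternative, sketched here with the standard-library `gzip` and `shutil` modules, compresses the file from Python itself. The helper name `gzip_file` is hypothetical, and unlike the command-line tool this sketch leaves the uncompressed file in place.

```python
import gzip
import shutil


def gzip_file(path):
    """Compress `path` into `path + '.gz'` and return the new file name."""
    gz_path = path + '.gz'
    with open(path, 'rb') as src, gzip.open(gz_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)  # streams the data in chunks
    return gz_path
```

It would be called once per output file, passing the same file name that was used when writing the file.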

@hassan5544

@KumarLamic Why did you change the value from 6 to 10 in hexval?
I can't reshape my data now. Do you have a solution?

@KumarLamic

That was just a random trial, and it worked for me.
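
On the reshape question: one way to narrow it down is to check that the header agrees with the number of pixel bytes that follow it. The sketch below is a stdlib-only reader for an uncompressed IDX3 unsigned-byte file, assuming the standard 16-byte header (magic number, count, rows, columns, each a big-endian 32-bit integer). The function name `read_idx_images` is hypothetical, and the input at the bottom is synthetic.

```python
import struct


def read_idx_images(raw):
    """Split raw IDX3 bytes into a list of images, each a list of pixel rows."""
    magic, count, rows, cols = struct.unpack('>IIII', raw[:16])
    if magic != 0x00000803:
        raise ValueError('not an IDX3 unsigned-byte file: magic=%#010x' % magic)
    pixels = raw[16:]
    expected = count * rows * cols
    if len(pixels) != expected:
        raise ValueError('expected %d pixel bytes, found %d'
                         % (expected, len(pixels)))
    images = []
    for i in range(count):
        start = i * rows * cols
        images.append([list(pixels[start + r * cols:start + (r + 1) * cols])
                       for r in range(rows)])
    return images


# Tiny synthetic example: two images of 2 rows by 3 columns.
raw = struct.pack('>IIII', 0x00000803, 2, 2, 3) + bytes(range(12))
images = read_idx_images(raw)
```

If the size check fails on a generated file, the header length or the dimension fields likely do not match the data that was written after them.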
