Passing grayscale images

#1
by shossain - opened

Hi,
I copied the inference code and passed a grayscale image to the feature extractor. The image looks like this:
<PIL.PngImagePlugin.PngImageFile image mode=L size=1550x2200 at 0x7FACBFF7DE80>

But I am getting the following error:

transformers/image_utils.py", line 114, in infer_channel_dimension_format
raise ValueError(f"Unsupported number of image dimensions: {image.ndim}")
ValueError: Unsupported number of image dimensions: 2

Any suggestions on how to fix this?

I am working around the issue by stacking the gray channel three times, so the feature extractor receives a 3-channel image.

shossain changed discussion status to closed

@shossain Can you share your solution?

I ran into the same issue. Here is how I stacked the grayscale image:

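    import fitz  # PyMuPDF
    from PIL import Image

    # assumed here: pdf_file is an open fitz document, image[0] the xref of a grayscale image
    pix = fitz.Pixmap(pdf_file, image[0])
    img = Image.frombytes('L', (pix.width, pix.height), pix.samples)
    # stack the grayscale channel three times to get a 3-channel RGB image
    stacked_img = Image.merge('RGB', (img, img, img))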
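If the image is already a PIL image in mode `L`, as in the original post, the same stacking can probably be done without PyMuPDF. Here is a minimal sketch, assuming Pillow is installed. The helper name `to_rgb` and the small synthetic test image are made up for illustration and are not part of the inference code.

```python
from PIL import Image


def to_rgb(img):
    """Return a 3-channel version of a PIL image (hypothetical helper).

    For a mode 'L' image, convert('RGB') copies the gray value into
    the R, G and B channels, which matches stacking the channel three times.
    """
    if img.mode == "RGB":
        return img
    return img.convert("RGB")


# Synthetic grayscale image standing in for the real page scan.
gray = Image.new("L", (4, 3), color=128)

rgb = to_rgb(gray)

# Same result as the explicit merge used above.
merged = Image.merge("RGB", (gray, gray, gray))
same_pixels = rgb.tobytes() == merged.tobytes()
```

The resulting `rgb` image has three bands, so it can be passed to the feature extractor in place of the original grayscale image.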
