[Bug] TensorFlow - CUDA: multiprocessing does not work as expected - Dataloader and inference pipeline #1440
Labels
framework: tensorflow (Related to TensorFlow backend)
module: transforms (Related to doctr.transforms)
type: bug (Something isn't working)
Bug description
Expected the model to run successfully, but it throws a "JIT compilation failed" error when running on GPU.
Code snippet to reproduce the bug
from doctr.io import DocumentFile
from doctr.models import ocr_predictor

model = ocr_predictor(det_arch="linknet_resnet18", reco_arch="crnn_vgg16_bn", pretrained=True)
img_path = "/home/lubhawan/Downloads/iloveimg-converted/Hospital-Bill-4.jpg"  # Specify your image path here
img = DocumentFile.from_images(img_path)
result = model(img)
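Since the traceback points at `device:GPU:0`, a quick way to confirm the failure is GPU-specific is to hide the CUDA devices and rerun the same pipeline on CPU. This is a diagnostic sketch, not a fix; the environment variable must be set before TensorFlow is imported anywhere in the process:

```python
import os

# Hide all CUDA devices so TensorFlow falls back to its CPU kernels.
# Must run before TensorFlow (or docTR, which imports it) is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# ...then rerun the snippet above (ocr_predictor + DocumentFile) unchanged.
# If inference succeeds on CPU, the bug is isolated to the GPU JIT path.
```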
Error traceback
UnknownError Traceback (most recent call last)
Cell In[5], line 3
1 img_path = "/home/lubhawan/Downloads/iloveimg-converted/Hospital-Bill-4.jpg" #Specify your image path here
2 img = DocumentFile.from_images(img_path)
----> 3 result = model(img)
4 output = result.export()
File ~/.local/lib/python3.11/site-packages/doctr/models/predictor/tensorflow.py:89, in OCRPredictor.__call__(self, pages, **kwargs)
86 pages = [rotate_image(page, -angle, expand=True) for page, angle in zip(pages, origin_page_orientations)]
88 # Localize text elements
---> 89 loc_preds_dict = self.det_predictor(pages, **kwargs)
90 assert all(
91 len(loc_pred) == 1 for loc_pred in loc_preds_dict
92 ), "Detection Model in ocr_predictor should output only one class"
94 loc_preds: List[np.ndarray] = [list(loc_pred.values())[0] for loc_pred in loc_preds_dict]
File ~/.local/lib/python3.11/site-packages/doctr/models/detection/predictor/tensorflow.py:45, in DetectionPredictor.__call__(self, pages, **kwargs)
42 if any(page.ndim != 3 for page in pages):
43 raise ValueError("incorrect input shape: all pages are expected to be multi-channel 2D images.")
---> 45 processed_batches = self.pre_processor(pages)
46 predicted_batches = [
47 self.model(batch, return_preds=True, training=False, **kwargs)["preds"] for batch in processed_batches
48 ]
49 return [pred for batch in predicted_batches for pred in batch]
File ~/.local/lib/python3.11/site-packages/doctr/models/preprocessor/tensorflow.py:111, in PreProcessor.__call__(self, x)
107 batches = [x]
109 elif isinstance(x, list) and all(isinstance(sample, (np.ndarray, tf.Tensor)) for sample in x):
110 # Sample transform (to tensor, resize)
--> 111 samples = list(multithread_exec(self.sample_transforms, x))
112 # Batching
113 batches = self.batch_inputs(samples)
File ~/.local/lib/python3.11/site-packages/doctr/utils/multithreading.py:47, in multithread_exec(func, seq, threads)
42 # Multi-threading
43 else:
44 with ThreadPool(threads) as tp:
45 # ThreadPool's map function returns a list, but seq could be of a different type
46 # That's why wrapping result in map to return iterator
---> 47 results = map(lambda x: x, tp.map(func, seq))
48 return results
File ~/anconda3/lib/python3.11/multiprocessing/pool.py:367, in Pool.map(self, func, iterable, chunksize)
362 def map(self, func, iterable, chunksize=None):
363 '''
364 Apply func to each element in iterable, collecting the results
365 in a list that is returned.
366 '''
--> 367 return self._map_async(func, iterable, mapstar, chunksize).get()
File ~/anconda3/lib/python3.11/multiprocessing/pool.py:774, in ApplyResult.get(self, timeout)
772 return self._value
773 else:
--> 774 raise self._value
File ~/anconda3/lib/python3.11/multiprocessing/pool.py:125, in worker(inqueue, outqueue, initializer, initargs, maxtasks, wrap_exception)
123 job, i, func, args, kwds = task
124 try:
--> 125 result = (True, func(*args, **kwds))
126 except Exception as e:
127 if wrap_exception and func is not _helper_reraises_exception:
File ~/anconda3/lib/python3.11/multiprocessing/pool.py:48, in mapstar(args)
47 def mapstar(args):
---> 48 return list(map(*args))
File ~/.local/lib/python3.11/site-packages/doctr/models/preprocessor/tensorflow.py:76, in PreProcessor.sample_transforms(self, x)
74 x = tf.image.convert_image_dtype(x, dtype=tf.float32)
75 # Resizing
---> 76 x = self.resize(x)
78 return x
File ~/.local/lib/python3.11/site-packages/doctr/transforms/modules/tensorflow.py:107, in Resize.__call__(self, img, target)
100 def call(
101 self,
102 img: tf.Tensor,
103 target: Optional[np.ndarray] = None,
104 ) -> Union[tf.Tensor, Tuple[tf.Tensor, np.ndarray]]:
105 input_dtype = img.dtype
--> 107 img = tf.image.resize(img, self.wanted_size, self.method, self.preserve_aspect_ratio)
108 # It will produce an un-padded resized image, with a side shorter than wanted if we preserve aspect ratio
109 raw_shape = img.shape[:2]
File ~/.local/lib/python3.11/site-packages/tensorflow/python/util/traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
151 except Exception as e:
152 filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153 raise e.with_traceback(filtered_tb) from None
154 finally:
155 del filtered_tb
File ~/.local/lib/python3.11/site-packages/tensorflow/python/framework/ops.py:5883, in raise_from_not_ok_status(e, name)
5881 def raise_from_not_ok_status(e, name) -> NoReturn:
5882 e.message += (" name: " + str(name if name is not None else ""))
-> 5883 raise core._status_to_exception(e) from None
UnknownError: {{function_node __wrapped__Round_device/job:localhost/replica:0/task:0/device:GPU:0}} JIT compilation failed. [Op:Round] name:
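A "JIT compilation failed" error on a trivial op like Round often means XLA cannot locate libdevice from the CUDA toolkit. Pointing XLA at the toolkit directory before TensorFlow is imported sometimes resolves it; this is a hedged workaround sketch, and the path below is only an example that must be adjusted to the local CUDA install:

```python
import os

# Tell XLA where the CUDA toolkit (and its nvvm/libdevice directory) lives.
# Example path only; adjust to your installation, e.g. /usr/local/cuda.
# Must be set before TensorFlow is imported in the process.
os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=/usr/lib/cuda"
```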
Environment
DocTR version: v0.7.0
TensorFlow version: 2.15.0
PyTorch version: 2.1.2+cu121 (torchvision 0.16.2+cu121)
OpenCV version: 4.9.0
OS: Ubuntu 22.04.3 LTS
Python version: 3.11.5
Is CUDA available (TensorFlow): Yes
Is CUDA available (PyTorch): Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4070
Nvidia driver version: 535.154.05
cuDNN version: Could not collect
Deep Learning backend
is_tf_available: True
is_torch_available: True
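Since the failure surfaces inside docTR's `multithread_exec`, bypassing the ThreadPool path is also worth testing. Recent docTR versions appear to honor a `DOCTR_MULTIPROCESSING_DISABLE` environment variable (an assumption; check `doctr/utils/multithreading.py` in the installed version to confirm):

```python
import os

# When set, docTR's multithread_exec is expected to run the sample
# transforms sequentially instead of through a ThreadPool.
# (Assumption: supported by the installed docTR version.)
os.environ["DOCTR_MULTIPROCESSING_DISABLE"] = "TRUE"
```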