Faster RCNN - InvalidArgumentError: Value for attr 'N' of 0 must be at least minimum 2 #22

Open
jcburnel opened this issue Aug 21, 2019 · 8 comments

@jcburnel

Hello, running the inspect_model notebook with no changes to the code gives me an error.

This line:

detections_list = model.bbox_head.get_bboxes(
rcnn_probs_list, rcnn_deltas_list, rois_list, batch_metas)

raises this error:

InvalidArgumentError: Value for attr 'N' of 0 must be at least minimum 2
; NodeDef: {{node ConcatV2}}; Op<name=ConcatV2; signature=values:N*T, axis:Tidx -> output:T; attr=N:int,min=2; attr=T:type; attr=Tidx:type,default=DT_INT32,allowed=[DT_INT32, DT_INT64]> [Op:ConcatV2] name: concat

I am using Python 3.7 and TensorFlow 2.0.0-beta1.

Repository owner deleted a comment from jcburnel Nov 18, 2019
@c-herring

I am also getting this issue. It looks like the network predicts the zeroth class for every proposal, so it is effectively not predicting anything.

So, in _get_bboxes_single() it first runs
class_ids = tf.argmax(rcnn_probs, axis=1, output_type=tf.int32)

which produces an array of zeros

then:

keep = tf.where(class_ids > 0)[:, 0]

so keep has length zero, and eventually we try to run tf.concat on an empty list, which produces the error.

I am just trying to overfit from a randomly initialized network, since I cannot download the weights (the Drive link is dead and I cannot download from Baidu).

Perhaps the network is behaving correctly? It has just converged to predict nothing, but that exposes a corner case the code does not handle (i.e. no foreground class predicted for any proposal).
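
The corner case described above could be guarded against along the following lines. This is only a sketch under assumptions: `keep_indices_per_class` is a hypothetical, simplified stand-in for the per-class loop in `_get_bboxes_single()` (the NMS step is left out), and `concat_or_empty` is a hypothetical helper that is not part of the repository.

```python
import tensorflow as tf


def keep_indices_per_class(class_ids):
    """Hypothetical stand-in for the per-class loop; NMS itself is omitted."""
    # Indices of proposals whose predicted class is not background (class 0).
    keep = tf.where(class_ids > 0)[:, 0]
    fg_class_ids = tf.gather(class_ids, keep)
    nms_keep = []
    # With no foreground predictions this loop never runs, so the list stays empty.
    for class_id in tf.unique(fg_class_ids)[0].numpy():
        idxs = tf.where(tf.equal(fg_class_ids, class_id))[:, 0]
        nms_keep.append(tf.gather(keep, idxs))
    return nms_keep


def concat_or_empty(tensors, dtype=tf.int64):
    """Concatenate 1-D index tensors; return an empty tensor for an empty list."""
    # tf.concat rejects an empty list of inputs, which is the failure in this issue.
    if not tensors:
        return tf.zeros([0], dtype=dtype)
    return tf.concat(tensors, axis=0)


# Every proposal predicted as background: the guarded call returns shape (0,).
all_background = tf.constant([0, 0, 0], dtype=tf.int32)
nms_keep = concat_or_empty(keep_indices_per_class(all_background))
print(nms_keep.shape)
```

Whether the downstream code in `get_bboxes` accepts an empty index tensor would still need to be checked; the sketch only avoids the empty `tf.concat` call.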

@jcburnel
Author

I have downloaded the weights and I see the same behaviour.

@c-herring

Ah interesting, so it is predicting nothing even when you have trained weights?

I wonder if you could possibly share the weights? I just spent a couple of hours trying to get a Baidu account, but without a Chinese phone number it seems impossible :(

@jcburnel
Author

I'll try to share it somehow today.

In fact, it does predict something if you use the downloaded weights on an example without training it again. It is during training that we get a gradient error, and after that the network can't "recover". (Sorry for the lack of explanation; it has been a long time since I tried it.)
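
The comment does not say what the gradient error was. If it involved non-finite gradients (an assumption, not something confirmed in this thread), one way to start debugging would be to skip any update whose gradients contain NaN or Inf. The helper name `apply_if_finite` is hypothetical and not part of the repository.

```python
import tensorflow as tf


def apply_if_finite(optimizer, grads, variables):
    """Apply gradients only if all of them are finite; return True if applied."""
    # Drop variables that received no gradient at all.
    pairs = [(g, v) for g, v in zip(grads, variables) if g is not None]
    # A single NaN or Inf anywhere causes the whole step to be skipped.
    finite = all(bool(tf.reduce_all(tf.math.is_finite(g))) for g, _ in pairs)
    if finite:
        optimizer.apply_gradients(pairs)
    return finite


# Toy usage: the second step carries a NaN gradient and is skipped.
v = tf.Variable([1.0, 2.0])
opt = tf.keras.optimizers.SGD(learning_rate=0.1)
print(apply_if_finite(opt, [tf.constant([1.0, 1.0])], [v]))
print(apply_if_finite(opt, [tf.constant([float("nan"), 1.0])], [v]))
```

Logging the step at which the helper first returns False would narrow down where training starts to break.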

@c-herring

Thanks, that explanation makes sense. At least it gives me a good place to start debugging :)

It would be really awesome if you could share them.
If you have a Google account, you can share them to my Google Drive here: https://drive.google.com/drive/folders/1jdY9u3YHWiuGbtNX7N8I6iVYpBL1kRZj?usp=sharing

@hehongjie

Hi, have you solved the problem? I am also stuck here.

@yunfei1999

I ran into this problem:
Traceback (most recent call last):
File "E:/obeject_detection/fasterRCNN/train_model.py", line 50, in
_ = model((batch_imgs, batch_metas), training=False)
File "E:\Anaconda3\envs\tf-gpu2.7.0\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "E:\obeject_detection\fasterRCNN\detection\models\detectors\faster_rcnn.py", line 148, in call
detections_list = self.bbox_head.get_bboxes(
File "E:\obeject_detection\fasterRCNN\detection\models\bbox_heads\bbox_head.py", line 117, in get_bboxes
detections_list = [
File "E:\obeject_detection\fasterRCNN\detection\models\bbox_heads\bbox_head.py", line 118, in
self._get_bboxes_single(
File "E:\obeject_detection\fasterRCNN\detection\models\bbox_heads\bbox_head.py", line 187, in _get_bboxes_single
nms_keep = tf.concat(nms_keep, axis=0)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Exception encountered when calling layer "faster_rcnn" (type FasterRCNN).

OpKernel 'ConcatV2' has constraint on attr 'T' not in NodeDef '[N=0, Tidx=DT_INT32]', KernelDef: 'op: "ConcatV2" device_type: "GPU" constraint { name: "T" allowed_values { list { type: DT_UINT64 } } } host_memory_arg: "axis"' [Op:ConcatV2] name: concat

Call arguments received:
• inputs=('tf.Tensor(shape=(1, 1216, 1216, 3), dtype=float32)', 'tf.Tensor(shape=(1, 11), dtype=float32)')
• training=False

@yunfei1999

This happens when I run train_model.py.
