
Training issues #95

Closed
lzbushicai opened this issue Jan 29, 2024 · 21 comments

@lzbushicai

lzbushicai commented Jan 29, 2024

Hello Cao Anh Quan! I organized my data into KITTI format and trained on it, but I found that the relation loss often comes out as NaN.

sequences: ['01'], frame_id: ['002205'], relation_loss: nan
sequences: ['01'], frame_id: ['002205'], loss_sem_scal: 5.660839080810547
sequences: ['01'], frame_id: ['002205'], loss_geo_scal: 2.1648528575897217
Warning: frustum_nonempty is zero, and the division operation will be skipped or assigned a default value.
sequences: ['01'], frame_id: ['002205'], total_loss: nan

I guess this may be related to my data (I did not change the loss computation code). My labels contain a particularly large number of 255 values, as shown in the figure below. Does 255 have any special meaning, or is there a particular way it should be handled in your design?
[image: my dataset label distribution]
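For reference, in SemanticKITTI-style ground truth the value 255 typically marks invalid/unlabeled voxels and is excluded from the losses. A minimal sketch of that masking, assuming 255 plays the same role in these labels (the function name is illustrative, not the repo's):

import torch.nn.functional as F

def ce_ssc_loss_sketch(logits, target, ignore_value=255):
    # logits: (B, C, X, Y, Z) class scores; target: (B, X, Y, Z) voxel labels.
    # Voxels labeled `ignore_value` contribute nothing to the loss.
    return F.cross_entropy(logits, target.long(), ignore_index=ignore_value)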

Another issue is that my images are large and I do not have enough memory to train the network, so I resized them to (450 × 720). Here is my code:

img = np.array(img, dtype=np.float32, copy=False) / 255.0
img_resized = Image.fromarray((img * 255).astype(np.uint8)).resize((720, 450))
img = img[:450, :720, :]  # crop image
I'm unsure whether this will substantially influence the outcomes of my training, and I would greatly appreciate your insights and guidance on this matter.

@anhquancao
Collaborator

Hi @lzbushicai,

@lzbushicai
Author

Thank you for your suggestion. After resizing the image, I did not adjust the camera parameters.
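For reference, when an image is resized the pinhole intrinsics normally have to be rescaled by the same factors. A minimal sketch, assuming a 3×3 matrix K and (height, width) image sizes (the concrete numbers below are only illustrative):

import numpy as np

def rescale_intrinsics(K, orig_hw, new_hw):
    # Scale fx, cx by the width ratio and fy, cy by the height ratio.
    sy = new_hw[0] / orig_hw[0]
    sx = new_hw[1] / orig_hw[1]
    K_new = K.copy()
    K_new[0, 0] *= sx  # fx
    K_new[0, 2] *= sx  # cx
    K_new[1, 1] *= sy  # fy
    K_new[1, 2] *= sy  # cy
    return K_new

# e.g. K_resized = rescale_intrinsics(K, orig_hw=(900, 1440), new_hw=(450, 720))

A plain top-left crop, by contrast, leaves fx, fy, cx, cy unchanged and only shrinks the valid image area.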

@lzbushicai
Author

lzbushicai commented Feb 26, 2024

Hi Cao Anh Quan! There are still many bugs in my training process. I re-examined the problem and found that the voxels I generated with the official KITTI tool never visualize correctly. Using the official KITTI tool, I first converted the point cloud .bin files and label .label files into fixed-size .bin, .label, .invalid, and .occluded files. However, some sequences always come out wrong. I think the problem is the LiDAR-to-camera extrinsics in calib.txt, but I cannot get a correct result using the transformation matrix of my dataset. Here are some erroneous results:
[images: wrong_voxel1, wrong_voxel2]

So I followed your suggestion and wrote my own script to process the point cloud .bin files and the label .label files. However, in this process I did not use the camera intrinsics/extrinsics (calib.txt) or the per-scan poses (pose.txt). Do I need to add any additional information in order to train on the data processed by my script? Below is my script:

import numpy as np

# Load point cloud data and labels
points = np.fromfile(rellis_bin, dtype=np.float32).reshape(-1, 4)[:, :3]
labels = np.fromfile(rellis_label, dtype=np.uint32).astype(np.float32)

# Define the bounds and size of the voxels
x_min, x_max, y_min, y_max, z_min, z_max = 0, 51.2, -25.6, 25.6, -2, 4.4
voxel_size = 0.2

# Calculate the number of voxels along each axis
x_voxels = int((x_max - x_min) / voxel_size)
y_voxels = int((y_max - y_min) / voxel_size)
z_voxels = int((z_max - z_min) / voxel_size)

# Initialize the voxel label array
voxel_labels = np.zeros((x_voxels, y_voxels, z_voxels), dtype=np.uint32)

# Compute the voxel index for each point
indices = np.floor((points - np.array([x_min, y_min, z_min])) // voxel_size).astype(np.int32)

# Assign labels to each voxel
for i, (x, y, z) in enumerate(indices):
    if 0 <= x < x_voxels and 0 <= y < y_voxels and 0 <= z < z_voxels:
        label = labels[i]  # Corrected variable name for clarity
        voxel_labels[x, y, z] = label  # Modify here to implement label statistics and selection logic

# Save the voxel_labels array, which contains the label for each voxel; unassigned voxels have a label of 0
np.save("voxel_00_82.npy", voxel_labels)
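As one way to fill in the "label statistics and selection logic" mentioned in the comment above, here is a hedged sketch of per-voxel majority voting (the `& 0xFFFF` assumes SemanticKITTI-style labels that pack an instance id in the upper 16 bits; drop it if your labels are plain class ids):

from collections import defaultdict
import numpy as np

def voxelize_majority(points, labels, origin, voxel_size, grid_shape):
    sem_labels = labels.astype(np.uint32) & 0xFFFF  # keep only the semantic class
    votes = defaultdict(lambda: defaultdict(int))
    idx = np.floor((points - np.asarray(origin)) / voxel_size).astype(np.int32)
    for (x, y, z), lab in zip(idx, sem_labels):
        if 0 <= x < grid_shape[0] and 0 <= y < grid_shape[1] and 0 <= z < grid_shape[2]:
            votes[(x, y, z)][int(lab)] += 1
    voxel_labels = np.zeros(grid_shape, dtype=np.uint32)
    for (x, y, z), counter in votes.items():
        voxel_labels[x, y, z] = max(counter, key=counter.get)  # most frequent label wins
    return voxel_labels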

@anhquancao
Collaborator

Hello, your voxelization code appears to be correct. However, have you verified that your x, y, z axes align with those of Semantic KITTI?

Without using camera parameters, it's unclear how you would be able to aggregate the point cloud to construct a dense scene.

You will still require the calibration parameters to project the voxels onto the image.

@lzbushicai
Author

> Hello, your voxelization code appears to be correct. However, have you verified that your x, y, z axes align with those of Semantic KITTI?
>
> Without using camera parameters, it's unclear how you would be able to aggregate the point cloud to construct a dense scene.
>
> You will still require the calibration parameters to project the voxels onto the image.

Thanks, I am projecting the labeled point cloud onto the image to troubleshoot the problem and to check whether the issue lies in the KITTI script or in the extrinsic parameters.

@lzbushicai
Author

Hi, I'm using my own script to voxelize the point cloud, but the voxels I get are particularly sparse (compared to the KITTI tool). Can I aggregate the preceding and following frames to solve this problem?
Here is the label distribution after processing with my script:

   Label    Count
0      0  2092045
1      3     1120
2      4     1804
3     11     1832
4     17       31
5     18       52
6     19      268

Here is the label distribution after processing with the KITTI tool:

  Label    Count
0    0.0   148154
1    3.0    66652
2    4.0   135036
3   17.0      907
4   18.0     3499
5   19.0    12389
6  255.0  1730515

It's obvious that the output of my own script is very sparse. The output of the KITTI tool is problematic, and I don't think I can train good results from such sparse voxels.

@anhquancao
Collaborator

I think you need to aggregate the point clouds from many consecutive frames to obtain dense scenes, not just the frames immediately before and after.
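For example, a minimal aggregation sketch assuming KITTI-odometry conventions (poses.txt giving 4×4 cam0 poses per scan and Tr from calib.txt mapping LiDAR to cam0; adapt the conventions to your own dataset):

import numpy as np

def aggregate_scans(scan_files, label_files, poses, Tr, ref_idx=0):
    # Express every scan in the LiDAR frame of scan `ref_idx`.
    # poses[i]: 4x4 cam0 pose of frame i; Tr: 4x4 LiDAR -> cam0 transform.
    T_ref_inv = np.linalg.inv(poses[ref_idx] @ Tr)
    pts_all, lab_all = [], []
    for scan_f, label_f, pose in zip(scan_files, label_files, poses):
        pts = np.fromfile(scan_f, dtype=np.float32).reshape(-1, 4)[:, :3]
        lab = np.fromfile(label_f, dtype=np.uint32)
        pts_h = np.hstack([pts, np.ones((pts.shape[0], 1), dtype=np.float32)])
        # LiDAR_i -> cam0_i -> world -> LiDAR_ref
        pts_ref = (T_ref_inv @ pose @ Tr @ pts_h.T).T[:, :3]
        pts_all.append(pts_ref)
        lab_all.append(lab)
    return np.concatenate(pts_all), np.concatenate(lab_all)

The aggregated (points, labels) can then go through the same voxelization step as a single, much denser scan.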

@lzbushicai
Author

lzbushicai commented Mar 16, 2024

Hi Cao Anh Quan!
How long did it take you to train the KITTI model? Since my number of prediction classes went from 20 to 15, I couldn't use your pre-trained model; my model converges very slowly and has very low metrics.

epoch=000-val/mIoU=0.01065.ckpt
epoch=001-val/mIoU=0.01568.ckpt
epoch=002-val/mIoU=0.01592.ckpt
epoch=003-val/mIoU=0.01336.ckpt
epoch=004-val/mIoU=0.01378.ckpt
epoch=005-val/mIoU=0.01774.ckpt
epoch=006-val/mIoU=0.01647.ckpt
epoch=007-val/mIoU=0.01532.ckpt

The sem_scal and geo_scal losses still don't converge after 32 epochs.
From a top-down view, my voxels look like the following image:
[image: 03_000275_1_1]

Is this a problem with my data?

@anhquancao
Collaborator

Hi, it takes 30 epochs to converge (https://github.com/astra-vision/MonoScene/blob/master/monoscene/scripts/train_monoscene.py#L50).

You should try overfitting a single example to see if your network can converge. Usually, this problem arises from an incorrect camera projection of voxels onto the image.

You should also visualize the projected voxels on the image to check if they are correct.
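With the PyTorch Lightning setup the training script uses, one quick way to run that overfitting sanity check is the built-in overfit_batches option; `model` and `data_module` below stand in for whatever the script constructs, so treat this as a sketch rather than the repo's exact invocation:

from pytorch_lightning import Trainer

# Fit the same single batch over and over; if the projection and losses are wired up
# correctly, the metrics on that batch should become very high.
trainer = Trainer(overfit_batches=1, max_epochs=200)
trainer.fit(model, datamodule=data_module)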

@lzbushicai
Author

Thank you! I tried aggregating 10 frames of point clouds and used SurroundOcc's pipeline to generate dense labels, but the training results were still very poor.
What impact does slightly sparse data have on the results?
I visualized my voxels and could recognize the scene from the photo, but I am still quite confused: it is difficult to find a way to verify whether the voxels are actually projected onto the corresponding image.

@lzbushicai
Author

Hi Cao Anh Quan!
I converted my data to the KITTI format, but the network still does not seem to work: the evaluated mIoU fluctuates between 1% and 2%. When I generate voxels, the coordinates match the ones used to generate the fov_mask, so I think my points are projected onto the image, but I don't know where the problem is.

@anhquancao
Collaborator

Hi, you should sample 1 point per voxel in 3D space, then project the points onto the image using the camera parameters and visualize them. This is the only way to know whether your projection is correct. Otherwise, you can visualize pix_x and pix_y here to see if they are correct.

Then, you should try to overfit on a single example to see if everything is working. You should expect very high overfit performance.
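A minimal sketch of that check, assuming a pinhole model with a 3×3 intrinsic matrix K and a 4×4 extrinsic T_cam_from_scene mapping scene/LiDAR coordinates into the camera frame (the names are placeholders, not the repo's variables):

import numpy as np

def project_voxel_centers(grid_shape, origin, voxel_size, T_cam_from_scene, K, img_hw):
    # One point per voxel: the voxel center in scene coordinates.
    xs, ys, zs = np.meshgrid(*[np.arange(n) for n in grid_shape], indexing="ij")
    centers = np.stack([xs, ys, zs], -1).reshape(-1, 3) * voxel_size + np.asarray(origin) + voxel_size / 2
    homo = np.hstack([centers, np.ones((centers.shape[0], 1))])
    cam = (T_cam_from_scene @ homo.T).T[:, :3]          # into the camera frame
    in_front = cam[:, 2] > 0
    pix = (K @ cam.T).T
    pix = pix[:, :2] / pix[:, 2:3]                      # perspective divide
    fov_mask = in_front & (pix[:, 0] >= 0) & (pix[:, 0] < img_hw[1]) \
                        & (pix[:, 1] >= 0) & (pix[:, 1] < img_hw[0])
    return pix, fov_mask

Scattering pix[fov_mask] over the RGB image, colored by the voxel labels, makes a wrong extrinsic or axis convention visible immediately.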

@lzbushicai
Author

lzbushicai commented Apr 9, 2024

Thank you. I drew the fov_mask, pix, and target on a blank image, and the result showed that my projection was not correct.

import os

import cv2
import numpy as np

red_color = (0, 0, 255)  # BGR fallback color for labels missing from the color map

def vox2pix(pix, target, color_map_bgr, out_dir, frame_id):
    # Draw every projected voxel point on a blank image, colored by its label.
    image = np.zeros((450, 720, 3), dtype=np.uint8)
    for point, color_key in zip(pix, target.flatten()):
        # get the color of this label
        color = color_map_bgr.get(int(color_key), red_color)
        if color == red_color:
            print(f"Warning: Color for target value {color_key} not found. Using red instead.")
        cv2.circle(image, (int(point[0]), int(point[1])), radius=2, color=color, thickness=-1)

    # cv2.imshow('Image with Colored Points', image)
    # cv2.waitKey(0)
    # cv2.destroyAllWindows()
    imagefilename = frame_id + ".jpg"
    cv2.imwrite(os.path.join(out_dir, imagefilename), image)
The result is shown at the following link: https://www.wolai.com/ts89qCrBurTcyVTE8L4Zqb
Thank you again! If it weren't for you, I wouldn't have noticed the problem.

@anhquancao
Collaborator

You are welcome! It also took me quite some time to make the projection work correctly.

@lzbushicai
Author

Hi Cao Anh Quan! I have solved this problem. Without your groundwork, it's hard to imagine how long it would have taken me. Thank you!

@lzbushicai
Author

Hi Cao Anh Quan! I have processed my data into KITTI format, and I am now sure that my voxels are correctly projected onto the image. Why does my training loss fail to converge and my mIoU stay at almost zero? I cannot find the reason why the network is not working. Even though my dataset is a field dataset and the results may be somewhat worse, it shouldn't fail completely, right?

@anhquancao
Collaborator

Have you tried optimizing only 1 frame to see if you can overfit it?

@lzbushicai
Author

Yes, I trained on 10 frames of data using MonoScene, but the loss did not converge and the mIoU was very low.

@anhquancao
Collaborator

I suggest you visualize the output.
Also, you can try to train using only the cross-entropy loss.

@lzbushicai
Author

lzbushicai commented Apr 19, 2024

> I suggest you visualize the output. Also, you can try to train using only the cross-entropy loss.

I set the following parameters in monoscene.yaml:

relation_loss: false 
CE_ssc_loss: true
sem_scal_loss: false
geo_scal_loss: false

but the loss still stays constant.

@lzbushicai
Author

Hi Cao Anh Quan! I have solved the above problem. Thank you for your long-term guidance. The network now works well on my dataset!
