RuntimeError: The size of tensor a (45) must match the size of tensor b (8) at non-singleton dimension 0 #43

hecunren · 2024-03-06T15:23:48Z

Hello @alexandre01, I am honored to read your work. I am trying to use your model to obtain the latent space vector z of an SVG image, and I combined some of the module interfaces you designed for that purpose. When the SVG comes from a CAD conversion, the encode interface fails with a tensor dimension mismatch, whereas a simple SVG icon downloaded from the Internet encodes to z without problems. The SVG, code, and error message are below. I don't know why this happens; can you help me solve it?
1. The SVG image obtained from the CAD conversion that needs to be processed is as follows:

1. The code to extract the latent space vector z is as follows:

import os
os.chdir("..")
#%%
from deepsvg.svglib.svg import SVG

from deepsvg import utils
from deepsvg.difflib.tensor import SVGTensor
from deepsvg.svglib.utils import to_gif
from deepsvg.svglib.geom import Bbox
from deepsvg.svgtensor_dataset import SVGTensorDataset, load_dataset, SVGFinetuneDataset
from deepsvg.utils.utils import batchify, linear

import torch
import numpy as np
from torch.utils.data import DataLoader
import torch.nn as nn
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
from IPython.display import display




#%%
pretrained_path = "./pretrained/hierarchical_ordered.pth.tar"
from configs.deepsvg.hierarchical_ordered import Config
print(0)
cfg = Config()
cfg.model_cfg.dropout = 0.  # for faster convergence
model = cfg.make_model().to(device)
utils.load_model(pretrained_path, model)
model.eval()
dataset = load_dataset(cfg)
print(1)
def encode(data):
    model_args = batchify((data[key] for key in cfg.model_args), device)
    with torch.no_grad():
        z = model(*model_args, encode_mode=True)
        return z

def encode_svg(svg):
    data = dataset.get(svg=svg)
    return encode(data)
def load_svg(filename):
    svg = SVG.load_svg(filename)
    svg = dataset.simplify(svg)
    svg = dataset.preprocess(svg, mean=True)
    return svg
print(2)
lego = load_svg("C:/Users/15653/deepsvg/dataset/test/kuangkuang.svg")


print(3)
print(lego)
z = encode_svg(lego)

print(z)
2. The code where the error occurs is as follows:

import math
import torch
import torch.nn as nn


class PositionalEncodingSinCos(nn.Module):
    def __init__(self, d_model, dropout=0.1, max_len=250):
        super(PositionalEncodingSinCos, self).__init__()
        self.dropout = nn.Dropout(p=dropout)

        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        pe = pe.unsqueeze(0).transpose(0, 1)
        self.register_buffer('pe', pe)

    def forward(self, x):
        x = x + self.pe[:x.size(0), :]
        return self.dropout(x)


class PositionalEncodingLUT(nn.Module):

    def __init__(self, d_model, dropout=0.1, max_len=250):
        super(PositionalEncodingLUT, self).__init__()
        self.dropout = nn.Dropout(p=dropout)

        position = torch.arange(0, max_len, dtype=torch.long).unsqueeze(1)
        self.register_buffer('position', position)

        self.pos_embed = nn.Embedding(max_len, d_model)

        self._init_embeddings()

    def _init_embeddings(self):
        nn.init.kaiming_normal_(self.pos_embed.weight, mode="fan_in")

    def forward(self, x):
        pos = self.position[:x.size(0)]
        x = x + self.pos_embed(pos)
        return self.dropout(x)
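One way this exact message can arise from the `forward` of `PositionalEncodingLUT`: if `max_len` is smaller than the first dimension of the input, the slice `self.position[:x.size(0)]` silently returns only `max_len` rows, and the addition that follows cannot broadcast. The sketch below reproduces that with `max_len=8` and an input of length 45. Both numbers are taken from the error message; whether the pretrained model is actually built with `max_len=8` is an assumption, not something confirmed in this thread.

```python
import torch
import torch.nn as nn

# Assumed values, chosen to match the numbers in the error message.
max_len, d_model = 8, 4
seq_len = 45

# Same buffers as in PositionalEncodingLUT.__init__ above.
position = torch.arange(0, max_len, dtype=torch.long).unsqueeze(1)  # shape (8, 1)
pos_embed = nn.Embedding(max_len, d_model)

x = torch.zeros(seq_len, 1, d_model)  # shape (45, 1, 4)
pos = position[:x.size(0)]            # slicing past the end: still shape (8, 1)

try:
    x + pos_embed(pos)                # (45, 1, 4) + (8, 1, 4) cannot broadcast
    message = ""
except RuntimeError as err:
    message = str(err)

print(pos.shape)
print(message)
```

If this is what happens in the model, the first dimension of the input (45 here) would have to be reduced to at most `max_len` before encoding, or the model would have to be built with a larger `max_len`.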

yxxshin · 2024-03-06T15:51:06Z

Same here. I found this issue while searching, so I'll add my case below. I hit a similar error using only the original code from latent_ops.ipynb (or elsewhere in the repository) and the provided datasets.

For example, when I ran this:

z = encode_icon("0")

I got this error: RuntimeError: The size of tensor a (10) must match the size of tensor b (8) at non-singleton dimension 0

As another example, when I ran this:

z = encode_icon("8")

I got this error: RuntimeError: stack expects each tensor to be equal size, but got [106] at entry 0 and [32] at entry 1

In most cases, though, this works perfectly.

I'm trying to figure this out too, so let's let each other know if either of us finds the cause!
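The 8 and the 32 in these two messages may be fixed size limits on the model input, one on the number of path groups and one on the length of each group. That is an assumption; nothing in this thread confirms it. If it holds, a check like the one below, run before encoding, would show which limit an icon exceeds. `find_limit_violations` and its parameters are hypothetical names, and obtaining the per-group command counts from an SVG is left to the caller.

```python
def find_limit_violations(group_lengths, max_groups, max_len):
    """Return human-readable descriptions of every exceeded limit.

    group_lengths: number of commands in each path group of one SVG.
    max_groups, max_len: assumed limits, here read off the error messages.
    """
    problems = []
    if len(group_lengths) > max_groups:
        problems.append(f"{len(group_lengths)} path groups > limit {max_groups}")
    for index, length in enumerate(group_lengths):
        if length > max_len:
            problems.append(f"group {index}: {length} commands > limit {max_len}")
    return problems


# Made-up group lengths for illustration only.
problems = find_limit_violations([106, 12], max_groups=8, max_len=32)
print(problems)
```

An empty list means the input fits within both assumed limits, so a mismatch would have to come from somewhere else.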

hecunren · 2024-03-07T02:54:22Z

I am very happy to see your reply. Most SVG icons downloaded from the Internet encode to a latent space vector z successfully, but those icons are composed of simple SVG path commands. When I try to encode a chip design drawing exported from CAD as SVG, I always hit the tensor mismatch.

I preprocessed my files with dataset.preprocess as the author explains, which produced the simplified SVGs in svgs_simplified as well as the svg_meta.csv file. In svg_meta.csv, the total_len and nb_groups of the chip design drawing are very large, reaching tens of millions, whereas for the icons that encode successfully they are very small, within 100.

I initially thought the CAD drawing was simply too complex to handle, because the model configuration file provided by the author sets initial values for total_len and nb_groups. However, encoding an extracted part of the CAD drawing also fails. Since you also get a tensor mismatch with the simple icon "8", the problem may have nothing to do with total_len and nb_groups being very large.
1. An SVG icon for which the latent space vector z is obtained successfully:

2. The preprocessing method provided by the author:

python -m dataset.preprocess --data_folder dataset/svgs/ --output_folder dataset/svgs_simplified/ --output_meta_file dataset/svg_meta.csv

3. Encoding an extracted part of the CAD drawing to get the latent space vector z still fails; its total_len and nb_groups:
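To compare total_len and nb_groups across many files at once, the rows of svg_meta.csv can be split by thresholds. The sketch below assumes the file has `total_len` and `nb_groups` columns (named in this thread) plus an `id` column (an assumption). The function name `split_by_limits` and the threshold values are placeholders, not settings taken from the model configuration.

```python
import csv
import io


def split_by_limits(meta_file, max_total_len, max_nb_groups):
    """Split metadata rows into ids within the limits and ids over them."""
    within, over = [], []
    for row in csv.DictReader(meta_file):
        fits = (int(row["total_len"]) <= max_total_len
                and int(row["nb_groups"]) <= max_nb_groups)
        (within if fits else over).append(row["id"])
    return within, over


# Made-up rows for illustration; real values come from dataset/svg_meta.csv.
sample = io.StringIO(
    "id,total_len,nb_groups\n"
    "icon_a,42,3\n"
    "cad_part,18250,970\n"
)
within, over = split_by_limits(sample, max_total_len=100, max_nb_groups=8)
print(within, over)
```

With a real file, the same call would be `split_by_limits(open("dataset/svg_meta.csv", newline=""), ...)`. If files listed in `within` still fail to encode, the cause is unlikely to be these two values alone.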

hecunren closed this as completed May 18, 2024
