RuntimeError: The size of tensor a (45) must match the size of tensor b (8) at non-singleton dimension 0 #43

Closed
hecunren opened this issue Mar 6, 2024 · 2 comments

Comments


hecunren commented Mar 6, 2024

Hello @alexandre01, thank you for this work. I am trying to use your model to obtain the latent vector z of an SVG image, and I combined some of the module interfaces you provide to do so. With a simple SVG icon downloaded from the Internet, I can obtain z. With an SVG produced by CAD conversion, however, the encode interface fails with a tensor dimension mismatch. The SVG, the code, and the error message are below. I don't know why this happens; could you help me solve it?
1. The SVG image obtained from the CAD conversion, which I need to process, is as follows:
kuangkuang

2. The code to extract the latent vector z is as follows:

import os
os.chdir("..")
#%%
from deepsvg.svglib.svg import SVG

from deepsvg import utils
from deepsvg.difflib.tensor import SVGTensor
from deepsvg.svglib.utils import to_gif
from deepsvg.svglib.geom import Bbox
from deepsvg.svgtensor_dataset import SVGTensorDataset, load_dataset, SVGFinetuneDataset
from deepsvg.utils.utils import batchify, linear

import torch
import numpy as np
from torch.utils.data import DataLoader
import torch.nn as nn
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
from IPython.display import display

#%%
pretrained_path = "./pretrained/hierarchical_ordered.pth.tar"
from configs.deepsvg.hierarchical_ordered import Config
print(0)
cfg = Config()
cfg.model_cfg.dropout = 0.  # for faster convergence
model = cfg.make_model().to(device)
utils.load_model(pretrained_path, model)
model.eval()
dataset = load_dataset(cfg)
print(1)
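def encode(data):
    # Batch the inputs named in cfg.model_args and run the model in encode mode.
    model_args = batchify((data[key] for key in cfg.model_args), device)
    with torch.no_grad():
        z = model(*model_args, encode_mode=True)
        return z


def encode_svg(svg):
    # Build the model inputs for this SVG via the dataset, then encode them.
    data = dataset.get(svg=svg)
    return encode(data)


def load_svg(filename):
    # Load an SVG file, then apply the dataset's simplify and preprocess steps.
    svg = SVG.load_svg(filename)
    svg = dataset.simplify(svg)
    svg = dataset.preprocess(svg, mean=True)
    return svg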
print(2)
lego = load_svg("C:/Users/15653/deepsvg/dataset/test/kuangkuang.svg")


print(3)
print(lego)
z = encode_svg(lego)

print(z)
3. The code where the error occurs is as follows:
import math
import torch
import torch.nn as nn


class PositionalEncodingSinCos(nn.Module):
    def __init__(self, d_model, dropout=0.1, max_len=250):
        super(PositionalEncodingSinCos, self).__init__()
        self.dropout = nn.Dropout(p=dropout)

        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        pe = pe.unsqueeze(0).transpose(0, 1)
        self.register_buffer('pe', pe)

    def forward(self, x):
        x = x + self.pe[:x.size(0), :]
        return self.dropout(x)


class PositionalEncodingLUT(nn.Module):

    def __init__(self, d_model, dropout=0.1, max_len=250):
        super(PositionalEncodingLUT, self).__init__()
        self.dropout = nn.Dropout(p=dropout)

        position = torch.arange(0, max_len, dtype=torch.long).unsqueeze(1)
        self.register_buffer('position', position)

        self.pos_embed = nn.Embedding(max_len, d_model)

        self._init_embeddings()

    def _init_embeddings(self):
        nn.init.kaiming_normal_(self.pos_embed.weight, mode="fan_in")

    def forward(self, x):
        pos = self.position[:x.size(0)]
        x = x + self.pos_embed(pos)
        return self.dropout(x)

error
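
The mismatch in the title can be reproduced in isolation, which may help narrow it down. The sketch below uses a trimmed copy of PositionalEncodingLUT from the listing above (dropout omitted) and assumes the layer was built with max_len=8. That value is only read off the error message; which config setting it comes from is a guess that this thread does not confirm. The helper try_encode is hypothetical.

```python
import torch
import torch.nn as nn


class PositionalEncodingLUT(nn.Module):
    """Trimmed copy of the class quoted above (dropout omitted)."""

    def __init__(self, d_model, max_len=250):
        super().__init__()
        position = torch.arange(0, max_len, dtype=torch.long).unsqueeze(1)
        self.register_buffer("position", position)
        self.pos_embed = nn.Embedding(max_len, d_model)

    def forward(self, x):
        # Slicing past max_len does not raise: it returns only max_len rows.
        pos = self.position[:x.size(0)]
        return x + self.pos_embed(pos)


def try_encode(seq_len, max_len=8, d_model=16):
    """Hypothetical helper: None on success, else the RuntimeError message."""
    layer = PositionalEncodingLUT(d_model, max_len=max_len)
    x = torch.zeros(seq_len, 1, d_model)  # (sequence, batch, features)
    try:
        layer(x)
        return None
    except RuntimeError as err:
        return str(err)


# max_len=8 is an assumption taken from "tensor b (8)" in the error message.
ok = try_encode(seq_len=8)
too_long = try_encode(seq_len=45)
print(ok)
print(too_long)
```

If this is indeed what happens inside the model, any input with more than max_len entries along dimension 0 would fail at this addition, because the sliced position buffer can never be longer than max_len.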


yxxshin commented Mar 6, 2024

Same here. I found this issue while searching, so I'll add my case below. I hit a similar error using only the original code from latent_ops.ipynb (or somewhere else) and the provided datasets.

For example, when I ran this:

z = encode_icon("0")

I got this error: RuntimeError: The size of tensor a (10) must match the size of tensor b (8) at non-singleton dimension 0

Screenshot 2024-03-07 at 12 43 44 AM

As another example, when I ran this:

z = encode_icon("8")

I got this error: RuntimeError: stack expects each tensor to be equal size, but got [106] at entry 0 and [32] at entry 1

Screenshot 2024-03-07 at 12 44 14 AM

In most cases, though, this works perfectly.
Screenshot 2024-03-07 at 12 45 18 AM
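
Both failures above involve an input that is larger than some fixed size (10 vs 8, and 106 vs 32). A pre-flight check along these lines might turn them into a clearer message before the data reaches the model. This is only a sketch: the Limits class, its field names, and check_fits are hypothetical, and the defaults 8 and 32 are copied from the two error messages, not from the DeepSVG config.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    # Both defaults are assumptions read off the error messages in this thread.
    max_num_groups: int = 8   # from "tensor b (8)"
    max_group_len: int = 32   # from "[32] at entry 1"


def check_fits(group_lens, limits=None):
    """Return a list of human-readable problems; an empty list means it fits.

    group_lens holds one entry per path group: the number of commands in it.
    """
    limits = limits or Limits()
    problems = []
    if len(group_lens) > limits.max_num_groups:
        problems.append(
            f"{len(group_lens)} path groups > {limits.max_num_groups}"
        )
    for index, length in enumerate(group_lens):
        if length > limits.max_group_len:
            problems.append(
                f"group {index} has {length} commands > {limits.max_group_len}"
            )
    return problems


too_many_groups = check_fits([12] * 10)
too_long_group = check_fits([106, 20])
fits = check_fits([20] * 4)
print(too_many_groups, too_long_group, fits)
```

How the per-group lengths are obtained depends on the preprocessing in use, so that step is left out here.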

I'm trying to figure this out too, so let's keep each other posted!


hecunren commented Mar 7, 2024

I am very happy to see your reply. Most SVG icons downloaded from the Internet can be encoded to a latent vector z successfully, but those icons consist of simple SVG path commands. When I try to get the latent vector of a chip design drawing exported from CAD as SVG, I always hit the tensor mismatch. I preprocessed my files with dataset.preprocess as the author explains, which gives me the simplified SVGs in svgs_simplified and the svg_meta.csv file. In svg_meta.csv, the total_len and nb_groups of the chip design drawing are very large, reaching tens of millions, whereas for icons that encode successfully they are very small, within 100. I initially thought the CAD drawing was simply too complex to handle, because the author's model configuration file sets values for total_len and nb_groups. However, encoding also fails when I extract only a part of the CAD drawing. Since you also get a tensor mismatch with a simple digit "8", the problem may have nothing to do with total_len and nb_groups being very large.
1. An SVG icon for which the latent vector z is obtained successfully:
rain
2. The preprocessing method provided by the author:

python -m dataset.preprocess --data_folder dataset/svgs/ --output_folder dataset/svgs_simplified/ --output_meta_file dataset/svg_meta.csv

3. Extracting part of the CAD drawing to get the latent vector z still fails, with these total_len and nb_groups:
kuangkuang
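
The svg_meta.csv comparison described in this comment could be automated roughly as follows. The columns total_len and nb_groups are the ones named above; the id column, the sample rows, and the thresholds are made-up placeholders for illustration, and the function name oversized_rows is hypothetical.

```python
import csv
import io

# Made-up sample rows; a real svg_meta.csv would be opened with open(...).
SAMPLE_META = """id,total_len,nb_groups
icon_a,42,5
cad_b,12000000,45
"""


def oversized_rows(csv_file, max_total_len, max_num_groups):
    """Return (id, total_len, nb_groups) for rows exceeding either threshold."""
    flagged = []
    for row in csv.DictReader(csv_file):
        total_len = int(row["total_len"])
        nb_groups = int(row["nb_groups"])
        if total_len > max_total_len or nb_groups > max_num_groups:
            flagged.append((row["id"], total_len, nb_groups))
    return flagged


# Placeholder thresholds; the real limits should be read from the model config.
flagged = oversized_rows(
    io.StringIO(SAMPLE_META), max_total_len=50, max_num_groups=8
)
print(flagged)
```

A listing like this only shows which files exceed the chosen thresholds; it does not by itself confirm that those thresholds are what triggers the tensor mismatch.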
