Repeated images are written several times #2680

mustache1up · 2024-05-05T04:06:45Z

When using the same image in the document, each instance of the image results in a different file inside the generated docx file.

Documents that reuse an image therefore grow linearly with the number of times it is used.

Minimal example:

import * as fs from "fs";
import { Document, ImageRun, Packer, Paragraph } from "docx";

const imageBase64Data = "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8z8BQz0AEYBxVSF+FABJADveWkH6oAAAAAElFTkSuQmCC";

const doc = new Document({
    sections: [
        {
            children: [
                new Paragraph({
                    children: [
                        new ImageRun({
                            data: imageBase64Data, // image
                            transformation: {
                                width: 100,
                                height: 100,
                            },
                        }),
                    ],
                }),
                new Paragraph({
                    children: [
                        new ImageRun({
                            data: imageBase64Data, // same image
                            transformation: {
                                width: 200,
                                height: 200,
                            },
                        }),
                    ],
                }),
            ],
        },
    ],
});

Packer.toBuffer(doc).then((buffer) => {
    fs.writeFileSync("same_image_twice.docx", buffer); // two identical media files inside the docx
});

mustache1up · 2024-05-05T04:11:42Z

We could use the approach Git has used for a long time: a digest of the file contents serves as the uniqueId of the image.

Git uses SHA-1. I'll test whether it works well in the ImageRun class.
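A rough sketch of the idea, not the actual ImageRun change: the media key is derived from a SHA-1 digest of the image bytes, so identical data always maps to the same key and is stored once. `imageKey` and `MediaStore` are hypothetical names made up for illustration and are not part of the docx API. The only real API used is Node's built-in `crypto` module.

```typescript
import { createHash } from "crypto";

// Hypothetical helper (not part of docx): derive a stable file name
// from the image bytes, the way Git names blobs by their digest.
function imageKey(data: Buffer, extension: string = "png"): string {
    const digest = createHash("sha1").update(data).digest("hex");
    return `${digest}.${extension}`;
}

// Hypothetical stand-in for the media collection written into the docx.
class MediaStore {
    private readonly files = new Map<string, Buffer>();

    // Stores the bytes only if this content has not been seen before,
    // and returns the key that image references should point to.
    public add(data: Buffer, extension: string = "png"): string {
        const key = imageKey(data, extension);
        if (!this.files.has(key)) {
            this.files.set(key, data);
        }
        return key;
    }

    public get size(): number {
        return this.files.size;
    }
}

const png = Buffer.from(
    "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8z8BQz0AEYBxVSF+FABJADveWkH6oAAAAAElFTkSuQmCC",
    "base64",
);

const store = new MediaStore();
const first = store.add(png);
const second = store.add(Buffer.from(png)); // a separate copy of the same bytes

console.log(first === second); // both uses resolve to the same key
console.log(store.size); // only one file is kept
```

With content-based keys, the two ImageRun instances from the example above would reference a single media file, while each keeps its own transformation (100x100 and 200x200).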

mustache1up · 2024-05-05T07:49:11Z

Seems to work great. Adding tests to spec files in order to open a PR.

This was referenced May 5, 2024

Reuse images: change ImageRun keys to be based on image data content #2681

Open

Allow to specify image fileName and emu dimensions #849

Open
