AppProperties.Pages() doesn't always return the correct number of pages #496

Open
lpinto-ripcord opened this issue Jul 12, 2023 · 2 comments

Comments

@lpinto-ripcord

Description

I'm trying to get the page count of several Office documents. However, I've noticed that not all of them return the correct page count.

The documentation says: "Pages returns total number of pages which are saved by the text editor which produced the document. For unioffice created documents, it is 0." Could this be related?

If so, which method can I use to get a consistently correct count?

Expected Behavior

Document "2pages.docx" should print 2
Document "9pages.docx" should print 9

Actual Behavior

Document "2pages.docx" prints 0
Document "9pages.docx" prints 3

Code and Documents

2pages.docx
9pages.docx

package main

import (
	"fmt"
	"log"

	"github.com/unidoc/unioffice/common/license"
	"github.com/unidoc/unioffice/document"
)

func init() {
	// Make sure to load your metered License API key prior to using the library.
	// If you need a key, you can sign up and create a free one at https://cloud.unidoc.io
	err := license.SetMeteredKey("...")
	if err != nil {
		panic(err)
	}
}

func main() {
	docPath := "./Pages.docx"

	doc, err := document.Open(docPath)
	if err != nil {
		log.Fatalf("Error opening the document: %v", err)
	}
	// Release the resources held by the opened document when main returns.
	defer doc.Close()

	pageCount := doc.AppProperties.Pages()
	fmt.Println("Page count:", pageCount)
}

@github-actions

Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized,
other issues go into our backlog where they are assessed and fitted into the roadmap when suitable.
If you need to get this done, consider buying a license which also enables you to use it in your commercial products.
More information can be found on https://unidoc.io/

@3ace

3ace commented Apr 3, 2024

Hi @lpinto-ripcord

The actual page count can only be calculated reliably by an application that lays out the document, such as MS Word, because the document file itself contains no page information other than the value stored in the document properties, which is what the AppProperties.Pages method returns.

To work around this, UniOffice v1.31.0, released recently, introduces an experimental function, utils.GetNumPages, which calculates the page count by converting the DOCX file into a PDF document and counting the pages there.

You can use the new function like this:

doc, err := document.Open("9pages.docx")
if err != nil {
	log.Fatalf("error opening document: %s", err)
}
defer doc.Close()

fmt.Println("Total number of pages in the document from properties:", doc.AppProperties.Pages())

actualCount, err := utils.GetNumPages(doc)
if err != nil {
	log.Fatalf("error calculating page count: %s", err)
}

fmt.Println("Total number of pages in the document from calculation:", actualCount)

doc.AppProperties.SetPages(int32(actualCount))

fmt.Println("Total number of pages in the document from properties:", doc.AppProperties.Pages())

This function is currently marked as experimental because the conversion itself may produce an incorrect result.

Additionally, we introduced a new method, AppProperties.SetPages, to overwrite the page count stored in the document properties.

If the result of utils.GetNumPages is not suitable for your case, I suggest converting the DOCX to PDF yourself by some other means and then using UniPDF to get the page count.
