James.Testing.Pdf

Todd Meinershagen edited this page Jul 29, 2016 · 27 revisions

James.Testing.Pdf is a library for interacting with PDF documents and their content. It is designed to be simple enough for QA folks to use in their own automation, yet powerful enough for multi-threaded scenarios such as performance testing.

To add it to your project, run the following in the NuGet Package Manager Console.

>install-package James.Testing.Pdf

Source can be found here. The library supports the following features. Click on each link for a description of the supported features. There will be more coming in the future!

Note: As of the 1.0.3 release, the library depends on Spire.Pdf instead of iTextSharp because of licensing restrictions on iTextSharp. Spire.Pdf is a commercial library, but its free version allows working with documents of up to 10 pages. If you need to work with larger documents, please open an issue so that we can release a separate version based on iTextSharp.

Loading Content

All loading of content begins with the Content class, which exposes a fluent syntax from that point on, so a novice coder never has to new() up any objects. You can load content by passing in a file path, a byte array, or a stream.

Loading Content from File Path

Content
	.From(@"c:\sample.pdf")
	.Verify(d => d.IsPdf() == true);

Loading Content from Byte Array

Content
	.From(File.ReadAllBytes(@"c:\sample.pdf"))
	.Verify(d => d.IsPdf() == true);

Loading Content from File Stream

using (var stream = new FileStream(@"c:\sample.pdf", FileMode.Open, FileAccess.Read))
{
	Content
		.From(stream)
		.Verify(d => d.IsPdf() == true);
}
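
The three entry points above take different inputs for the same underlying content. As a rough illustration of why they are interchangeable, the sketch below reduces a path, a byte array, and a stream to the same byte array using only System.IO. PdfBytes and its methods are hypothetical names made up for this example; they are not part of James.Testing.Pdf and say nothing about how the library loads content internally.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of James.Testing.Pdf.
public static class PdfBytes
{
    // Reads the whole file at the given path into memory.
    public static byte[] FromPath(string path)
    {
        return File.ReadAllBytes(path);
    }

    // A byte array is already in the target form; only guard against null.
    public static byte[] FromBytes(byte[] bytes)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));
        return bytes;
    }

    // Copies the rest of the stream into a buffer. The caller still owns
    // (and disposes) the stream that was passed in.
    public static byte[] FromStream(Stream stream)
    {
        if (stream == null) throw new ArgumentNullException(nameof(stream));
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

Because the stream variant leaves disposal to the caller, wrapping the stream in a using block, as in the file stream example, remains the caller's job.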

Verifying Content

Once content has been loaded, you can verify things about that content.

Verifying Document is a PDF

Content
	.From(@"c:\sample.pdf")
	.Verify(d => d.IsPdf() == true);

Notice that the Verify() method from the James.Testing library is being used to check the IsPdf() method of the content that is returned after loading. You don't have to use the Verify() method. If you want to use another library such as Fluent Assertions, you could write the same thing as follows:

var content = Content
	.From(@"c:\sample.pdf")
	.VerifyThat(d => d.IsPdf().Should().BeTrue());

For all of the verifications below, you can use the same technique. There is nothing in the James.Testing.Pdf library that forces assertions via the Verify() method.

Verifying Document is a PDF with Version

You can also check for the version of the .pdf document.

Content
	.From(@"c:\sample.pdf")
	.Verify(d => d.IsPdf(1.0) == true);
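
For background on what a version check can look at: a well-formed PDF file starts with an ASCII header of the form %PDF-x.y, for example %PDF-1.4. The sketch below reads that header with plain .NET calls. PdfHeader.ReadVersion is a hypothetical helper written for this page, and it is an assumption, not a documented fact, that IsPdf() inspects the same bytes.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration only; not part of James.Testing.Pdf.
public static class PdfHeader
{
    // Returns the version from a leading "%PDF-x.y" header, or null when
    // the bytes do not start with such a header.
    public static double? ReadVersion(byte[] bytes)
    {
        const string marker = "%PDF-";
        const int versionLength = 3; // "x.y"

        if (bytes == null || bytes.Length < marker.Length + versionLength)
        {
            return null;
        }

        // The header is plain ASCII, so decoding the first 8 bytes is enough.
        var header = Encoding.ASCII.GetString(bytes, 0, marker.Length + versionLength);
        if (!header.StartsWith(marker, StringComparison.Ordinal))
        {
            return null;
        }

        double version;
        var versionText = header.Substring(marker.Length); // e.g. "1.4"
        return double.TryParse(versionText, NumberStyles.AllowDecimalPoint,
                               CultureInfo.InvariantCulture, out version)
            ? version
            : (double?)null;
    }
}
```

Parsing with the invariant culture keeps the result stable on machines whose regional settings use a comma as the decimal separator.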

Verifying Document Has Number of Pages

You can also check for the number of pages contained within the document.

Content
	.From(@"c:\sample.pdf")
	.Verify(d => d.Has(2).Pages == true);

Verifying Page Contains Content

You can also check whether a certain page contains specific content. The content to look for can currently be a string or a number.

Content
	.From(@"c:\sample.pdf")
	.Verify(d => d.Page(1).Contains("Net Charges"))
	.Verify(d => d.Page(1).Contains(1.25));

Verifying a Page's Text

In addition, you can call Text() on a given page within the document and run any check that you could run against a string.

Content
	.From(@"c:\sample.pdf")
	.Verify(d => d.Page(1).Text().StartsWith("Net Charges"));
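
Since the page text is an ordinary string, regular expressions work on it as well. The sketch below pulls the first two-decimal amount that follows a label out of a block of text. PageTextChecks.AmountAfter is a hypothetical helper, and the pageText argument is assumed to hold whatever Text() returned for the page; the actual layout of extracted text may differ from the sample used here.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of James.Testing.Pdf.
public static class PageTextChecks
{
    // Finds the first amount with two decimals that follows the label,
    // skipping any non-digit characters in between (": $", spaces, ...).
    // Returns null when the label or the amount is missing.
    public static decimal? AmountAfter(string pageText, string label)
    {
        var pattern = Regex.Escape(label) + @"\D*?(\d+\.\d{2})";
        var match = Regex.Match(pageText, pattern);

        return match.Success
            ? decimal.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture)
            : (decimal?)null;
    }
}
```

A test could then compare the extracted value with the expected one, for example checking that AmountAfter(text, "Net Charges") equals 1.25m.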

Thread-Level Context for Content

Once you have loaded content, you can refer to that content in other methods by calling the Current() method.

Accessing Content in Multiple Methods

[SetUp]
public void given_existing_file()
{
	Content
		.From(@"c:\sample.pdf");
}

[Test]
public void when_checking_for_pdf_should_be_true()
{
	Content
		.Current()
		.Verify(d => d.IsPdf() == true);
}

[Test]
public void when_checking_has_3_pages_should_be_true()
{
	Content
		.Current()
		.Verify(d => d.Has(3).Pages() == true);
}

Accessing Content from Multiple Threads

The James.Testing.Pdf library is thread-safe. In the example below, two actions are prepared, each loading a different file. The actions can run in parallel on different threads, and each maintains its own content. This is particularly useful in performance or load testing scenarios.

Action action1 = () =>
{
	Content
		.From(@"c:\sample1.pdf");

	Content
		.Current()
		.Verify(d => d.IsPdf() == true);
};

Action action2 = () =>
{
	Content
		.From(@"c:\sample2.pdf");

	Content
		.Current()
		.Verify(d => d.IsPdf() == true);
};

Parallel.Invoke(action1, action2);
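
One common way to get this kind of per-thread "current" value in .NET is ThreadLocal<T>. The sketch below shows the general technique only. CurrentFile and ThreadLocalDemo are hypothetical names, and it is an assumption that the library uses anything similar internally. Explicit threads and a Barrier are used instead of Parallel.Invoke so that both threads are guaranteed to have stored a value before either one reads it back.

```csharp
using System.Threading;

// Hypothetical stand-in for a per-thread "current" value;
// not part of James.Testing.Pdf.
public static class CurrentFile
{
    // ThreadLocal<T> gives every thread its own independent slot.
    private static readonly ThreadLocal<string> Slot = new ThreadLocal<string>();

    public static void Set(string path) { Slot.Value = path; }
    public static string Get() { return Slot.Value; }
}

public static class ThreadLocalDemo
{
    // Runs two threads that each store a different value, wait for each
    // other, and then read the value back.
    public static string[] Run()
    {
        var seen = new string[2];
        using (var barrier = new Barrier(2))
        {
            var first = Start(0, "sample1.pdf", seen, barrier);
            var second = Start(1, "sample2.pdf", seen, barrier);
            first.Join();
            second.Join();
        }
        return seen;
    }

    private static Thread Start(int index, string path, string[] seen, Barrier barrier)
    {
        var thread = new Thread(() =>
        {
            CurrentFile.Set(path);
            barrier.SignalAndWait();         // both threads have stored a value
            seen[index] = CurrentFile.Get(); // each thread still sees its own
        });
        thread.Start();
        return thread;
    }
}
```

With a plain static field in place of the ThreadLocal<string>, both threads would read whichever value was written last, which is the kind of interference a thread-level context avoids.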