Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not assume that index length and padding are always similar while merging #22

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

buddly27
Copy link

@buddly27 buddly27 commented Apr 5, 2016

I'm a bit confused with the usage of padding 0 and padding 1. In all my code I always assume that "%d" == "%01d" as it produces the same result, but you seem to have a different behaviour for both cases.

For instance, when it comes to match the members of a collection:

import clique

collection1 = clique.parse("my_sequence.%01d.exr [1-10]")
collection1.match('my_sequence.1.exr') # OK
collection1.match('my_sequence.10.exr') # Does not match...

collection2 = clique.parse("my_sequence.%d.exr [1-10]")
collection2.match('my_sequence.1.exr') # OK
collection2.match('my_sequence.100.exr') # OK

I do understand that the 1-padded collection is considered as a padded collection, and that seems to imply an implicit boundary which is defined by the length of the padding. But this seems odd to me as well... It seems more of an interpretation than a technical fact.

Especially because the item to match is already a members of an automatically generated collection!

import clique
collection = clique.parse("my_sequence.%03d.exr [990-1000]")
for member in  collection:
    collection.match(member) # Will not match at frame 1000...

I think it should be safer to imply that the index is not always equal to the padding length, and do the length comparison only if a "0" is found at the beginning of the index

if padded:
    if len(index) != self.padding:
        return None

Obviously this would change the behaviour of the "add" method as well as unpadded item outside padding bounds would now be accepted... but in the latest example, the "my_sequence.1000.exr" item has been already generated while creating the collection, so I don't think it is an issue...

What do you think?

@martinpengellyphillips
Copy link
Contributor

Practically, %01d corresponds to %d, but to me they are conceptually different - one represents a desired string width of 1 character whilst the other represents no target width at all.

You can see a further aspect of this distinction by considering the logic of:

>>> print '{:02d}'.format(1)
01
>>> print '{:02d}'.format(-1)
-1

I'll need to consider this more, but my gut says that the real bug is that your test case example parses at all as it contains contradicting information:

collection = clique.parse("my_sequence.%03d.exr [990-1000]")

Effectively you have specified a sequence that should be padded up to a width of 3, but contains indexes that do not conform to that max width, which I am tempted to call an invalid specification.

On the other hand I do understand your point about the whole padding thing not being well defined (anywhere!).

I wonder how your proposed changes would affect the assemble logic which tries to merge collections on padding boundaries.

@buddly27
Copy link
Author

buddly27 commented Apr 5, 2016

I understand the concept, seeing the padding as a boundary does make sense. Maybe you would prefer to take the issue on the other way and make the boundary compulsory at the sequence creation? So that these creation become impossible:

collection = clique.parse("my_sequence.%03d.exr [990-1000]")
collection = clique.Collection(
    head="my_sequence.", padding=3, tail='.exr', 
    index=range(990,1001)
)

Because I see very often "%01d" for unpadded sequences, a bit of pedagogy will be needed! Also I had to modify some code to be less picky on sequence matching because of 3-padded sequences going over 1000...

The issue is that most of the sequence parser don't seem to complain when ingesting a padded sequence out of the boundary, so getting more tolerant on the matching process seemed reasonable :)

The assemble function does not seem affected as you're not using Collection.match, so it would seem to be a good compromise, no?

@martinpengellyphillips
Copy link
Contributor

Yeah, making it invalid to cross the padding boundary (either through explicit constraint or documentation) might be most in line with original direction. As you say, will need to consider payoff between theory and practice here. The main thing is to make the behaviour consistent and clear, which currently it is not.

Whilst assemble itself does not call Collection.match directly it does perform some similar logic:

if len(str(abs(index))) == collection.padding:
    collection.indexes.add(index)

I guess that would also need to be considered for consistency with your changes.

@buddly27
Copy link
Author

buddly27 commented Apr 5, 2016

Agree! As you might have guessed this pull request is linked to Ftrack, as I modified the way we originally created sequence component to follow the ftrack way instead.

asset_version.create_component(
    path="/path/to/my_sequence.%d.exr [1-10]",
    data=data, location=location
)

As the Session object use clique.parse to create a clique.Collection and then match each member to name the sub-component, respecting the boundary rule becomes quite important. Maybe it would be interesting to be able to ingest directly a clique.Collection object to get it right before creating the asset.

This conversation does not belong here but I just wanted to give you the context!

Anyway, I'm sticking to the standard clique library for now.
Thanks for looking at it!

@martinpengellyphillips
Copy link
Contributor

Added bug #23

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants