-
Notifications
You must be signed in to change notification settings - Fork 153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposals for improvements #1055
Comments
By reading documentation one more time I found solutions for some of my problems:
|
I have a problem.
How do I parse/build that structure?
On build:
Pseudocode: def parse():
...
offset1 = u32.parse()
...
seek(offset1)
x = X.parse()
def build():
...
offset1 = tell()
seek(offset1 + u32.sizeof())
...
offset2 = tell()
seek(offset1)
u32.build(offset2)
seek(offset2)
X.build(x) I tried to do it like this, it parses as intended, but builds incorrectly: c.Struct(
'__X_offset_offset' / c.Tell,
'__X_offset' / c.Default(u32, 0),
'somedata' / somedata, # variable length
c.Seek(c.this.__X_offset),
c.Pointer(c.this.__X_offset_offset, c.Default(u32, c.this.__X_offset)),
'x' / X,
...
) Is it possible to implement that behaviour without making custom class? c.Struct(
'X_offset' / Offset(u32),
'somedata' / somedata,
OffsetApply('X_offset', c.Tell),
'x' / X,
) |
I know about |
That is an interesting point. I will give it some thought. |
What can you say about things I proposed? Which ones do you like and which ones don't? |
Here is a library I was working on: It can be used like this: I decided to switch to |
I am busy atm. I will look at the links and tell you how you can contribute later today. 👍 |
Here is what I would like to work on myself:
Here is what I would like you to work on:
The .flagbuildnone is a way for a subcon parsing class to tell its wrapper parsing class that it does not require a build value (the thing you pass into .build method). If you do .build(None) it promises to build just fine. Look at this... if the subcon has a buildfromnone set, it is given either a dict entry or a None, but otherwise its a straight error. Line 261, Converted looks a lot like Const parsing class. |
There is a couple of places where example show that >>> st.parse(b"\x03\x01\x02\x03")
-Container(count=3, items=[1, 2, 3])
+Container(count=3, items=ListContainer([1, 2, 3]))
>>> seq.parse(b"\x00\x80lalalaland\x00\x00\x00\x00\x00")
-[128, 'lalalaland', b'\x00\x00\x00\x00']
+ListContainer([128, 'lalalaland', b'\x00\x00\x00\x00'])
>>> d.parse(b"dsadhsaui")
-[100, 115, 97, 100, 104, 115, 97, 117, 105]
+ListContainer([100, 115, 97, 100, 104, 115, 97, 117, 105]) Also, in the example about Btw, can I reach you on some other platform? Maybe discord? That would be nice. |
No, I am using Issues only, for reasons I dont want to discuss in public. If you really have a need then my email is publicly available. But please dont use it unless its really private matters. Thank you for bringing those 2 issues (ListContainer and sizeof) to my attention. If you would submit a PR with corrections I would be more than happy to merge it. |
>>> VersionNumber.build(88)
Traceback (most recent call last):
...
raise ValidationError("object failed validation: %s" % (obj,), path=path)
construct.core.ValidationError: Error in path (building)
object failed validation: 88 and not just applies also to
>>> s = Struct('a' / Flag, 'b' / Flag)
>>> dict(s.parse(b'\1\1'))
{'_io': <_io.BytesIO object at 0x000001BC0EFB1670>, 'a': True, 'b': True}
>>> dict(s.compile().parse(b'\1\1'))
{'a': True, 'b': True}
class Container(dict):
def __new__(cls, *args, **kwargs):
self = super().__new__(cls, *args, **kwargs)
self.__dict__ = self # <- this is the most important part
return self
x = Container()
x.a = 42
assert x['a'] == 42
x['b'] = 24
assert x.b == 24
try:
x.t
except AttributeError:
pass # this should happen
try:
x['t']
except KeyError:
pass # this should happen
# you can do this:
x[5] = 6
# but this will not be accessible through attribute, obviously This reduces the overhead of attribute access virtually to zero, now they work just like with any other object attribute access. There is no need in running Also, it looks like |
>>> Container(x=EnumIntegerString.new(5, 'a'))
Container(x=uEnumIntegerString.new(5, 'a'))
# ^ notice 'u' prefix here I suggest removing
>>> RepeatUntil(list_[-1] == 0, Byte).parse(b"aioweqnjkscs\x00")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "d:\github\construct\construct\core.py", line 404, in parse
return self.parse_stream(io.BytesIO(data), **contextkw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\core.py", line 416, in parse_stream
return self._parsereport(stream, context, "(parsing)")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\core.py", line 428, in _parsereport
obj = self._parse(stream, context, path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\core.py", line 2678, in _parse
if predicate(e, obj, context):
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\expr.py", line 160, in __call__
lhs = self.lhs(obj) if callable(self.lhs) else self.lhs
^^^^^^^^^^^^^
File "d:\github\construct\construct\expr.py", line 217, in __call__
return self.__parent(*args)[self.__index]
^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\expr.py", line 215, in __call__
return args[1]
~~~~^^^
IndexError: tuple index out of range not sure what happened, i guess
# at runtime:
construct.core.CheckError: Error in path (parsing) -> integrity1
check failed during parsing
# in docs:
ValidationError: check failed during parsing
# i fixed it to this fow now:
CheckError: check failed during parsing i guess everything that was raising you might find exception notes interesting in that regard: https://docs.python.org/3.12/library/exceptions.html?highlight=__notes__#BaseException.add_note
>>> d = FocusedSeq(1 or "num",
... Const(b"MZ"),
... "num" / Byte,
... Terminated,
... )
>>> d.parse(b"MZ\xff")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "d:\github\construct\construct\core.py", line 404, in parse
return self.parse_stream(io.BytesIO(data), **contextkw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\core.py", line 416, in parse_stream
return self._parsereport(stream, context, "(parsing)")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\core.py", line 428, in _parsereport
obj = self._parse(stream, context, path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\core.py", line 3238, in _parse
return finalret
^^^^^^^^
UnboundLocalError: cannot access local variable 'finalret' where it is not associated with a value
>>> d.build(255)
Traceback (most recent call last):
File "d:\github\construct\construct\core.py", line 1165, in _build
data = struct.pack(self.fmtstr, obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
struct.error: required argument is not an integer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "d:\github\construct\construct\core.py", line 452, in build
self.build_stream(obj, stream, **contextkw)
File "d:\github\construct\construct\core.py", line 464, in build_stream
self._build(obj, stream, context, "(building)")
File "d:\github\construct\construct\core.py", line 3246, in _build
buildret = sc._build(obj if sc.name == parsebuildfrom else None, stream, context, path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\core.py", line 2774, in _build
return self.subcon._build(obj, stream, context, path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\github\construct\construct\core.py", line 1167, in _build
raise FormatFieldError("struct %r error during building, given value %r" % (self.fmtstr, obj), path=path)
construct.core.FormatFieldError: Error in path (building) -> num
struct '>B' error during building, given value None
# example:
>>> d = BitsInteger(16, swapped=this.swapped)
>>> d.build(1, swapped=True)
b'\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
# ^^
# my result:
>>> d = BitsInteger(16, swapped=this.swapped)
>>> d.build(1, swapped=True)
b'\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00'
# ^^
>>> d.build(1, swapped=False)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01'
# ^^ |
Holy cow, you certainly dug up some dirt there. It will take me some time to get back to you on this, sorry. 😞 Feel free to post even more if you want. Just... (whispers) dont overdo it... |
I am going through your changes and adding lots of mine, so please do NOT add more commits, not unless you want to merge them yourself later. It is going to take me 1-2 more days to sift through everything you committed. |
Do you like the idea of making P.S: if you merge two open PRs, i can continue working without worrying about possible merge conflicts. |
No, namespaces are a no go. I will get on top of your PRs later today. |
I found this in 2.10 changelog:
|
so example is just wrong, ok, it should be updated |
I was reading your changes in the documentation and i have several little notes:
>>> from construct import *
>>> c = Computed(42)
>>> s = Struct('a' / c, 'b' / c) + (Struct('c' / c, 'd' / c) + Struct('e' / c, 'f' / c))
>>> s.parse(b'')
Container(a=42, b=42, c=42, d=42, e=42, f=42) # <- no nesting
Everything else looks fine. |
3rd point, YES you right of course, its not a valid syntax. I will correct it. |
@denballakh the I would love to know if it improves your use case or breaks it... |
slots in new versions are no longer faster than regular attrs, so there would be no perfomance increase |
I dont have a break down on the speed effect of each and every optimisation on this pull request. But on my workload it improved about 3x (running on pypy), in real world application... There are user visible changes though.... |
Oh right. I didn't consider pypy at all, i was talking about cpython3.12. |
@denballakh Do you have a source for that statement. That's news to me. |
After thinking about it for a while, i came up with this optimisation.: Instead of going __ slots __ (which has problems with names like 1, a-b etc...) it goes dict, and only converts to Container when necessary... This achieves a significant speed gain but is much less intrusive than using __ slots __. |
Hi! Your project is awesome! I made very similar thing myself for my project, and then found
construct
.I have several proposals and questions (they are at the end).
Here they are:
New features:
add support for recursive constructs. Right now I am doing this:But this doesn't feel like a good way. Also, such constructs are not compile'ableconstruct
at runtime (or do, but doesn't require recompiling every time)u8
/i32
/f64
foruint8
/int32
/double
. But there is a problem: now there is no way to specify endiannes, so making new namespace for them might be good:construct.le.u32
,construct.be.f64
,from construct.le import *
.parse
result into ordinary pythondict
/list
's. Right now they are not json-serializable because of._io
fieldMicro-optimizations:
__slots__
to reduce memory usage and (possibly) speed up some hot attribute lookups(stream, context, path)
are passed everywhere, but are rarely accessed. Maybe we can store some of them in some kind of global storage (probably thread-local)?Python versions:
Tooling:
black
'ify codeDocumentation:
s = X or Y
is confusing)Other:
Questions:
.flagbuildnone
? I can't figure out from source code what it does.what is the common way to make construct that does nothing to stream and returnsNone
? I found out thatComputed(None)
does exactly that, but something likeconstruct.Nothing
would be nice to have.The text was updated successfully, but these errors were encountered: