Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix byte mode detection on BgzfWriter #4724

Merged
merged 2 commits into from
May 12, 2024

Conversation

DavidCain
Copy link
Contributor

In commit c4a47ff, there was an
attempt to detect handles which are not in binary mode. However, there's
a core problem here in that most handles open in write mode cannot be
read from. For example:

>>> handle = open('/tmp/demo.txt', 'wb')
>>> handle.read(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
io.UnsupportedOperation: read

This means that one cannot pass a plain writeable file object to
BgzfWriter:

>>> Bio.__version__
'1.83'
>>> Bio.bgzf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'Bio' has no attribute 'bgzf'
>>> from Bio import bgzf
>>> with open('/tmp/demo.txt', 'wb') as handle:
...     bgzf.BgzfWriter(fileobj=handle)
...
    if fileobj.read(0) != b"":
       ^^^^^^^^^^^^^^^
io.UnsupportedOperation: read

If eager detection of the handle's mode is desired, one can check
handle.mode as was done before -- I opt here to just hardcode handling
of io.StringIO. The simplest option in my opinion is to just rely on
ducktyping -- the _write_block() method will try to write the data --
either the handle accepts bytes and all is well, or it fails.

In commit c4a47ff, there was an
attempt to detect handles which are not in binary mode. However, there's
a core problem here in that most handles open in write mode *cannot* be
read from. For example:

```python
>>> handle = open('/tmp/demo.txt', 'wb')
>>> handle.read(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
io.UnsupportedOperation: read
```

This means that one cannot pass a plain writeable file object to
`BgzfWriter`:

```python
>>> Bio.__version__
'1.83'
>>> Bio.bgzf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'Bio' has no attribute 'bgzf'
>>> from Bio import bgzf
>>> with open('/tmp/demo.txt', 'wb') as handle:
...     bgzf.BgzfWriter(fileobj=handle)
...
    if fileobj.read(0) != b"":
       ^^^^^^^^^^^^^^^
io.UnsupportedOperation: read
```

If eager detection of the handle's mode is desired, one can check
`handle.mode` as was done before -- I opt here to just hardcode handling
of `io.StringIO`. The simplest option in my opinion is to just rely on
ducktyping -- the `_write_block()` method will try to write the data --
either the handle accepts `bytes` and all is well, or it fails.
@peterjc
Copy link
Member

peterjc commented May 10, 2024

Oh my, we missed that on #4270 - and therefore we're missing something on the test suite too.

@peterjc
Copy link
Member

peterjc commented May 10, 2024

Do you agree to dual licence this and any previous contributions under both the Biopython License Agreement AND the BSD 3-Clause License.

Also do you want to be named in the NEWS file etc?

@DavidCain
Copy link
Contributor Author

Do you agree to dual licence this and any previous contributions under both the Biopython License Agreement AND the BSD 3-Clause License.

Also do you want to be named in the NEWS file etc?

Yes, I agree to dual-license this work, and sure -- feel free to mention me in NEWS and such (I should already be in CONTRIB.rst!). Thanks, @peterjc

Copy link
Member

@peterjc peterjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge it in the next few days (leaving you a chance to update the NEWS file to add your name if you want). Thanks!

@DavidCain
Copy link
Contributor Author

Thanks, @peterjc -- just added!

NEWS.rst Show resolved Hide resolved
@peterjc peterjc merged commit 8088f20 into biopython:master May 12, 2024
35 checks passed
@DavidCain DavidCain deleted the dcain-bgzf-writer branch May 13, 2024 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants