Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

upload_datafile: handling of the content (mime) type #142

Open
wants to merge 26 commits into
base: main
Choose a base branch
from

Conversation

landreev
Copy link

@landreev landreev commented Jan 21, 2022

Any change needs to be discussed before proceeding. Failure to do so may result in the rejection of the pull request.

All Submissions

Describe your environment

  • OS: MacOS X 11.4
  • pyDataverse: 0.3.1
  • Python: 3.8.8
  • Dataverse: 5.9

Follow best practices

  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?
  • Have you followed the guidelines in our Contribution Guide?
  • Have you read the Code of Conduct?
  • Do your changes in a seperate branch. Branches MUST have descriptive names.
  • Have you merged the latest changes from upstream to your branch?

Describe the PR

There is currently no way to pass the content (mime) type to upload_datafile() (see #118). Also, when the multi-part POST form is created inside the method, NO content type is specified for the upload. This apparently fools Dataverse into defaulting to "text/plain", without attempting to use its normal type detection methods. In other words, in its current form, all files uploaded via pyDataverse end up with the content type "text/plain". Even when they are of types normally recognized by Dataverse (popular image types, etc). This defaulting behavior can and should be addressed on the Dataverse side. But it should be a good idea to fix it on the pyDataverse side as well. So this PR does 2 things:

  1. Provides a way to supply the mime type explicitly; and
  2. Makes it default to the standard application/octet-stream - a polite way to say "type unknown" - when creating a multi-part POST entry, like curl does; which then prompts Dataverse to at least attempt to identify the file more accurately. This is achieved by switching to the long notation of passing the file to the requests.post method: from {"file": open(filename, "rb")} to {"file": (filename, open(filename, "rb"), content_type)}.

On the Dataverse side this is tracked in IQSS/dataverse#8344

  • What kind of change does this PR introduce?
    • bug fix/improvement
  • Why is this change required? What problem does it solve?
    • see the description above and the discussion in the linked issues
  • Screenshots (if appropriate)
  • Put Closes #ISSUE_NUMBER to the end of this pull request

Testing

  • Have you used tox and/or pytest for testing the changes?
  • Did the local testing ran successfully?
  • Did the Continous Integration testing (Travis-CI) ran successfully?

Commits

  • Have descriptive commit messages with a short title (first line).
  • Use the commit message template
  • Put Closes #ISSUE_NUMBER in your commit messages to auto-close the issue that it fixes (if such).

Others

  • Is there anything you need from someone else?

Documentation contribution

  • Have you followed NumPy Docstring standard?

Code contribution

  • Have you used pre-commit?
  • Have you formatted your code with black prior to submission (e. g. via pre-commit)?
  • Have you written new tests for your changes?
  • Have you ran mypy on your changes successfully?
  • Have you documented your update (Docstrings and/or Docs)?
  • Do your changes require additional changes to the documentation?

@codecov
Copy link

codecov bot commented Feb 3, 2022

Codecov Report

Merging #142 (628dbb4) into develop (cc06022) will not change coverage.
The diff coverage is 50.00%.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop     #142   +/-   ##
========================================
  Coverage    50.91%   50.91%           
========================================
  Files            5        5           
  Lines         1316     1316           
========================================
  Hits           670      670           
  Misses         646      646           
Impacted Files Coverage Δ
src/pyDataverse/api.py 21.48% <50.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cc06022...628dbb4. Read the comment docs.

@skasberger
Copy link
Member

Update: I left AUSSDA, so my funding for pyDataverse development has stopped.

I want to get some basic funding to implement the most urgent updates (PRs, Bug fixes, maintenance work). If you can support this, please reach out to me. (www.stefankasberger.at). If you have feature requests, the same.

Another option would be, that someone else helps with the development and / or maintenance. For this, also get in touch with me (or comment here).

@pdurbin pdurbin linked an issue Feb 28, 2024 that may be closed by this pull request
@JR-1991
Copy link
Member

JR-1991 commented Feb 28, 2024

@landreev thanks for the reminder in #142 and this PR! I think the addition of this PR makes sense and enhances the utility of the library. We have been able to resolve a similar issue encountered when replacing files with #174, wherein we have switched to another library that does not send text/plain by default.

Can you sync this PR to the current main branch in order to review it?

@JR-1991 JR-1991 changed the base branch from develop to master March 13, 2024 15:28
@pdurbin
Copy link
Member

pdurbin commented Mar 27, 2024

@landreev hey, @JR-1991 and I talked about this PR a couple weeks ago (recording) and wanted to let you know that merge conflicts are due to pyDataverse switching from requests to httpx in #174

We suspect #118 may be fixed already (I just left a comment about this) but we think the ability to send an arbitrary MIME type might be a nice feature to have and we'd welcome a pull request if you have the time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

Add file-type to file upload
6 participants