Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streaming requests for Net::HTTP #604

Merged
merged 4 commits into from
Aug 30, 2017
Merged

Conversation

nightscape
Copy link
Contributor

This PR is a rebased version of #461 and #522 including an additional test and fix for empty initial chunks mentioned by @grosser in #461 (comment).
Tests are all green on my machine, so it would be great if someone (maybe @mislav or @iMacTia) could have a look at this 😃
The other two PRs can be closed in favor of this one.

@nightscape nightscape changed the title Streaming implementation for Net::HTTP Streaming requests for Net::HTTP Aug 30, 2016
@@ -26,6 +26,11 @@ def call(env)
# Queue requests for parallel execution.
if env[:parallel_manager]
env[:parallel_manager].add(request, http_method, request_config(env)) do |resp|
if (req = env[:request]).stream_response?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract into a method to avoid duplication ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm running in circles trying to find a nice way to extract a method here.
Do you have something in mind?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope ... could put it on the request or make it static ...

On Tue, Aug 30, 2016 at 3:11 PM, Martin Mauch notifications@github.com
wrote:

In lib/faraday/adapter/em_synchrony.rb
#604 (comment):

@@ -26,6 +26,11 @@ def call(env)
# Queue requests for parallel execution.
if env[:parallel_manager]
env[:parallel_manager].add(request, http_method, request_config(env)) do |resp|

  •        if (req = env[:request]).stream_response?
    

I'm running in circles trying to find a nice way to extract a method here.
Do you have something in mind?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/lostisland/faraday/pull/604/files/60cdd853545a06a19fd9cd771bbace1196cea141#r76889487,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAAsZz0RQFxpvFbCQALEXeVix6izqDMnks5qlKp7gaJpZM4JxAhc
.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure of it as well, but here are my 2 cents.
I personally like having a perform_request method on most of the adapters. I would suggest having a perform_stream_request method to manage stream logics and having the call method calling the proper "perform" based on a check with stream_response?.

The common logic will therefore be something like (extremely simplified):

def call(env)
  if req.stream_response?
    perform_stream_request
  else
    perform_request
  end
end

At this point, perform_stream_request can be implemented in the common Faraday::Adapter triggering the warning message and override this method only in the adapters where
you want to provide a proper implementation.
I really hope this is clear, but please do let me know if that's not the case!

Good work btw 👍!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still can't find a nice way to make this work.
If the call method you sketched above was located in Faraday::Adapter then sub-classes would need to have the same signature for perform_request which doesn't seem to be the case.
The same applies to perform_stream_request which - if I understood correctly - should delegate to perform_request unless overridden in a sub-class.
The only way I can currently think of to make this work is to copy the above code snippet into the Adapter sub-classes and have individual implementations of perform_streaming_request which would have a similar signature as perform_request.
But maybe I'm just overlooking something.

Would you be OK with merging even in the current unDRY state of the code?
Maybe someone comes up with a nicer solution later, I'm currently not seeing a proper way out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that's ok if we make tests work so we don't break everything later refactoring the code 😁

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Travis tests are working now 👍

@nightscape
Copy link
Contributor Author

Hmm, I'm rather clueless why Travis is failing. Tests are all green, but there are some cryptic warning: previous definition of xxxxx was here syntax warnings here https://travis-ci.org/lostisland/faraday/jobs/156358909#L216 and a lot of

ERROR WEBrick::HTTPStatus::ProxyAuthenticationRequired

at the end of the build.

@nightscape
Copy link
Contributor Author

Giving this a little more thought, I think a much nicer API would be a Faraday::Response#chunks method which returns an Enumerator. That way, you could achieve the same as with a callback using each on the Enumerator, but you could also map, select, ... on it and pass the Enumerator around like any object.
Trying this now...

@iMacTia
Copy link
Member

iMacTia commented Aug 31, 2016

@nightscape first of all thanks for this important contribution!
I've never had the need to use Faraday to read from a streaming so I'm not sure I will be able to contribute on this, but surely will come useful for future users :)

I've already commented on the repetition of the code with the warning with my suggestion to limit that issue and provide a common interface on all adapters (checking stream_response? on the call method and then delegating to the proper method).

Regarding the failing tests, I'll need a computer to see what's wrong and I won't have one in the next couple of weeks. @sbleon I suppose this is not the same issue you fixed last day?

Finally, the Enumerator seems also a good idea, but there's still something I don't really get...
Since we're talking about a stream, this means that the connection will be kept open and new chunks will be added to the response over time, so the response could potentially never end.
We should design this so that people can easily use it in most situations.

Just to make this easier to understand, can you show a couple of example on how you would use this streaming functionality?

@@ -1,4 +1,5 @@
require 'sinatra/base'
load 'test/helper.rb'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing the ruby syntax warnings about re-declarations are due to this line, as some of the things it's complaining about are defined in test/helper.rb. I'm pretty sure this test helper script shouldn't try to include the setup for the test suite itself.

I'm not sure why you have this load here at all. Is it because you want to use Faraday::TestCase.big_string? If so, you can just copy that method over to Faraday::LiveServer. I think it's too small to worry about the duplication.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, if it is complaining is probably because that file gets loaded twice. Maybe changing load with require will fix the issue at requiring the same file twice only loads it once

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might do it, but I still think you should bring in only what you need.
Maybe big_string could be moved into a module that can be used by both
test/helper and live_server?

On Wed, Aug 31, 2016 at 1:11 PM, Mattia notifications@github.com wrote:

In test/live_server.rb
#604 (comment):

@@ -1,4 +1,5 @@
require 'sinatra/base'
+load 'test/helper.rb'

Good point, if it is complaining is probably because that file gets loaded
twice. Maybe changing load with require will fix the issue at requiring
the same file twice only loads it once


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/lostisland/faraday/pull/604/files/f0ea17749ae5bb5777818efc0fa7afe711ecb7ae#r77029751,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAMIN115Z8_xL9DT5wfXQNOeZ6tJRTzgks5qlbW5gaJpZM4JxAhc
.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me 😄
Any DRY solution is welcome!

@nightscape nightscape force-pushed the streaming branch 3 times, most recently from 3dda408 to d216179 Compare September 5, 2016 21:24
@nightscape
Copy link
Contributor Author

I've pondered the Enumerator idea again, and I think it has one fundamental flaw: Control over the reading the response is given to the user of the Enumerator. Endless responses would not be so much of an issue (that's one of the typical use cases of Enumerators), but if a client just called first on the Enumerator and kept the Enumerator instance around it might be that the response and the connection remain open indefinitely (or until some timeout cuts the connection).
I might be wrong on this, but maybe it's better to continue with the original callback style which should not have this issue.

I've incorporated @sbleon's proposal of extracting the big_string method into a shared module, so except for the duplication of warning message creation (for which I can't find a good solution right now) this would be good to merge from my side.

What do you think, @iMacTia and @sbleon?

@iMacTia
Copy link
Member

iMacTia commented Sep 6, 2016

How is the documentation? Do we have a good explanation on how to use te callback to manage the streaming request?

Apologies again for not checking myself but doing everything with the mobile is kind of painful 😥

@nightscape
Copy link
Contributor Author

I just added a small example to the README.
Is there another place where documentation should be added?

@iMacTia
Copy link
Member

iMacTia commented Sep 7, 2016

Not as far as I know, README should be fine 😁
I'm ok with merging at this point. if @sbleon agrees as well, he can proceed 👍

@akshaylive
Copy link

Good PR! Looking forward to this one! 👍

@iMacTia
Copy link
Member

iMacTia commented Sep 16, 2016

@nightscape wait a sec, are warning still there even though you followed @sbleon advice? Did you remove the load test/helper.rb after extracting the big_string method?

@nightscape
Copy link
Contributor Author

Hi @iMacTia, the warnings due to big_string are solved.
What I was referring to above is that the code that generates warnings for adapters that don't support streaming is still duplicated because I haven't found a nice way to DRY it.
So from my side this is good to merge and maybe later someone comes up with a nice idea of how to DRY or improve the code.

@iMacTia
Copy link
Member

iMacTia commented Sep 16, 2016

I see, sorry @nightscape, reading again after some days I confused the build warnings with those ones 😅
So again, all good for me if @sbleon agreed as well. If he's still busy and can't merge, don't worry, I will merge early next as soon as I have a proper computer to check it 😉

if chunk.size > 0 || size > 0
yielded = true
size += chunk.bytesize
env[:request].on_data.call(chunk, size)
Copy link

@nurse nurse Sep 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In README.md it says the call's 2nd argument is chunk_size, but size here is total size.
Which do you intend?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch :) Which one would be preferred? Bytes or Unicode characters? I don't care either way.

Copy link
Member

@iMacTia iMacTia Sep 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what the documentation states is probably the best option in this case:

Set a callback which will receive tuples of chunk Strings
and the sum of characters received so far

as the chunk_size can be easily obtained calling chunk.bytesize, making it quite useless as parameter.
The only "wrong" thing here, I think, is the documentation name for the block parameter.
Maybe size rather than chunk_size will remove the confusion :)

Copy link

@nurse nurse Sep 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nightscape It must be bytes because HTTP response body may be other than Unicode or simple binary data.

@iMacTia yeah, I agree. just fixing README sounds reasonable.

size = 0
yielded = false
http_response = perform_request_with_wrapped_block(http, env) do |chunk|
if chunk.size > 0 || size > 0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunk.size should be chunk.bytesize.


# Check that the total size of the chunks (via the last size returned)
# is the same size as the expected_response
assert_equal sizes.last, expected_response.size
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected_response.size should be expected_response.bytesize.

@iMacTia
Copy link
Member

iMacTia commented Sep 27, 2016

@nightscape: @nurse highlighted a couple of things that are not completely clear, do you think you can have a look at them before we proceed? Thanks

@nightscape nightscape force-pushed the streaming branch 2 times, most recently from 9da67fd to 6b984fd Compare October 4, 2016 17:37
@nightscape
Copy link
Contributor Author

@nurse: I added two fixup commits. If those changes are what you meant and Travis gives us green light, I'll squash them into their corresponding commits.
I also rebased upon the latest master version.

@iMacTia iMacTia self-assigned this Nov 8, 2016
@iMacTia iMacTia added this to the 1.0 milestone Nov 8, 2016
@iMacTia iMacTia mentioned this pull request Nov 8, 2016
@nightscape
Copy link
Contributor Author

Hi @iMacTia,

that sounds reasonable. We'll be using our own fork or maybe just wait for 1.0 👍

@iMacTia
Copy link
Member

iMacTia commented Aug 18, 2017

@nightscape Coming back to this as part of the v1.0 works.
Have you been using this for the past months? Is there any change you would like to add before this gets merged?
Would you be able to rebase against v1.0 branch? (please note I've changed the PR base branch already

@nightscape
Copy link
Contributor Author

Hi @iMacTia,

because of shifting priorities and a partial move to Scala we haven't been using the code in the past months.
I can try to rebase the PR against 1.0, let's see if I still find my way around 😉

james-lawrence and others added 3 commits August 29, 2017 17:10
Under certain conditions, it seems that Net::HTTP first returns
an empty chunk before reading the actual data.
This commit makes sure that such chunks are filtered out for the
case where later chunks contain data.
If the response is completely empty, an empty chunk is returned
nonetheless so that clients can safely assume that their callback
will be called at least once.
@nightscape
Copy link
Contributor Author

Hi @iMacTia,
I just rebased the code (there was only one minor conflict in option handling).
Travis still says everything's good 😃

@iMacTia
Copy link
Member

iMacTia commented Aug 30, 2017

Hi @nightscape, thank you very much!
I really appreciate you finding some time to come back to this!
It was an highly requested feature so I'm happy to make this the first important merge of v1.0 👍
We'll probably need to do some refinement, but this looks like a great start!
Thanks

@iMacTia iMacTia merged commit ff94676 into lostisland:1.0 Aug 30, 2017
@nightscape nightscape deleted the streaming branch August 30, 2017 08:36
@nightscape
Copy link
Contributor Author

Thank you for looking into it and merging!

@grosser
Copy link
Contributor

grosser commented Aug 30, 2017

amazing, it finally happened :D

@iMacTia
Copy link
Member

iMacTia commented Aug 30, 2017

I never said this wouldn't happen 😉
I'm just sorry this took so long.
@grosser are you still interested in using this? It would be nice to have some real-world testing of these changes. Maybe not in production but on a side project 😄

@grosser
Copy link
Contributor

grosser commented Aug 30, 2017

no idea where/why I needed that :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants