PHP Guzzle 6 – accessing initial request data from a Guzzle Response object
Guzzle is one of the most popular actively maintained PHP libraries for making HTTP requests. It's good for firing simple requests one at a time, but once I need more than that, I find myself scrambling through the documentation and reading the source code. The documentation site could use more examples.
One common task I do with web crawlers is firing requests concurrently. This works fine if you are only interested in the responses. If you also need to know which request a response came from, it gets problematic, because the fulfilled callback only receives the Response object. The callback also receives an index, which we could use to map back to the original request, but the documentation is not clear about where it comes from. Even if the index is helpful, I find mapping back to an array to get the original request a lame workaround.
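For reference, here is a minimal sketch of that index-based workaround. As far as I can tell from the Pool source, the index is simply the key of the iterable you pass in. The URLs and the 'home'/'about' keys are made up for illustration, a MockHandler stands in for the network, and Guzzle is assumed to be installed through Composer.

```php
<?php
require 'vendor/autoload.php'; // assumes Guzzle was installed with Composer

use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Psr7\Response;

// Canned responses so the sketch runs without network access.
$mock = new MockHandler([new Response(200), new Response(200)]);
$client = new Client(['handler' => HandlerStack::create($mock)]);

// Hypothetical requests; the array keys are what arrive as $index below.
$requests = [
    'home'  => new Request('GET', 'http://example.com/'),
    'about' => new Request('GET', 'http://example.com/about'),
];

$seen = [];
$pool = new Pool($client, $requests, [
    'concurrency' => 2,
    'fulfilled' => function ($response, $index) use ($requests, &$seen) {
        // Map the index back to the original request by hand.
        $seen[$index] = (string) $requests[$index]->getUri();
    },
    'rejected' => function ($reason, $index) {
        // $index identifies the failed request in the same way.
    },
]);
$pool->promise()->wait();
```

This works, but it means keeping the whole request array around just to look things up again.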
To solve this problem we need to use generators and promises. In this solution, and in Guzzle generally, a generator is a fancy way to hand the Pool a new request whenever it has room below the maximum concurrency. In my project I have moved request creation into a separate class method, and I use array_pop in a while loop to keep pulling requests from my own request manager class.
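The array_pop loop can be sketched in plain PHP, without Guzzle, to show that a generator only takes an item when asked for one. The $queue array here is a hypothetical stand-in for my request manager, and it holds URL strings where the real code would hold requests.

```php
<?php
// Hypothetical stand-in for the request manager: a plain queue of URLs.
$queue = [
    'http://example.com/a',
    'http://example.com/b',
    'http://example.com/c',
];

$requests = function () use (&$queue) {
    // array_pop() returns null once the queue is empty, which ends the loop.
    while (($url = array_pop($queue)) !== null) {
        yield $url;
    }
};

$generator = $requests();

// Asking for the current value runs the generator up to its first yield,
// so exactly one item has been popped and the rest are still queued.
$first = $generator->current();
$remaining = count($queue);
```

Because the queue is captured by reference, anything pushed onto it before the generator finishes is still picked up by the loop.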
If you have ever used Promises in JavaScript, then promises in PHP should feel familiar. In basic terms, we create an asynchronous, promise-based request that, upon resolving to a response, attaches our _requestManagerData from the request to the response (this happens within the then callback). We can attach data to the response because we are still inside the request generator and can access the request. What the generator yields is a promise-based request, not just a request. The promise carries on down the code until it resolves or fails, ending up in the fulfilled or rejected callback respectively, and we will have our _requestManagerData attached there!
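Put together, the approach might look roughly like the sketch below. This is not my exact project code: the $queue array, its 'request' and 'data' keys, and the example URLs are made up for illustration, the data sits beside each request in an array instead of on the request object, and a MockHandler replaces the network so the sketch runs on its own. Guzzle is assumed to be installed through Composer.

```php
<?php
require 'vendor/autoload.php'; // assumes Guzzle was installed with Composer

use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Psr7\Response;
use Psr\Http\Message\ResponseInterface;

// Canned responses so the sketch runs without network access.
$mock = new MockHandler([new Response(200), new Response(200)]);
$client = new Client(['handler' => HandlerStack::create($mock)]);

// Hypothetical queue: each entry pairs a request with data we want back later.
$queue = [
    ['request' => new Request('GET', 'http://example.com/a'), 'data' => ['id' => 1]],
    ['request' => new Request('GET', 'http://example.com/b'), 'data' => ['id' => 2]],
];

$requests = function () use ($client, &$queue) {
    while (($item = array_pop($queue)) !== null) {
        // Yield a callable; the Pool calls it when it has a free slot
        // and expects a promise back.
        yield function () use ($client, $item) {
            return $client->sendAsync($item['request'])->then(
                function (ResponseInterface $response) use ($item) {
                    // Still inside the generator's scope, so $item is reachable.
                    // Note: this is a dynamic property, which PHP 8.2+ reports
                    // as deprecated.
                    $response->_requestManagerData = $item['data'];
                    return $response;
                }
            );
        };
    }
};

$ids = [];
$pool = new Pool($client, $requests(), [
    'concurrency' => 2,
    'fulfilled' => function ($response, $index) use (&$ids) {
        // The data attached in then() is available here.
        $ids[] = $response->_requestManagerData['id'];
    },
    'rejected' => function ($reason, $index) {
        // then() above has no rejection handler, so failures arrive here unchanged.
    },
]);
$pool->promise()->wait();
```

Once the pool has finished, $ids holds the id of every request that resolved, without any lookup by index.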
I dug through a lot of Google results, GitHub issues and StackOverflow answers. Only this StackOverflow question came close to my problem, and even then the answers were all rubbish apart from the one by Henrik.
Hopefully someone finds this useful!