
Implement a recovery mechanism for failed http requests #66

Open
gzhytar opened this issue Nov 12, 2017 · 12 comments · May be fixed by #100

Comments

gzhytar commented Nov 12, 2017

krakenex: 2.0.0rc2

What are you trying to achieve?

During peak times, a large number of HTTP requests fail without a specific response from the Kraken server. These are intermittent problems with CloudFlare/Kraken, and for some of the returned HTTP codes it should be safe to retry the operation. Since each request has a unique nonce which is checked on the server, this should prevent execution of several identical requests if they reach the server despite a failed status_code.

The maximum number of retries should be configurable.


EDITed by @veox:

Pull requests with the suggested change: #65 #99 #100
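
For illustration of the requested behaviour (not the implementation in the pull requests above), a minimal sketch: mounting a urllib3 Retry policy on the requests session that krakenex v2 exposes gives transport-level retries that re-send the identical request body, and therefore the same nonce. The retried methods, status codes, and retry count here are assumptions and would need the live-testing discussed below.

```python
import krakenex
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry   # named 'method_whitelist' on older urllib3

api = krakenex.API(key='...', secret='...')

# Retry POSTs on a few 5xx codes, with a configurable maximum.
# Which codes are actually "safe" to repeat is an assumption here.
retry = Retry(
    total=3,                               # configurable max number of retries
    backoff_factor=1,                      # exponential back-off between tries
    status_forcelist=[502, 504],           # codes assumed safe to repeat
    allowed_methods=frozenset(['POST']),   # Kraken API queries are POSTs
)
api.session.mount('https://', HTTPAdapter(max_retries=retry))

# Retried attempts reuse the already-prepared body, i.e. the same nonce,
# so Kraken should be able to reject a true duplicate.
balance = api.query_private('Balance')
```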

veox (Owner) commented Nov 12, 2017

I find the idea reasonable; however, look at #65 (comment) for some remarks.

Most importantly, in the face of network congestion, a "retry" approach must be live-tested to determine that the particular HTTP errors don't result in duplicate orders. (Hence "need info".)


OT: Have you encountered any other major issues with 2.0.0rc2? I'm not sure how to interpret the relative silence concerning it.

Are there no other bugs? (Hard to believe.)

Does no one use it, since pip doesn't install it by default? (Quite likely.)

Did everyone give up on Kraken, since it lags so much? (Passionately empathising.)

gzhytar (Author) commented Nov 12, 2017

Thanks @veox, your comments in the pull request make a lot of sense. I'll upload a new revision once I've tested it for a while.

I started using krakenex a few days ago through https://github.com/Endogen/Telegram-Kraken-Bot and was annoyed by constant networking problems, as the original version of the bot references krakenex 1.0 (I made a pull request today to use 2.0.0rc).

I wish I had looked at your repo and noticed that you already had 2.0.0rc, but I didn't. I ended up rewriting krakenex to use Requests on my own, used it for a couple of days, and only today noticed that you had already done all the hard work ;)

My observation is that 2.0.0rc is more stable, especially in the multithreaded use case (the Telegram bot has several active threads sending requests). Adding retries also increased the success rate of placing orders and querying the balance. I haven't noticed any issues so far.

And yes, lots of folks who are still trying to use Kraken through the web UI are giving up now.

veox (Owner) commented Nov 12, 2017

For ref: a query that resulted in a 502 HTTP error can definitely "go through" to the trade execution engine. I've observed this personally on many occasions (see PSA on /r/krakenex).

The assumption that a nonce will prevent duplicate orders can therefore be tested by attempting a retry on a 502.


EDIT: To clarify: this is risky, in general don't do it! I've already tested this (see #66 (comment)), no need to step on the rake. :)

veox (Owner) commented Nov 12, 2017

I'll try to get to a "proper" v2.0.0 release next week, with pip package and all. Hopefully, that'll get more people to use it. Also, live-testing PR #65.

gzhytar (Author) commented Nov 12, 2017 via email

philippose commented

Hello,
A Good Evening to you both.

Firstly, @veox, thank you very much for this very functional, but at the same time extremely simple to use interface to the Kraken API. I have been using this library for about a month now... Not trading yet, but still working on putting together a platform for automated trading (maybe someday, I will get that far!)

Just a comment on this particular proposed change...:
I was considering a similar idea because of the increased number of HTTP errors in the last few days.
However, like @veox, I do not think it is a very good idea to implement the retry logic so deep in the library. Most of the time when I received this error while adding orders, the order had gone through. In such situations, it would be catastrophic to have an automated retry, which would result in multiple orders.

I personally think that the implementation of the retry logic should be placed higher up, at the application layer. Something like try/except blocks could be used to process the error, and then probably send a request to check the current orders, to first verify the status of the order before attempting a retry.

Regards,
Philippose Rajan

veox (Owner) commented Nov 12, 2017

@philippose Thanks for your thoughts!

I do tend to agree in general that it's "better" to scream loudly of a fault and do nothing when money is involved; however, the try/except approach doesn't allow for simple nonce reuse (the API-level one), as proposed here.

Perhaps, if re-using the PreparedRequest, instead of constructing a new one in the except block...


Otherwise, one could add userref parameters to orders when placing them, and if that fails:

  • wait for an (arbitrarily) "long" time;
  • look up the order by its userref;
  • if it's not found, retry sending the order.

This is error-prone, though, since the time-to-wait seems to depend on the load on Kraken's trade execution engine.
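
For illustration, a rough sketch of that userref-based recovery; the 30-second wait, the exception type caught, and the assumption that the order can be found via OpenOrders/ClosedOrders filtered by userref are all illustrative, not tested here.

```python
import time
import krakenex
import requests

api = krakenex.API(key='...', secret='...')

def add_order_with_recovery(order, userref, wait=30):
    """Place an order tagged with a caller-chosen userref. On an HTTP
    error, wait, look the order up by userref, and retry only if it is
    nowhere to be found. The wait time is arbitrary, as noted above."""
    order = dict(order, userref=userref)
    try:
        return api.query_private('AddOrder', order)
    except requests.exceptions.HTTPError:
        time.sleep(wait)  # arbitrarily "long" wait
        open_ = api.query_private('OpenOrders', {'userref': userref})
        closed = api.query_private('ClosedOrders', {'userref': userref})
        if open_.get('result', {}).get('open') or \
           closed.get('result', {}).get('closed'):
            return None  # the "failed" order actually went through
        return api.query_private('AddOrder', order)  # not found: retry
```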

veox (Owner) commented Nov 12, 2017

Perhaps, if re-using the PreparedRequest, instead of constructing a new one in the except block...

That is, if the except caught a 502, a 504, or something else that's known to be "safe" to repeat, one should be able to access the precise request that was sent - as krakenex_API_object.response.request - and re-send that with krakenex_API_object.session.send(...).

Haven't tried it yet. Hopefully - next week.

EDIT: This will likely only work if there haven't been any queries in the interim with higher nonces, unless the nonce window is increased in Kraken's settings... I should really go rest, though. :D
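
Roughly, the re-send described above would look like the sketch below (untried, as said; the list of "safe" codes is an assumption, and the nonce-window caveat from the EDIT applies).

```python
import krakenex
import requests

api = krakenex.API(key='...', secret='...')

SAFE_TO_REPEAT = {502, 504}  # assumed safe; needs live-testing per code

try:
    balance = api.query_private('Balance')
except requests.exceptions.HTTPError:
    if api.response is None or api.response.status_code not in SAFE_TO_REPEAT:
        raise
    # Re-send the exact PreparedRequest: same payload, same signature,
    # and crucially the same nonce.
    resent = api.session.send(api.response.request)
    balance = resent.json()
```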

veox added this to the v2.x milestone Nov 12, 2017
veox (Owner) commented Nov 14, 2017

TL;DR: try/except with session.send() in the end application works, but adds a lot of boilerplate (EDIT: and a nasty repetition of in-lib code to catch subsequent exceptions). The retries-within-lib approach as in #65 could be OK, if made opt-in.


Note: v2.0.0 has been released; I've used it for the experiment below. It has no changes compared to v2.0.0rc2.

I've tried the approach from my previous comment. It can be seen in this gist.

In short - it works. (Tried a few times.) A query that seems to fail with a 502, but does in fact go through, will be rejected by Kraken when repeated, with a response of:

{'error': ['EAPI:Invalid nonce']}

In other words, the nonce is indeed sufficient for Kraken to recognise it as a duplicate. (At least for now...)


This approach, however, adds a lot of boilerplate to the end application. The "wrapper" must be repeated for every query type, either manually or with a decorator.
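
As an illustration of that boilerplate (this is a sketch, not the gist's code), the decorator variant might look like the following. Note that every query type still needs its own wrapped callable, and the re-sent request is not itself checked - the in-lib repetition mentioned in the TL;DR.

```python
import functools

import krakenex
import requests

api = krakenex.API(key='...', secret='...')
SAFE_TO_REPEAT = {502, 504}  # assumed safe to repeat

def resend_once_on_5xx(func):
    """Re-send the identical PreparedRequest (same nonce) once, if the
    wrapped query failed with a status code assumed safe to repeat."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except requests.exceptions.HTTPError:
            if api.response.status_code not in SAFE_TO_REPEAT:
                raise
            # The re-send itself is unchecked here - part of the problem.
            return api.session.send(api.response.request).json()
    return wrapper

@resend_once_on_5xx
def balance():
    return api.query_private('Balance')

@resend_once_on_5xx
def open_orders():
    return api.query_private('OpenOrders')
```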

This is unwieldy, and, more importantly, would likely be too complicated for a lot of people using krakenex. (From numerous issues reported this year, I'm guessing trading automation is the first stint with programming for many.)

@gzhytar's approach seems warranted - it doesn't add too much bloat, and is a solution for a common issue, especially in light of Kraken's current (abysmal) performance.

This should be, however, a deliberate opt-in setting, as mentioned in #65 (comment). Otherwise, it's only likely to further increase the load on Kraken's TEX.

pawapps commented Nov 17, 2017

Appreciate all the work on this. I am using v1 and have implemented similar logic in the API class.

One thing I do not quite understand is Kraken's "nonce window" implementation. From experimentation I know that resubmitting a nonce with a nonce window > 0 can result in a duplicate submission.

Here's my logic/flow for private queries:

  • Submit the query.
  • On an HTTP error: resubmit the query with the same nonce.
  • On an EService:Unavailable or EService:Busy error: resubmit the query with the same nonce.
  • On an EAPI:Invalid nonce error with the AddOrder method: spend time checking for the wayward order.
  • On an EAPI:Invalid nonce error with any other method: resubmit the query with a new nonce.
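
A rough translation of that flow into code, assuming a hypothetical query(method, data, nonce) callable that performs one private query with an externally supplied nonce (stock krakenex generates the nonce internally, so this is not a drop-in) and returns Kraken's decoded JSON:

```python
import time
import requests

def new_nonce():
    # Millisecond timestamp, a common choice for Kraken nonces.
    return int(time.time() * 1000)

def find_wayward_order(data):
    # Placeholder for "spend time checking for the wayward order".
    raise NotImplementedError

def private_query_with_recovery(query, method, data, max_attempts=3):
    nonce = new_nonce()
    for _ in range(max_attempts):
        try:
            result = query(method, data, nonce)
        except requests.exceptions.HTTPError:
            continue                                 # HTTP error: same nonce
        errors = result.get('error', [])
        if 'EService:Unavailable' in errors or 'EService:Busy' in errors:
            continue                                 # busy: same nonce
        if 'EAPI:Invalid nonce' in errors:
            if method == 'AddOrder':
                return find_wayward_order(data)      # may have gone through
            nonce = new_nonce()                      # other queries: new nonce
            continue
        return result
```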

The Kraken has been very unhappy lately...

PS (likely off-topic, but...): how does v2 improve the reliability of the network connections? I don't see how it would help with the onslaught of 5XX server errors.

veox (Owner) commented Nov 19, 2017

@pawapps The "nonce window" part would make a good separate issue. ;)


As to network connection reliability:

v1 didn't gracefully handle the case where the remote "dropped" the connection between queries. You'd have to check the connection state in advance, prior to sending the query, or reset the connection manually and repeat the failed query. The latter, when done inappropriately, could also result in two open connections, or at least a Connection object floating in memory (due to what looks like an OS-level broken socket left open)... All of this is too much networking for people who are usually concerned with business logic (often literally).

In v2, requests handles this behind the scenes. So you'll "only" get a requests exception as a direct result of performing a query; as compared to something that looks like a failed query, but didn't even make it to CloudFlare.

The latter is a rare case these days, which this issue is about (kind of...).

veox (Owner) commented Dec 10, 2017

It's been about a month of testing, and I haven't yet experienced a duplicate order when using the nonce-protection approach (see PR #65).
