Another Data Mining Blog: Buy Low Sell High - leaderboard and rule clarification

Sunday, 15 September 2019

Buy Low Sell High - leaderboard and rule clarification


You can now register your team and submit your trades files to see how you do on the leaderboard.

Just to clarify a few things:

1. You should only use the values of the current and previous predictions to make a decision.
2. If you use future prediction values you will probably get a good result, but that logic cannot be implemented in practice.
3. Any strategy found to be using future prediction values will be disqualified, so there is no point in trying.
4. Do not use anything else in your algorithm, such as pair name, price or absolute time.
5. If you want to use previous prediction values, use relative time differences to locate them.
6. The new file we provide will consistently have predictions generated at 5-minute intervals, but the absolute time values could be anything.
7. After the deadline, you will be asked to nominate the 3 long and 3 short strategies you want evaluated.
8. All teams beating the Benchmark solution on the private leaderboard with their nominated strategies will then qualify for stage 2.
9. We will then invite those teams to run their code over several new files. They must use only the nominated strategies, and the strategies must be locked in, with no further parameter tuning allowed.
10. The new files will have different pair names, and the start time for the field minutesSinceStart will not necessarily be the same as in the file already provided.
11. The winner will be the team that gives the best return on the new data, provided they still beat the Benchmark and we are confident no future prediction values have been used.
12. There will be one winner for Short and one winner for Long.
13. If we suspect future predictions are being used, we will say how we came to this conclusion and the team will have a right of reply to prove otherwise.
14. There will be a benchmark prize for the highest-ranked team on the private leaderboard as at 12 pm on Thu Oct 17th that wishes to reveal their method. To receive the prize, the team must write a blog post describing their method so that others can reproduce it. The method must not use future information. Revealing your method is not compulsory, so we will proceed down the ranking and award the prize to the first team that wishes to do so.
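
The rules about using only current and previous predictions, located by relative time differences, can be sketched roughly as follows. This is only an illustration, not the benchmark or any official method: the row layout `(minutesSinceStart, prediction)`, the helper names, and the threshold-crossing rule are all assumptions made for the example, and rows are assumed to be for a single pair, sorted by time.

```python
from typing import List, Optional, Tuple

# Assumed row layout for a single pair: (minutesSinceStart, prediction).
Row = Tuple[int, float]

STEP_MINUTES = 5  # the rules state predictions come at 5-minute intervals


def previous_prediction(rows: List[Row], i: int, lag_steps: int = 1) -> Optional[float]:
    """Prediction made lag_steps intervals before row i, or None if absent."""
    target = rows[i][0] - lag_steps * STEP_MINUTES  # relative offset only
    # Walk backwards from row i; rows after i are never inspected.
    for j in range(i - 1, -1, -1):
        minutes, prediction = rows[j]
        if minutes == target:
            return prediction
        if minutes < target:
            break  # gap in the data: no row at the wanted offset
    return None


def long_entry_signals(rows: List[Row], threshold: float = 0.0) -> List[bool]:
    """Hypothetical rule: flag a row when the prediction crosses above threshold."""
    signals = []
    for i, (_, current) in enumerate(rows):
        previous = previous_prediction(rows, i)
        signals.append(previous is not None and previous <= threshold < current)
    return signals


rows = [(1000, -0.2), (1005, 0.1), (1010, 0.3), (1020, -0.1), (1025, 0.4)]
# Same predictions with a different start time, as a new file might have.
shifted = [(minutes + 7777, prediction) for minutes, prediction in rows]

print(long_entry_signals(rows))                                 # [False, True, False, False, True]
print(long_entry_signals(rows) == long_entry_signals(shifted))  # True
```

Because only differences in minutesSinceStart are used, shifting the start time leaves every decision unchanged, and no pair name or price enters the logic.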

To clarify what we mean by 'future information': the data set does contain records that are 'in the future' relative to the times we have asked you to make decisions for. It is OK to use this data to come up with a set of coefficients for a model.
What is NOT OK is to use the raw prediction values at time 'x' as inputs to a model making a decision for a time prior to 'x'. Such a solution cannot be implemented.
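
One way to check your own strategy for this kind of leak is a truncation test: with the coefficients held fixed, a decision at time 'x' should not change when every record after 'x' is hidden. The sketch below is an assumption about how such a check could look, not a description of how the organisers detect future use. The row layout `(minutesSinceStart, prediction)` and both example rules are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[int, float]  # assumed layout: (minutesSinceStart, prediction)
Strategy = Callable[[List[Row]], List[bool]]


def first_lookahead_index(strategy: Strategy, rows: List[Row]) -> Optional[int]:
    """Smallest k at which hiding the rows after k changes a decision at or before k."""
    full = strategy(rows)
    for k in range(len(rows)):
        visible = rows[: k + 1]  # everything known at row k
        if strategy(visible) != full[: k + 1]:
            return k
    return None  # no dependence on later rows found for this input


def causal_rule(rows: List[Row]) -> List[bool]:
    """Signal when the current prediction exceeds the previous one."""
    return [i > 0 and rows[i][1] > rows[i - 1][1] for i in range(len(rows))]


def leaky_rule(rows: List[Row]) -> List[bool]:
    """Signal when the NEXT prediction exceeds the current one (uses the future)."""
    last = len(rows) - 1
    return [i < last and rows[i + 1][1] > rows[i][1] for i in range(len(rows))]


rows = [(0, 0.1), (5, 0.3), (10, 0.2), (15, 0.5)]
print(first_lookahead_index(causal_rule, rows))  # None
print(first_lookahead_index(leaky_rule, rows))   # 0
```

A result of None only shows that no leak appeared on the data tested; it is not a proof that the strategy is free of future information.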

If anything else needs clarifying, ask below and we will add it to the list.

Good Luck

16 comments:

Khurram Siddiqui, 15 September 2019 at 05:53
Hi Phil, as I understand it, we have to find the entry/exit times, directions and pair name. The pair name our model finds will be one of the 14 given pairs, e.g. '0x_bitcoin', 'bitcoin_usdollar', 'bitcoincash_usdollar' etc.

If my understanding is correct, what does point 10 mean by saying the new files will have different pairs? Do you mean brand-new pair names that are not among the 14 given pairs?
Sali Mali, 15 September 2019 at 05:59
Please read https://anotherdataminingblog.blogspot.com/2019/08/buy-low-sell-high.html
Sali Mali, 15 September 2019 at 14:35
You don't have to find the pair name. You have to provide entry and exit times for whatever pair name is given, based only on the prediction values for that pair name.
m^, 15 September 2019 at 17:02
At what point will teams be disqualified for using future values in predictions? The leaderboard now looks like it is already happening :)
Sam, 16 September 2019 at 03:17
Question: Point 9 says "We will then start with the leading team and invite them to run their code..." but then point 11 says "if the returns beat the Benchmark and we are confident no future information has been used then that team will be a winner".

Does this suggest that if the private rankings look like this:

Team A: 1200%
Team B: 1000%
Team C: 900%
Team D: 800%

Then in order, A,B,C,D they will run their code on round 2. If the returns on the new data look like:

A: 60%
B: 70%
C: 55%
D: 130%

A would still win, just because they were picked first? Does this not reward the team that overfits their solution on the private data during round 1 so that they get first pick in round 2? I would think that team D should be the overall winner for having the highest return in a more real scenario?
Sam, 16 September 2019 at 03:20
Further, given that there is a benchmark for both longs and shorts, is 'beating the benchmark' determined by adding the returns of both your strategies? What if you only have a short strategy? The same question applies to the overall leaderboard ranking: what if one team has a 600% long strategy and another team has a 400% long + 400% short strategy, with neither beating the benchmark?
Sali Mali, 16 September 2019 at 03:38
Longs and Shorts will be treated separately, so there will be a winner for Longs and a winner for Shorts.
Sali Mali, 16 September 2019 at 17:49
The 'clarification' above has now been edited. I hope it is now clear.
