Data consistency issue: changing history

General discussion about the Tickstory Lite software package.
tickster
Posts: 5
Joined: Fri Aug 30, 2013 7:40 am

Data consistency issue: changing history

Post by tickster »

Hi

I have been using tickstory for a few months now and like it a lot, which is why I was quite surprised to discover this.

Let me describe the issue using an example with a limited amount of data for easier checking/comparison: EURUSD for the late hours of 1-Jan-2012.
I first downloaded this data around August 2013 and then re-downloaded it 2 days ago (12-Feb-2014) on a separate instance of tickstory (a separate PC that did not have any EURUSD data so far).

The 1-Jan-2012 tick data has changed at some point between Aug-2013 and now, which in my view goes against one of the main advantages of tickstory: consistent and reproducible data.

This effect can be confirmed by drilling down the data hierarchy of the tickstory database. For example, the file 21h_ticks.bi5 was empty and now it contains 12k of data.
Not only was hour 21 of EURUSD 1-Jan-2012 added to the server sometime after September 2013, but I also noticed that the values of the 22nd hour (still just an edge-case example) changed by as much as 3-4 pips.
While I presume that the new data is of better quality, I am equally concerned that old backtests become worthless since the revised data produces different results.
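A check like the one described above (drilling down the hierarchy and spotting files that differ) can be sketched in a few lines of Python. This is only an illustration, not part of Tickstory: it assumes you have two copies of the data folder on disk (for example, the old PC's and the new PC's), and the function names `snapshot` and `compare_trees` are made up for this example. It compares raw file bytes only and does not decode the .bi5 contents.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(root: Path, pattern: str = "*.bi5") -> dict[str, str]:
    """Map each matching file (path relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob(pattern))
        if p.is_file()
    }


def compare_trees(old_root: Path, new_root: Path) -> dict[str, list[str]]:
    """Report files that exist on one side only or whose bytes differ."""
    old, new = snapshot(old_root), snapshot(new_root)
    shared = old.keys() & new.keys()
    return {
        "only_in_old": sorted(old.keys() - new.keys()),
        "only_in_new": sorted(new.keys() - old.keys()),
        "changed": sorted(k for k in shared if old[k] != new[k]),
    }
```

Calling `compare_trees(Path("old_copy"), Path("new_copy"))` (both folder names are placeholders) returns three sorted lists of relative paths. A file that was empty before and now holds data, like 21h_ticks.bi5 in the example, would show up under "changed".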

I understand that data corrections are sometimes unavoidable, but I would like to ask some specific questions and make a proposal:
a) Can someone from tickstory confirm that some historic data has been changed in recent months?
b) How often does it happen that data more than a month old gets changed on the server? If it does happen, is there any way to make the users aware of this?
c) While I am aware that a new download of existing data does NOT overwrite the local TS database, I would like to propose a new feature that allows a user to periodically compare their local data with the latest data on the server - ideally even by visualising differences. If there are differences, the user should be able to choose whether to download the new values and overwrite the local DB or to keep the old values (already in the local DB) for consistency reasons (even though the new ones on the server might be more accurate).
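To make the proposal more concrete, here is a rough sketch of what a tick-level comparison could report. It assumes the ticks have already been decoded into (time, bid, ask) records by some other means; the `Tick` type, the `diff_ticks` function and the pip size of 0.0001 (suitable for EURUSD) are assumptions for this illustration and not an existing Tickstory feature.

```python
from __future__ import annotations

from typing import NamedTuple


class Tick(NamedTuple):
    time_ms: int  # hypothetical: milliseconds since the start of the hour
    bid: float
    ask: float


def diff_ticks(old: list[Tick], new: list[Tick], pip: float = 0.0001) -> dict:
    """Compare two tick lists by timestamp.

    Simplification: if a list contains several ticks with the same
    timestamp, only the last one is considered.
    """
    old_by_time = {t.time_ms: t for t in old}
    new_by_time = {t.time_ms: t for t in new}
    changed = []
    for ts in sorted(old_by_time.keys() & new_by_time.keys()):
        o, n = old_by_time[ts], new_by_time[ts]
        # Largest bid/ask deviation, in pips, rounded to hide float noise.
        delta = round(max(abs(o.bid - n.bid), abs(o.ask - n.ask)) / pip, 2)
        if delta > 0:
            changed.append((ts, delta))
    return {
        "removed": sorted(old_by_time.keys() - new_by_time.keys()),
        "added": sorted(new_by_time.keys() - old_by_time.keys()),
        "changed": changed,  # list of (time_ms, difference in pips)
    }
```

With a report like this, the user could decide per file whether the differences are large enough to justify overwriting the local copy, or whether to keep the old values so that earlier backtests stay reproducible.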

I think that such a feature would make an already good tool even better.

Regards,
tickster
tickstory
Posts: 5167
Joined: Sun Jan 06, 2013 12:27 am

Re: Data consistency issue: changing history

Post by tickstory »

Hi Tickster,

Note that all the data is provided as-is from Dukascopy - we do not alter it in any way, nor do we have the scope to alter the original data files (*.bi5). Another user recently reported the issue of different data (see http://www.tickstory.com/forum/viewtopic.php?f=4&t=385), and we are unsure why it was revised; however, we do know there was an update to the Dukascopy data format circa Jan 2012.

In later versions of Tickstory, we plan to have a data comparison feature which may help you resolve this issue.

Regards.
tickster
Posts: 5
Joined: Fri Aug 30, 2013 7:40 am

Re: Data consistency issue: changing history

Post by tickster »

Hi tickstory

I did not think that tickstory would alter any data, but I was under the impression that the data is pre-downloaded from Dukas to tickstory servers, from where TS-Lite then downloads it.
Is that not the case? Does the TS-Lite tool route all requests directly to the Dukas API?

In any case, I doubt that the changed data has anything to do with the mentioned format change, but rather relates to a data-cleanup that Dukas must have performed recently (as I said, my downloads were still different in Sep 2013, close to 2 years after the format change and only changed between Sep. and now).
You would not by chance have any connections at Dukas to check and possibly confirm that they have indeed done a data cleanup recently? ;)

Well, anyhow, thanks very much for planning such a data comparison feature - I think it will indeed help to reduce unexplained backtesting discrepancies. If changes can't be avoided by getting things right initially, this feature can at least provide awareness and transparency. Normally I would not expect a need to delete and re-download data that I have already extracted, but in such a case it might very well make sense - provided I know about it.

Best regards,
tickstory
Posts: 5167
Joined: Sun Jan 06, 2013 12:27 am

Re: Data consistency issue: changing history

Post by tickstory »

Hi Tickster,

Tickstory intentionally doesn't try to re-check the data if it already exists, since this would obviously increase download times and bandwidth usage. Unfortunately, there is no option to simply perform a "checksum" check and retrieve the data only if it has changed.

Hope this helps.