Large data jumps

General discussion about the Tickstory Lite software package.
Post Reply
Threshold
Posts: 7
Joined: Sat Feb 22, 2014 9:28 pm

Large data jumps

Post by Threshold »

On a few pairs I've downloaded, there have definitely been some mistakes in the data.

Some bars are like 0.1234, 254.1234, 0.1234, 0.1234....
Which is in dire need of editing. I have edited all of them via the MT4 History Center, which took a very, very long time, especially since I have 2 different servers: 1 with full tick data, another with suppressed ticks. With the full-tick server I can easily just edit the M1 HST file and then create the other timeframes from it with the period converter script. With the suppressed tick data I need to edit M1, M5, M15, M30, H1... to keep their ticks suppressed.
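For anyone doing the same cleanup, here is a minimal sketch of how spike bars like the ones above could be flagged automatically instead of hunting for them by eye in the History Center (working on a plain list of close prices is a simplification for illustration):

```python
def flag_spikes(closes, ratio=10.0):
    """Flag bar indices whose close jumps by more than `ratio` times
    relative to the previous bar (e.g. 0.1234 -> 254.1234)."""
    flagged = []
    for i in range(1, len(closes)):
        lo, hi = sorted((closes[i - 1], closes[i]))
        if lo > 0 and hi / lo > ratio:
            flagged.append(i)
    return flagged

# The pattern from above: both the jump up and the jump back down get flagged.
print(flag_spikes([0.1234, 254.1234, 0.1234, 0.1234]))  # -> [1, 2]
```

A list of flagged indices at least tells you which bars to fix, even if the fixing itself still happens in MT4.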

Anyway, are HUGE spikes like this normal? There are many. This is exactly what low-quality data looks like. An MT4 modelling quality of 90/99% means nothing; spikes like this are the real measure of good or bad data.

Is this a glitch? Is everyone getting this from Duka servers?
1.png

tickstory
Posts: 4900
Joined: Sun Jan 06, 2013 12:27 am

Re: Large data jumps

Post by tickstory »

Hi Threshold,

Yes, the anomalies appear to be in the original Dukascopy data. Unfortunately we can't comment on why they are there, except to say that later versions of Tickstory will help you filter them out automatically.

Perhaps other users have further comments about this.

Regards.

Threshold
Posts: 7
Joined: Sat Feb 22, 2014 9:28 pm

Re: Large data jumps

Post by Threshold »

Am I able to remove these from .FXT or the original Duka BI5 files?

I can edit the .HST as you know, but then I would have to overwrite Tickstory's FXT and create new ones via MT4.
Have there been any solutions to getting these out?

Should I edit .HST, produce new FXT with MT4 backtest, then run from tickstory in the future after?
Currently it's impossible to test on this data.
Untitled.png

tickstory
Posts: 4900
Joined: Sun Jan 06, 2013 12:27 am

Re: Large data jumps

Post by tickstory »

Hi Threshold - unfortunately there is no feature to allow you to edit the original Dukascopy data.
The only options we can think of in the short term are to select a historical date range, generate the HST/FXT files for it, edit them, and then use them to test the selected date range. Alternatively, you can generate a CSV file, edit that, and use it to generate the HST/FXT files with third-party tools such as Birt's CSV2FXT.

Hope this helps.

Threshold
Posts: 7
Joined: Sat Feb 22, 2014 9:28 pm

Re: Large data jumps

Post by Threshold »

A side query: you said the latest versions of Tickstory have filters implemented to help eliminate some of these anomalies. Does this have options? Can you tell me a bit about how it works?
----------------------------------------

Do you know of any spreadsheet programs that can handle files of this size?

Excel stops importing at 1,048,576 rows and will not import any more after that, even onto additional sheets. I know there are some workarounds such as file splitting, but I think there is a better solution.

I think the best solution, if I cannot edit the .CSV as easily as a .HST, would be to just edit my .HST files in MT4, make my own .FXT files, and pretty much backtest the old-fashioned way, since Duka's default data is too unreliable for good tests. I must say Tickstory is brilliant and I do love it! The Duka data is a real shame, however.

tickstory
Posts: 4900
Joined: Sun Jan 06, 2013 12:27 am

Re: Large data jumps

Post by tickstory »

Hi Threshold - the data filtering capabilities are still under development. Essentially it will work by allowing users to filter out data such as duplicate ticks and also "anomalous" data. If you have some specific ways in which you wish to filter your data, please feel free to add them in the 'suggestions' area so we can consider how to implement them.
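One plausible reading of duplicate-tick filtering - a guess at the semantics, not Tickstory's actual implementation - is dropping consecutive ticks whose bid/ask pair is unchanged:

```python
def drop_duplicate_ticks(ticks):
    """Drop consecutive ticks with identical (bid, ask), keeping the first.
    Each tick is a (timestamp, bid, ask) tuple."""
    out = []
    last = None
    for t in ticks:
        key = (t[1], t[2])
        if key != last:
            out.append(t)
            last = key
    return out

ticks = [(1, 1.10, 1.11), (2, 1.10, 1.11), (3, 1.12, 1.13)]
print(drop_duplicate_ticks(ticks))  # the second tick is dropped
```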

With regards to Excel - yes, the row limitation is an issue, and we're not sure of the best way to manage large files, which is why we believe these sorts of features would be useful in Tickstory.

Regards.

Threshold
Posts: 7
Joined: Sat Feb 22, 2014 9:28 pm

Re: Large data jumps

Post by Threshold »

I'm back from the dead.

So was this implemented into the "duplicate ticks" tab?

If not-

1. Are these anomalies in the Dukascopy BI5 files, or are they an export/Tickstory issue?
We can solve this by seeing if multiple users get them - just 2 instances of the same anomaly should be enough to prove whether it's the BI5 files or the Tickstory export.

If we reach the conclusion that it's the BI5 files (though for some reason I don't think it is), then the best solution would be to delete those ticks (since only very few actually cause this) whenever they deviate by more than 10 standard deviations, or something huge like that, since these spikes tend to all be enormous. This will stop 99.9% of all real-world events from being filtered out. The only exception that comes to mind is the CHF currency-peg removal earlier this year, but it wouldn't be very difficult to add an exception rule for extremely rare events such as that.

The reason I believe it might not be the BI5 files is that SQ Tick Data Downloader doesn't have these spikes, though it's possible it has a built-in filter. (?)
My preference for Tickstory, though, is its FXT export and MT4 launcher, which SQ Tick Data Downloader doesn't support.
Last edited by Threshold on Sat Jun 20, 2015 3:18 am, edited 1 time in total.

tickstory
Posts: 4900
Joined: Sun Jan 06, 2013 12:27 am

Re: Large data jumps

Post by tickstory »

Hi Threshold,

Tickstory downloads Dukascopy data as-is - I suggest you go to the Dukascopy website and confirm for yourself (as we have) that the anomalous data points exist in their repository:

https://www.dukascopy.com/swiss/english ... istorical/

It is possible that Dukascopy have filtered their candle-stick data (e.g. 15 min) to remove these points, and perhaps that is why other software is not picking this up. On the other hand, a couple of users have reported that this Dukascopy candle-stick data has various problems of its own (such as incorrect time-shifts). We prefer to work with the raw tick data since that way we ultimately know what we are giving to our users. This also means that when we do eventually add the data filter option, we can be confident that you won't see differences like this between the tick data and the candle-stick data.

Regards.

Post Reply