Backtester Logic (Part 10)
Posted by Mark on October 4, 2022 at 07:11 | Last modified: June 22, 2022 08:35
Today I will continue by following the road map laid out at the end of Part 9.
I will begin with the results file. I have been using Jupyter Notebook for development and I can plot some graphs there, but I want to print detailed results to a .csv file. I am currently generating one such file that shows every day of every trade. Eventually, I want to generate a second file that shows overall trade statistics.
The results file gets opened at the beginning and closed at the end of the program with these lines:
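A minimal sketch of that open-at-the-start, close-at-the-end pattern might look like the following. The file name, the temp-directory location, and the variable names are assumptions for illustration, not the ones used in the actual program.

```python
import os
import tempfile

# Hypothetical file name and location; the post does not show the actual ones.
results_path = os.path.join(tempfile.gettempdir(), "backtest_results_sketch.csv")

results_file = open(results_path, "w")   # near the top of the program

# ... the backtesting loop would run here ...

results_file.close()                     # at the very end of the program
```

If the program stops with an error before reaching close(), the last rows written may never be flushed to disk.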
In an earlier version, I then printed to this file as part of the find_short and update_short branches with lines like this:
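As a rough illustration of printing to a file, the sketch below uses the file argument of print(). The variable names, values, and column layout are made up for the example and are not taken from the backtester.

```python
import os
import tempfile

# Hypothetical path and values, for illustration only.
path = os.path.join(tempfile.gettempdir(), "print_to_file_sketch.csv")
trade_num = 1
date_str = "2022-Oct-04"
price = 12.35

with open(path, "w") as results_file:
    # print() sends its output to the file instead of the screen.
    print("trade,date,price", file=results_file)
    print(f"{trade_num},{date_str},{price:.2f}", file=results_file)

# Read the file back to see what was written.
with open(path) as f:
    lines = f.read().splitlines()
```

Each print() call builds one line of text, which is why this approach amounts to string formatting rather than numerical work.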
While I find the syntax interesting, I realized these are pretty much just string operations that won’t help me to calculate higher-level trade statistics. Pandas (which is built on Numpy) will be much better for that, which is why I decided to compile the results into btstats, a pandas dataframe. Done that way, I can still get the results file in the end with this line:
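A sketch of the idea, assuming pandas and a tiny stand-in dataframe with invented columns: DataFrame.to_csv() writes the whole table in one call, and the same dataframe can feed summary statistics directly.

```python
import io

import pandas as pd

# Stand-in for btstats with hypothetical columns and values.
btstats = pd.DataFrame({"trade": [1, 1], "dte": [30, 29], "pnl": [0.0, 12.5]})

# Writing to an in-memory buffer here; passing a file path instead,
# e.g. btstats.to_csv("results.csv", index=False), would create the .csv file.
buffer = io.StringIO()
btstats.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Because the data are numeric columns rather than strings, statistics come for free.
mean_pnl = btstats["pnl"].mean()
```

Passing index=False leaves the row labels out of the file, which is usually what is wanted when the index is just a row counter.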
The dataframe is created near the beginning of the program:
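One way to create an empty dataframe with named columns is sketched below. The column names are placeholders chosen for illustration; the real btstats has its own set.

```python
import pandas as pd

# Hypothetical column names; the actual btstats columns are not shown here.
columns = ["trade_num", "date", "dte", "spot_price", "pnl"]

# An empty dataframe: the columns exist, but there are no rows yet.
btstats = pd.DataFrame(columns=columns)
```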
Most of the columns have corresponding variables (see key) and/or are derived through simple arithmetic on those variables.
Now, instead of printing to the results file in two out of the four branches, I add a list to the dataframe as a new row:
I searched the internet to find this solution here. One nice thing about Python is that I can find solutions online for most things I’m looking to accomplish. That doesn’t mean I understand how they work, though. For example, I understand df.loc[] to accept labels rather than integer positions (positions are the job of df.iloc[], which I have also learned cannot expand the size of a dataframe). len(btstats.index) is an integer, so I’m not sure why it works with df.loc[]. This is a big reason why I still consider myself a pupil in Python.
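The likely explanation is that row labels can themselves be integers. With the default index the labels are 0, 1, 2, and so on, so len(btstats.index) is always the first label not yet in use. df.loc[] treats that integer as a label, and assigning to a label that does not exist appends a new row. The sketch below uses made-up columns and values to contrast this with df.iloc[]:

```python
import pandas as pd

# Hypothetical columns and rows, for illustration only.
btstats = pd.DataFrame(columns=["trade_num", "date", "pnl"])

# With 0 rows, len(btstats.index) is 0, so this creates the row labeled 0.
btstats.loc[len(btstats.index)] = [1, "2022-Oct-04", 0.0]
# Now there is 1 row, so this creates the row labeled 1.
btstats.loc[len(btstats.index)] = [1, "2022-Oct-05", 12.5]

# .iloc works by position and refuses to write past the last existing row.
try:
    btstats.iloc[len(btstats.index)] = [1, "2022-Oct-06", 20.0]
    iloc_expanded = True
except IndexError:
    iloc_expanded = False
```

One caveat: this relies on the labels being exactly 0 through n-1. If rows have been dropped or the index has been changed, len(btstats.index) could match an existing label and overwrite that row instead of appending.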
L127 is an example of variable reset (discussed in Part 9). This is what I want to do for every variable once it has served its purpose to make sure I don’t accidentally use old data for current calculations (e.g. next historical date).
Let’s take a closer look at L121:
The data file stores each date as the number of days since Jan 1, 1970. Multiplying that by 86400 seconds/day yields the number of seconds since midnight [UTC], Jan 1, 1970, which is the format the datetime module expects for a UTC timestamp. I can now use the .strftime() method with ‘%b’ to get the abbreviated (three-letter) month name. Being much more readable than a nonsensical 5-digit integer, this is what I want to see in the results file.
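The arithmetic can be sketched as follows. The day count and the output format string are chosen for illustration, and datetime.fromtimestamp() with an explicit UTC time zone is an assumption about how to do the conversion, not necessarily the call used on L121.

```python
from datetime import datetime, timezone

# Hypothetical value: 19269 days after Jan 1, 1970.
days_since_epoch = 19269

# Days -> seconds since midnight UTC, Jan 1, 1970.
seconds_since_epoch = days_since_epoch * 86400

# Interpret the seconds as a UTC timestamp.
trade_date = datetime.fromtimestamp(seconds_since_epoch, tz=timezone.utc)

# %Y = four-digit year, %b = abbreviated month name, %d = zero-padded day.
date_label = trade_date.strftime("%Y-%b-%d")
```

With the default locale, the day count 19269 comes out as 2022-Oct-04.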
The light at the end of the tunnel is getting brighter!