Backtester Modules
Posted by Mark on July 11, 2022 at 07:18 | Last modified: June 22, 2022 08:34
In Part 1, I reviewed the history and background of this backtester’s dream. I now continue with exploration of the backtesting program in a didactic fashion because as a Python beginner, I am still trying to learn.
Modular programming involves cobbling together individual modules like building blocks to form a larger application. I can understand a few advantages to modularizing code:
- Simplicity is achieved by allowing each module to focus on a relatively small portion of the problem, which makes development less prone to error. Each module is more manageable than trying to attack the whole beast at once.
- Reusability is achieved by applying the module to various parts of the application without duplicating code.
- Independence reduces the probability that a change to any one module will affect other parts of the program. This makes modular programming more amenable to a programming team looking to collaborate on a larger application.
The backtester is currently 290 lines long, which is hardly large enough for a programming team. It is large enough to make use of the following modules, though: os, glob, numpy, datetime, pandas, and matplotlib.pyplot.
I learned about numpy, datetime, pandas, and matplotlib in my DataCamp courses. I trust many beginners are also familiar with them, so I won’t spend dedicated time discussing them.
The os and glob modules are involved in file management. The backtester makes use of option .csv files. According to the Python documentation, the os module provides “a portable way of using operating system dependent functionality.” I use it to direct the program to the specific folder where the data files are located.
The glob module “finds all the pathnames matching a specified pattern according to the rules used by the Unix shell, although results are returned in arbitrary order.” I don’t know what the Unix shell rules are. I also don’t want results returned in arbitrary order. Regardless of what the documentation says, the following simple code works:
I created a folder “test” on my desktop and placed three Excel .csv files inside: 2017.csv, 2018.csv, and 2019.csv. Note how the filenames print in chronological order. For backtesting purposes, that is exactly what I need to happen.
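As a minimal sketch of this kind of check (not necessarily the exact code used in the backtester), the snippet below lists the .csv files in a folder. The function name `list_csv_files` and the temporary folder standing in for “test” are assumptions made for illustration. Because glob makes no promise about ordering, the sketch wraps the result in `sorted()` so the chronological order does not depend on the operating system or file system.

```python
import glob
import os
import tempfile


def list_csv_files(folder):
    """Return the paths of all .csv files in folder, sorted by filename."""
    pattern = os.path.join(folder, "*.csv")
    # glob.glob may return matches in any order, so sort explicitly.
    return sorted(glob.glob(pattern))


# A temporary folder stands in for the "test" folder on the desktop.
with tempfile.TemporaryDirectory() as folder:
    # Create the empty files out of order on purpose.
    for name in ("2019.csv", "2017.csv", "2018.csv"):
        open(os.path.join(folder, name), "w").close()

    for path in list_csv_files(folder):
        print(os.path.basename(path))
```

Sorting by filename matches chronological order here only because the files are named by four-digit year; a different naming scheme might need a custom sort key.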
Functions are a form of modular programming. When defined early in the program, functions may be called by name at multiple points later. I did not create any user-defined functions because I was having trouble conceptualizing how. The backtester does perform repetitive tasks, but loops seem sufficient to do the work thus far.
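For illustration only, here is a rough sketch of how a repeated calculation could be wrapped in a user-defined function and then called from a loop. The function `percent_return` and the sample prices are hypothetical and do not come from the backtester itself.

```python
def percent_return(entry_price, exit_price):
    """Return the percentage change from entry_price to exit_price."""
    return (exit_price - entry_price) / entry_price * 100


# Hypothetical (entry, exit) price pairs, made up for this example.
trades = [(2.00, 2.50), (4.00, 3.00)]

# The function is defined once and called by name for every trade.
for entry, exit_ in trades:
    print(percent_return(entry, exit_))
```

The loop still does the repetition; the function simply gives the repeated step a name so it can be reused elsewhere in the program.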
If functions are faster, then it may be worth making the change to implement them. As I go through the program more closely and further organize my thoughts,* I will be in a better position to make this assessment.
*—Don’t get me started on variable scope right now.