DL FX Forecasting
Python project for forecasting changes in several Foreign Exchange (FX) pairs.
About the project 
Forecasting FX rates in an ultra-high-frequency setting using Deep Learning techniques. The main focus of the research is to predict the price increments over the next few seconds for a set of FX pairs.
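To make the prediction target concrete, here is a minimal sketch of how a forward increment could be computed from a tick series. The function name `forward_increments`, the use of seconds as the time unit, and the choice of "last tick observed within the horizon" as the future price are all assumptions made for illustration; the project may define its target differently.

```python
from bisect import bisect_right
from typing import List, Sequence


def forward_increments(
    times: Sequence[float], mids: Sequence[float], horizon: float
) -> List[float]:
    """Change in mid price from each tick to the last tick within `horizon` seconds.

    `times` must be sorted in ascending order (seconds). Hypothetical helper,
    not taken from the project source.
    """
    increments = []
    for i, t in enumerate(times):
        # Index of the last tick whose timestamp is <= t + horizon.
        j = bisect_right(times, t + horizon) - 1
        increments.append(mids[j] - mids[i])
    return increments
```

Ticks near the end of the series have little or no future data, so their increments shrink towards zero; a real pipeline would probably drop them.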
Prerequisites 
Environment
Execute the following command to start the container:
docker run -it --rm jpxkqx/dl-fx-forecasting:firsttry
If the data has already been processed on the host machine, the following command may be more appropriate:
docker run -it -v "/path/to/data:/app/data" --rm jpxkqx/dl-fx-forecasting:firsttry
The path /path/to/data refers to the host directory containing the data, laid out as shown in the project organization below. If all the processed data is available, every script can be executed.
Data 
Read, load, preprocess and save the data for the specified currency pair. To run this pipeline, the ZIP files must be on the host machine, and the path to the folder containing them must be set in an environment variable called PATH_RAW_DATA. The following command processes the data available on the host machine for the currency pair EUR/USD.
generate_datasets eur usd
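As a rough illustration of how a script might locate the raw archives, the sketch below reads PATH_RAW_DATA and lists the ZIP files in that folder. The helper name `find_raw_archives`, and the assumption that the archives sit directly in that folder with a `.zip` suffix, are illustrative and not taken from the project source.

```python
import os
from pathlib import Path
from typing import List


def find_raw_archives(env_var: str = "PATH_RAW_DATA") -> List[Path]:
    """Return the ZIP files found directly inside the folder named by `env_var`.

    Hypothetical helper: the project's own loader may work differently.
    """
    raw_dir = os.environ.get(env_var)
    if raw_dir is None:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    folder = Path(raw_dir)
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    # Sort the result so that archives are processed in a deterministic order.
    return sorted(p for p in folder.iterdir() if p.suffix.lower() == ".zip")
```

Failing early with a clear message when the variable is missing tends to be easier to debug than a later file-not-found error.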
In this case, the historical data has been extracted from True FX, whose first prices are shown below.
FX pair | Timestamp | Low | High |
---|---|---|---|
EUR/USD | 20200401 00:00:00.094 | 1.10256 | 1.10269 |
EUR/USD | 20200401 00:00:00.105 | 1.10257 | 1.1027 |
EUR/USD | 20200401 00:00:00.193 | 1.10258 | 1.1027 |
EUR/USD | 20200401 00:00:00.272 | 1.10256 | 1.1027 |
EUR/USD | 20200401 00:00:00.406 | 1.10258 | 1.1027 |
EUR/USD | 20200401 00:00:00.415 | 1.10256 | 1.1027 |
EUR/USD | 20200401 00:00:00.473 | 1.10257 | 1.1027 |
EUR/USD | 20200401 00:00:00.557 | 1.10255 | 1.10268 |
The generate_datasets command shown above processes this data: it computes the mid price and the spread, and filters out some erroneous data points. The processed data is stored in Apache Parquet format to achieve faster reading times.
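The preprocessing step can be sketched as follows, using the first rows of the EUR/USD sample above. The mid price is taken as the average of the two quoted prices and the spread as their difference. The filter rule (drop ticks with a non-positive spread) is an assumption, since the exact criteria for erroneous points are not stated here. Writing to Parquet is left out to keep the sketch dependency-free; in a pandas-based pipeline this would typically be done with `DataFrame.to_parquet`.

```python
from typing import Dict, Iterable, List, Tuple, Union

Tick = Tuple[str, float, float]  # (timestamp, low, high)

# First rows of the EUR/USD sample table.
SAMPLE_TICKS: List[Tick] = [
    ("20200401 00:00:00.094", 1.10256, 1.10269),
    ("20200401 00:00:00.105", 1.10257, 1.10270),
    ("20200401 00:00:00.193", 1.10258, 1.10270),
]


def preprocess(ticks: Iterable[Tick]) -> List[Dict[str, Union[str, float]]]:
    """Compute mid price and spread, dropping ticks with a non-positive spread.

    The filter rule is an assumption; the project may apply other checks.
    """
    rows = []
    for timestamp, low, high in ticks:
        spread = high - low
        if spread <= 0:
            continue  # assumed definition of an erroneous data point
        rows.append(
            {"timestamp": timestamp, "mid": (low + high) / 2, "spread": spread}
        )
    return rows
```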
Visualizations 
Then, plot the currency pair EUR/USD for the period from 25 May 2020 to 30 May 2020.
plot_currency_pair eur usd mid H T S --period 2020-05-25 2020-05-31
To get the following image,
The cumulative distribution function can also be plotted using the following command,
plot_cdf eur usd increment --period 2020-04-01 2020-06-01
which gives the image shown below,
To plot the distribution of the main daily statistics of the spread, the following command can be used.
plot_stats eur usd spread D --period 2020-04-01 2020-06-01
In addition, the correlation between the different currency pairs aggregated by any timeframe can also be plotted for any given period of time.
plot_pair_correlations increment --period 2020-04-01 2020-06-01 --agg_frame H
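For readers who want to see what such a plot summarises, the sketch below aggregates the increments of two pairs into hourly buckets and computes their Pearson correlation. Summing increments within each hour and restricting the comparison to hours present in both series are assumptions, and the function names are illustrative rather than the project's own.

```python
from collections import defaultdict
from datetime import datetime
from math import sqrt
from typing import Dict, Iterable, Sequence, Tuple

Observation = Tuple[datetime, float]  # (timestamp, increment)


def aggregate_by_hour(increments: Iterable[Observation]) -> Dict[datetime, float]:
    """Sum the increments falling into each calendar hour."""
    buckets: Dict[datetime, float] = defaultdict(float)
    for ts, inc in increments:
        buckets[ts.replace(minute=0, second=0, microsecond=0)] += inc
    return dict(buckets)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def hourly_correlation(
    pair_a: Iterable[Observation], pair_b: Iterable[Observation]
) -> float:
    """Correlate two pairs over the hours in which both have observations."""
    agg_a, agg_b = aggregate_by_hour(pair_a), aggregate_by_hour(pair_b)
    hours = sorted(agg_a.keys() & agg_b.keys())
    return pearson([agg_a[h] for h in hours], [agg_b[h] for h in hours])
```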
Lastly, the autocorrelation function of a currency pair can be plotted as follows,
plot_pair_acf increment eur usd --agg_frame 'H' --period 2020-04-01 2020-06-01
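As a point of reference, a sample autocorrelation function can be computed as in the sketch below. It uses the common biased estimator (each lag is normalised by the lag-0 sum of squares); whether the project uses this estimator or a library routine is not stated, so treat the function as illustrative.

```python
from typing import List, Sequence


def autocorrelation(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # Lag-0 sum of squares, used to normalise every lag.
    total = sum(c * c for c in centered)
    if total == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / total
        for lag in range(max_lag + 1)
    ]
```

A series that alternates in sign, for example, yields a strongly negative value at lag 1 and a positive one at lag 2.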
Modelling
Results 
Project Organization 
├── LICENSE
├── Makefile <- Makefile with commands like `make data` or `make train`
├── README.md <- The top-level README for developers using this project.
├── data
│   ├── external <- Data from third party sources.
│   ├── interim <- Intermediate data that has been transformed.
│   ├── processed <- The final, canonical data sets for modeling.
│   └── raw <- The original, immutable data dump.
│
├── docs <- A default MkDocs project.
│   └── index.md
│
├── models <- Trained and serialized models, model predictions, or model summaries
│   ├── configurations <- YAML files with model configurations
│   ├── features <- Contains model selection results, test results and fitted models, under the path
│   │       models/features/{ model }/{ fx_pair }/{ aux_pair}/{ variables concat with _}
│   │       In particular, the models used EWMAs of a fixed number of past observations.
│   └── raw <- Contains model selection results, test results and fitted models, under the path
│           models/features/{ model }/{ fx_pair }/{ aux_pair}/{ variables concat with _}
│           In particular, the models used all the past observations.
│
├── notebooks <- Jupyter notebooks containing the results of the training process for different models
│   ├── train...html <- Output code to include in VC.
│   └── train...ipynb <- Python notebooks considered. Not included in VC.
│
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│   ├── figures <- Generated graphics and figures to be used in reporting, README, and docs
│   ├── images <- Generated graphics and figures of EDA. Not included in VC.
│   └── models <- Generated graphics and figures of model results. Not included in VC
│
├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
│       generated with `pip freeze > requirements.txt`
│
├── setup.py <- makes project pip installable (pip install -e .) so src can be imported
├── src <- Source code for use in this project.
│   ├── __init__.py <- Makes src a Python module
│   │
│   ├── data <- Scripts to download or generate data
│   │   ├── __init__.py
│   │   ├── data_extract.py
│   │   ├── data_loader.py
│   │   ├── data_preprocess.py
│   │   ├── utils.py
│   │   └── constants.py
│   │
│   ├── features <- Scripts to turn raw data into features for modeling
│   │   ├── __init__.py
│   │   ├── get_blocks.py
│   │   └── build_features.py
│   │
│   ├── models <- Scripts to train models and then use trained models to make
│   │   │       predictions
│   │   ├── __init__.py
│   │   ├── neural_network.py
│   │   ├── model_selection.py
│   │   ├── model_utils.py
│   │   └── train_model.py
│   │
│   ├── scripts <- Scripts to create CLI entrypoints
│   │   ├── __init__.py
│   │   ├── click_utils.py
│   │   ├── generate_datasets.py
│   │   ├── plot_currency_pair.py
│   │   └── plot_pair_correlations.py
│   │
│   ├── visualization <- Scripts to create exploratory and results oriented visualizations
│   │   ├── __init__.py
│   │   ├── line_plot.py
│   │   ├── plot_correlations.py
│   │   └── plot_results.py
│   └── currency_pair.py
│
├── tests
│   ├── data <- Data needed to test the functionalities.
│   ├── mocks.py
│   ├── test_cli_scripts.py
│   ├── test_dataset_generation.py
│   └── test_visualization.py
│
└── tox.ini <- tox file with settings for running tox; see tox.readthedocs.io
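The `models/features` entry above mentions EWMAs of a fixed number of past observations. A minimal sketch of such a feature is given below, assuming weights proportional to `(1 - alpha) ** k` for the k-th most recent value, normalised over the truncated window. The function name, the `alpha` parameter and the normalisation are assumptions and are not taken from `build_features.py`.

```python
from typing import Sequence


def ewma_of_window(values: Sequence[float], window: int, alpha: float) -> float:
    """Exponentially weighted mean of the last `window` values.

    The k-th most recent value gets a weight proportional to (1 - alpha) ** k,
    and the weights are normalised to sum to one over the window.
    Hypothetical helper for illustration only.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    if window < 1 or not values:
        raise ValueError("need a positive window and at least one value")
    recent = values[-window:]
    weights = [(1 - alpha) ** k for k in range(len(recent))]
    # reversed(recent) puts the newest observation first, matching weights[0].
    weighted_sum = sum(w * x for w, x in zip(weights, reversed(recent)))
    return weighted_sum / sum(weights)
```

Several values of `alpha` applied to the same window would give a small set of features that react to recent price changes at different speeds.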