a typical Freqtrade strategy is created) as well as a target value (typically some price change into the future). FreqAI trains a model to predict the target value based on the input of custom indicators.
Users employ FreqAI to backtest a strategy (emulate reality with retraining a model as new data is introduced) and run the model live to generate buy and sell signals. In dry/live, FreqAI works in a background thread to keep all models as updated as possible with consistent retraining.
The table below will list all configuration parameters available for `FreqAI`.
Mandatory parameters are marked as **Required**, which means that they are required to be set in one of the possible ways.
| Parameter | Description |
|------------|-------------|
| `freqai` | **Required.** The dictionary containing all the parameters for controlling FreqAI. <br>**Datatype:** dictionary.
| `identifier` | **Required.** A unique name for the current model. This can be reused to reload pretrained models/data. <br>**Datatype:** string.
| `train_period_days` | **Required.** Number of days to use for the training data (width of the sliding window). <br>**Datatype:** positive integer.
| `backtest_period_days` | **Required.** Number of days to inference into the trained model before sliding the window and retraining. This can be fractional days, but beware that the user provided `timerange` will be divided by this number to yield the number of trainings necessary to complete the backtest. <br>**Datatype:** Float.
| `live_retrain_hours` | Frequency of retraining during dry/live runs. Default set to 0, which means it will retrain as often as possible. **Datatype:** Float > 0.
| `follow_mode` | If true, this instance of FreqAI will look for models associated with `identifier` and load those for inferencing. A `follower` will **not** train new models. False by default. <br>**Datatype:** boolean.
| `live_trained_timestamp` | Useful if user wants to start from models trained during a *backtest*. The timestamp can be located in the `user_data/models` backtesting folder. This is not a commonly used parameter, leave undefined for most applications. <br>**Datatype:** positive integer.
| `fit_live_predictions_candles` | Computes target (label) statistics from prediction data, instead of from the training data set. Number of candles is the number of historical candles it uses to generate the statistics. <br>**Datatype:** positive integer.
| `purge_old_models` | Tell FreqAI to delete obsolete models. Otherwise, all historic models will remain on disk. Defaults to False. <br>**Datatype:** boolean.
| | **Feature Parameters**
| `expiration_hours` | Ask FreqAI to avoid making predictions if a model is more than `expiration_hours` old. Defaults to 0 which means models never expire. <br>**Datatype:** positive integer.
| `feature_parameters` | A dictionary containing the parameters used to engineer the feature set. Details and examples shown [here](#building-the-feature-set) <br>**Datatype:** dictionary.
| `include_corr_pairlist` | A list of correlated coins that FreqAI will add as additional features to all `pair_whitelist` coins. All indicators set in `populate_any_indicators` will be created for each coin in this list, and that set of features is added to the base asset feature set. <br>**Datatype:** list of assets (strings).
| `include_timeframes` | A list of timeframes that all indicators in `populate_any_indicators` will be created for and added as features to the base asset feature set. <br>**Datatype:** list of timeframes (strings).
| `label_period_candles` | Number of candles into the future that the labels are created for. This is used in `populate_any_indicators`, refer to `templates/FreqaiExampleStrategy.py` for detailed usage. The user can create custom labels, making use of this parameter not. <br>**Datatype:** positive integer.
| `include_shifted_candles` | Parameter used to add a sense of temporal recency to flattened regression type input data. `include_shifted_candles` takes all features, duplicates and shifts them by the number indicated by user. <br>**Datatype:** positive integer.
| `DI_threshold` | Activates the Dissimilarity Index for outlier detection when above 0, explained more [here](#removing-outliers-with-the-dissimilarity-index). <br>**Datatype:** positive float (typically below 1).
| `weight_factor` | Used to set weights for training data points according to their recency, see details and a figure of how it works [here](##controlling-the-model-learning-process). <br>**Datatype:** positive float (typically below 1).
| `principal_component_analysis` | Ask FreqAI to automatically reduce the dimensionality of the data set using PCA. <br>**Datatype:** boolean.
| `use_SVM_to_remove_outliers` | Ask FreqAI to train a support vector machine to detect and remove outliers from the training data set as well as from incoming data points. <br>**Datatype:** boolean.
| `svm_nu` | The `nu` parameter for the support vector machine. *Very* broadly, this is the percentage of data points that should be considered outliers. <br>**Datatype:** float between 0 and 1.
| `stratify_training_data` | This value is used to indicate the stratification of the data. e.g. 2 would set every 2nd data point into a separate dataset to be pulled from during training/testing. <br>**Datatype:** positive integer.
| `indicator_max_period_candles` | The maximum *period* used in `populate_any_indicators()` for indicator creation. FreqAI uses this information in combination with the maximum timeframe to calculate how many data points it should download so that the first data point does not have a NaN <br>**Datatype:** positive integer.
| `indicator_periods_candles` | A list of integers used to duplicate all indicators according to a set of periods and add them to the feature set. <br>**Datatype:** list of positive integers.
| | **Data split parameters**
| `data_split_parameters` | include any additional parameters available from Scikit-learn `test_train_split()`, which are shown [here](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html) <br>**Datatype:** dictionary.
| `test_size` | Fraction of data that should be used for testing instead of training. <br>**Datatype:** positive float below 1.
| `shuffle` | Shuffle the training data points during training. Typically for time-series forecasting, this is set to False. **Datatype:** boolean.
| | **Model training parameters**
| `model_training_parameters` | A flexible dictionary that includes all parameters available by the user selected library. For example, if the user uses `LightGBMPredictionModel`, then this dictionary can contain any parameter available by the `LightGBMRegressor` [here](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMRegressor.html). If the user selects a different model, then this dictionary can contain any parameter from that different model. <br>**Datatype:** dictionary.
| `n_estimators` | A common parameter among regressors which sets the number of boosted trees to fit <br>**Datatype:** integer.
| `learning_rate` | A common parameter among regressors which sets the boosting learning rate. <br>**Datatype:** float.
| `n_jobs`, `thread_count`, `task_type` | Different libraries use different parameter names to control the number of threads used for parallel processing or whether or not it is a `task_type` of `gpu` or `cpu`. <br>**Datatype:** float.
Here are the values you can expect to receive inside the dataframe returned by FreqAI:
| Parameter | Description |
|------------|-------------|
| `&-s*` | user defined labels in the user made strategy. Anything prepended with `&` is treated as a training target inside FreqAI. These same dataframe columns names are fed back to the user as the predictions. For example, the user wishes to predict the price change in the next 40 candles (similar to `templates/FreqaiExampleStrategy.py`) by setting `&-s_close`. FreqAI makes the predictions and gives them back to the user under the same key (`&-s_close`) to be used in `populate_entry/exit_trend()`. <br>**Datatype:** depends on the output of the model.
| `&-s*_std/mean` | The standard deviation and mean values of the user defined labels during training (or live tracking with `fit_live_predictions_candles`). Commonly used to understand rarity of prediction (use the z-score as shown in `templates/FreqaiExampleStrategy.py` to evaluate how often a particular prediction was observed during training (or historically with `fit_live_predictions_candles`)<br>**Datatype:** floats.
| `do_predict` | An indication of an outlier, this return value is integer between -1 and 2 which lets the user understand if the prediction is trustworthy or not. `do_predict==1` means the prediction is trustworthy. If the [Dissimilartiy Index](#removing-outliers-with-the-dissimilarity-index) is above the user defined treshold, it will subtract 1 from `do_predict`. If `use_SVM_to_remove_outliers()` is active, then the Support Vector Machine (SVM) may also detect outliers in training and prediction data. In this case, the SVM will also subtract one from `do_predict`. A particular case is when `do_predict == 2`, it means that the model has expired due to `expired_hours`. <br>**Datatype:** integer between -1 and 2.
| `DI_values` | The raw Dissimilarity Index values to give the user a sense of confidence in the prediction. Lower DI means the data point is closer to the trained parameter space. <br>**Datatype:** float.
legnth of `include_timeframes`* no. features in `populate_any_indicators()` * legnth of `include_corr_pairlist`* no. `include_shifted_candles` * length of `indicator_periods_candles`_
FreqAI has multiple example prediction model based libraries such as `Catboost` regression (`freqai/prediction_models/CatboostPredictionModel.py`) and `LightGBM` regression. However, users can customize and create
their own prediction models using the `IFreqaiModel` class. Users are encouraged to inherit `train()` and `predict()` to let them customize various aspects of their training procedures.
Model training parameters are unqiue to the library employed by the user. FreqAI allows users to set any parameter for any library using the `model_training_parameters` dictionary in the user configuration file. The example configuration files show some of the example parameters associated with `Catboost` and `LightGBM`, but users can add any parameters available in those libraries.
Data split parameters are defined in `data_split_parameters` which can be any parameters associated with `Sklearn`'s `train_test_split()` function. Meanwhile, FreqAI includes some additional parameters such `weight_factor` which allows the user to weight more recent data more strongly
`train_test_split()` has a parameters called `shuffle`, which users also have access to in FreqAI, that allows them to keep the data unshuffled. This is particularly useful to avoid biasing training with temporally autocorrelated data.
Finally, `label_period_candles` defines the offset used for the `labels`. In the present example,