
I am trying to implement an evolving neural network on time-series Forex data. The model receives as inputs 3 different exchange rates on a particular timeframe, where the base currency is the same in all 3 inputs (e.g. USD/CHF, USD/JPY and USD/ZAR all have USD as the base currency). The network then has to predict the direction of the base currency over the next days or weeks (depending on the timeframe of the data) based on these inputs. It has the following options for prediction to choose from:

  • Sell
  • Strong Sell
  • Buy
  • Strong Buy

I have tried implementing this in the NEAT-Python framework, starting from the simplest example at https://neat-python.readthedocs.io/en/latest/xor_example.html with a very similar config file, representing "Sell" as "0", "Strong Sell" as "1", "Buy" as "2" and "Strong Buy" as "3" in the code. However, I haven't been successful: the maximum fitness I could reach on the training set of 84 observations was about 75%, and that was after I reduced the problem to predicting only "Sell" and "Buy", represented by "0" and "1" respectively. I have tried both recurrent and feedforward network types without much improvement, and I have also tried modifying the mutation rates. Kindly assist me with increasing the model's fitness as well as its prediction accuracy on the test set. See the code below, and thank you.
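An alternative encoding, shown here only as a sketch (it is not what the code below does, and it assumes num_outputs is raised to 4 in the config), is one output node per class, taking the argmax as the predicted class:

# Sketch only: one output node per class instead of a single 0-3 output.
# Assumes num_outputs = 4 in the config; not part of the code below.
def predict_class(net, inputs):
    outputs = net.activate(inputs)      # four activations, one per class
    return outputs.index(max(outputs))  # 0 = Sell, 1 = Strong Sell, 2 = Buy, 3 = Strong Buy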

The data used below runs over 3 years.

from __future__ import print_function
import neat
import visualize
import random

# 3 inputs each, with expected outputs (training set).
xor_inputs = [(1.1241,1.2976,1.02923013585838), 
(1.1201,1.2434,1.0231225700839),(1.0971,1.2187,1.00999899000101), 
(1.0884,1.2235,1.00644122383253),(1.0987,1.2185,1.01142914938809), 
(1.1138,1.2519,1.03241792277514),(1.0856,1.2602,1.01163378856854),             
(1.0588,1.2346,0.99009900990099),(1.0587,1.2476,0.986679822397632), 
(1.0673,1.2729,0.988630746416214),(1.0562,1.2576,0.982318271119843), 
(1.0451,1.2497,0.974563882662508),(1.0456,1.2292,0.973994350832765), 
(1.0516,1.2338,0.981932443047918),(1.0533,1.2287,0.982028871648826), 
(1.0644,1.2179,0.990982063224656),(1.0703,1.2375,0.998203234178479), 
(1.0699,1.2555,1.000100010001),(1.0784,1.2488,1.00745516824501), 
(1.0641,1.2491,0.997506234413965),(1.0614,1.2414,0.997207818109294), 
(1.0563,1.2468,0.992457324335054),(1.0623,1.2295,0.992457324335053), 
(1.0672,1.2171,0.989119683481701),(1.0738,1.2394,1.00130169219986), 
(1.0799,1.247,1.00867460157353),(1.0652,1.255,0.996909580301067), 
(1.059,1.2373,0.991080277502478),(1.0612,1.2524,0.994629003381739), 
(1.0728,1.2813,1.00411687920474),(1.0897,1.2951,1.0048231511254), 
(1.0998,1.2981,1.01317122593718),(1.0931,1.2891,0.99930048965724), 
(1.1207,1.3035,1.02838338132456),(1.1183,1.2804,1.02616726526424), 
(1.1282,1.2887,1.0391769718383),(1.1196,1.2745,1.03135313531353), 
(1.1198,1.2776,1.02690490860546),(1.1194,1.2718,1.03145951521403), 
(1.1426,1.3027,1.04351455702807),(1.1401,1.2892,1.03734439834025), 
(1.1469,1.3096,1.03788271925272),(1.1663,1.2995,1.05741778576716), 
(1.1752,1.3135,1.03220478943022),(1.1772,1.3036,1.02785486689279), 
(1.1821,1.3012,1.03960910697578),(1.1762,1.2876,1.03626943005181), 
(1.1926,1.2887,1.04515050167224),(1.1859,1.2951,1.03659168653467), 
(1.2035,1.3198,1.05876124933827),(1.1943,1.3593,1.0417751849151), 
(1.1954,1.3493,1.03177878662815),(1.1814,1.3397,1.03284445362528), 
(1.1734,1.3066,1.02249488752556),(1.1823,1.3287,1.02616726526424), 
(1.1785,1.319,1.01595042161943),(1.161,1.3129,1.00230530219505), 
(1.1609,1.3076,0.99930048965724),(1.1665,1.3191,1.00421771440048), 
(1.1795,1.3214,1.01153145862836),(1.193,1.3337,1.02061645233721), 
(1.1891,1.3473,1.02396067990989),(1.1764,1.3392,1.00704934541793), 
(1.1754,1.3321,1.00959111559818),(1.1859,1.3362,1.01081572829273), 
(1.1998,1.3515,1.02627257799672),(1.203,1.3571,1.02543068088597), 
(1.2187,1.3729,1.03284445362528),(1.2222,1.3852,1.03842159916926), 
(1.2421,1.417,1.07100781835707),(1.2462,1.4123,1.07376785139053), 
(1.2235,1.3838,1.06371662589086),(1.2406,1.4042,1.07781849536538), 
(1.2293,1.3967,1.06746370623399),(1.2317,1.3805,1.06587081645705), 
(1.2307,1.3849,1.05130361648444),(1.2289,1.3941,1.0501995379122), 
(1.2353,1.4134,1.05574324324324),(1.2323,1.4018,1.0481081647626), 
(1.2283,1.4092,1.04242676951944),(1.2331,1.4241,1.03906899418121), 
(1.2288,1.4004,1.02606197414324),(1.213,1.378,1.01224820325944), 
(1.196,1.3533,1)]
xor_outputs = [(3,),(3,),(2,),(0,),(1,),(2,),(3,),(2,),(1,),(3,),(3,), 
(2,),(1,),(0,),(0,),(1,),(0,),(0,),(2,),(3,),(2,),(2,),(2,),(1,),(1,), 
(2,),(3,),(1,),(1,),(1,),(1,),(3,),(1,),(3,),(1,),(3,),(0,),(2,),(1,), 
(3,),(1,),(0,),(0,),(2,),(0,),(3,),(1,),(2,),(1,),(2,),(2,),(2,),(3,), 
(1,),(3,),(3,),(3,),(1,),(1,),(1,),(0,),(3,),(2,),(1,),(1,),(0,),(1,), 
(1,),(1,),(0,),(3,),(1,),(3,),(2,),(2,),(2,),(1,),(3,),(2,),(0,),(3,), 
(3,),(3,),(2,)
]
# 3 inputs each, with expected outputs (test set).
xor_inputs2 = [(1.1944,1.3543,0.99940035978413), 
(1.1778,1.3474,1.00220485067148),(1.1652,1.3309,1.01030511214387), 
(1.1661,1.3348,1.0120433154539),(1.1768,1.3412,1.01502233049127), 
(1.1609,1.3285,1.00240577385726),(1.1657,1.327,1.01214574898785), 
(1.1685,1.3209,1.00938730190774),(1.1747,1.3285,1.01020305081321), 
(1.1685,1.3235,0.998302885095338),(1.172,1.3134,1.00745516824501), 
(1.1658,1.3104,1.00553041729512),(1.1567,1.3008,1.00573267625465), 
(1.1411,1.2769,1.00462125778582),(1.1439,1.2752,1.00411687920474), 
(1.1623,1.2845,1.01698362656361),(1.1601,1.2963,1.03220478943022), 
(1.1553,1.2924,1.03167234086454),(1.163,1.3068,1.03359173126615), 
(1.175,1.3077,1.04307916970898),(1.1782,1.3159,1.03626943005181)
]
xor_outputs2 = [(2,),(2,),(1,),(1,),(3,),(0,),(2,),(1,),(3,),(0,), 
(3,),(2,),(3,),(2,),(1,),(0,),(3,),(1,),(1,),(0,),(3,)
]


def eval_genomes(genomes, config):  # fitness function used to train the model on the training set
    for genome_id, genome in genomes:
        genome.fitness = 84.0
        net = neat.nn.RecurrentNetwork.create(genome, config)
        for xi, xo in zip(xor_inputs, xor_outputs):
            output = net.activate(xi)
            genome.fitness -= (output[0] - xo[0]) ** 2  # squared distance from the correct output, summed over all 84 input patterns


# Load configuration.
config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     'config-feedforward-research_nn')

# Create the population, which is the top-level object for a NEAT run.
p = neat.Population(config)

# Add a stdout reporter to show progress in the terminal.
p.add_reporter(neat.StdOutReporter(True))
stats = neat.StatisticsReporter()
p.add_reporter(stats)
#p.add_reporter(neat.Checkpointer(100))

# Run until a solution is found.
winner = p.run(eval_genomes, 30000)  # run for up to 30000 generations

# Display the winning genome.
print('\nBest genome:\n{!s}'.format(winner))

# Make and show predictions on unseen data (the test set) using the winning genome's network.
print('\nOutput:')
winner_net = neat.nn.RecurrentNetwork.create(winner, config)
for xi, xo in zip(xor_inputs2, xor_outputs2):
    output = winner_net.activate(xi)
    print("  input {!r}, expected output {!r}, got {!r}".format(xi, xo, output))

node_names = {-1: 'Input 1', -2: 'Input 2', -3: 'Input 3', 0: 'Prediction'}
visualize.draw_net(config, winner, True, node_names=node_names,
               filename="winner-feedforward-research_nn.gv")

The config file:

[NEAT]
# changed
fitness_criterion     = max
fitness_threshold     = 75 
pop_size              = 500 
reset_on_extinction   = False

[DefaultGenome]
# node activation options 
# changed
activation_default      = sigmoid 
activation_mutate_rate  = 0.0
activation_options      = sigmoid square

# node aggregation options
aggregation_default     = sum
aggregation_mutate_rate = 0.0
aggregation_options     = sum

# node bias options
bias_init_mean          = 0.0
bias_init_stdev         = 1.0
bias_max_value          = 30.0
bias_min_value          = -30.0
bias_mutate_power       = 0.5
bias_mutate_rate        = 0.7
bias_replace_rate       = 0.1

# genome compatibility options
compatibility_disjoint_coefficient = 1.0
compatibility_weight_coefficient   = 0.5

# connection add/remove rates
conn_add_prob           = 0.5
conn_delete_prob        = 0.5

# connection enable options
# changed
enabled_default         = True
enabled_mutate_rate     = 0.04 

feed_forward            = false
initial_connection      = full_direct

# node add/remove rates
# changed
node_add_prob           = 0.5 
node_delete_prob        = 0.5 

# network parameters
# changed
num_hidden              = 0
num_inputs              = 3 
num_outputs             = 1

# node response options
response_init_mean      = 1.0
response_init_stdev     = 0.0
response_max_value      = 30.0
response_min_value      = -30.0
response_mutate_power   = 0.0
response_mutate_rate    = 0.0
response_replace_rate   = 0.0

# connection weight options
# changed
weight_init_mean        = 0.0
weight_init_stdev       = 1.0
weight_max_value        = 30
weight_min_value        = -30
weight_mutate_power     = 0.5
weight_mutate_rate      = 0.9
weight_replace_rate     = 0.1

[DefaultSpeciesSet]
compatibility_threshold = 3.0

[DefaultStagnation]
species_fitness_func = max
max_stagnation       = 20
species_elitism      = 2

[DefaultReproduction]
elitism            = 2
survival_threshold = 0.2
    I've been working on a very similar approach using NEAT only for the stock market instead of FX. There really doesn't seem to be that much documentation out there for this particular use case. I'm not very familiar with these algorithms yet, but I'm curious if you have made any progress on the issue you mentioned? – jblew Dec 01 '18 at 05:30
  • Yes, I have actually made progress. Transforming the data with a min-max transformation was useful because it brought all inputs into the same range, [0, 1], improving the model's forecasting ability. I also had to adjust the fitness function in eval_genomes because it was slightly incorrect. – Kusi Dec 02 '18 at 04:24
  • What was wrong with the fitness function? Also, did you let it run for 30,000 generations? That would definitely take a long time! – jblew Dec 02 '18 at 05:09
  • I took the number of generations back down to 1000; I noticed that beyond that, the proposed ANN structures tend to be unnecessarily complicated. The original fitness function does not take into account that I am working with expected outputs represented as category numbers, 1 and 0. This means the estimate made by the ANN needs to be rounded before it is used; otherwise you get false indications of fitness for the proposed ANN models. – Kusi Dec 04 '18 at 14:18
  • I'm currently working on a very similar project. Regarding the fitness function: mine is essentially the same. When running, the fitness flattens out fast (50-100 generations) and the results are still not satisfying. Do 1000 generations really make it better? I've already scaled all inputs between 0 and 1, like you did. Which values in the config do you change first? (I couldn't really find good resources for NEAT regarding this...) – Marco Dec 29 '18 at 15:28
  • In my opinion 800-1000 generations can be useful, but more than that is not. You can change the "feed_forward" option to True, which disables recurrent connections and normally improves performance. Then you can change the "initial_connection" option to "full_direct" or "direct"; normally full_direct works better. Also, add mutation to the activation function by changing activation_mutate_rate to 0.01; this could also help (see the config sketch after this list). – Kusi Dec 29 '18 at 17:44
  • @Kusi As for the config file, changing the activation mutation rate made a big difference, like you said. Using the XOR example from the docs leaves activation_mutate_rate set at 0, and I found changing it to 0.01 fixed the problems I was dealing with. – jblew Dec 30 '18 at 21:19
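
Pulling the comment thread together, the suggested config deltas would look like this (a sketch against the config file above; these are the commenters' suggestions, not verified optima):

[DefaultGenome]
# let the activation function mutate occasionally
activation_mutate_rate  = 0.01
activation_options      = sigmoid square
# search only feedforward topologies (no recurrent connections)
feed_forward            = True
# start with every input wired directly to the output
initial_connection      = full_direct

Note that with feed_forward = True, the evaluation code would normally build networks with neat.nn.FeedForwardNetwork.create rather than neat.nn.RecurrentNetwork.create.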
