All data and plots discussed in this post are taken from this spreadsheet:
https://docs.google.com/spreadsheets/d/1BbpD81mugWQf10ozGDLI60takAn3tlvrq8ksT10yV5I/edit?usp=sharing
Run Details:
This optimization run was a coarse search for the optimal set of parameters to run the GA with. It used a fixed 100-individual population with a simulated error of 0.25 over 50 generations, to best replicate the current PUEO-loop environment. The selection methods were also held fixed. Both the ARA and PUEO designs were tested to see whether the best run types were similar. The run encompassed 210 different run combinations that searched the following parameter space:
- 0-16 Reproduction, step sizes of 4
- 72-96 Crossover, step sizes of 4
- 4-16 Mutation, step sizes of 4
- 5-15 Sigma, step sizes of 5
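As a cross-check on the count of 210, the grid above can be enumerated directly. The full Cartesian product of the four ranges has 5 x 7 x 4 x 3 = 420 points, so some constraint must have been applied. The sketch below assumes (this is not stated in the run details) that the reproduction, crossover, and mutation counts together may not exceed the 100-individual population. Under that assumption the count comes out to 210. The function and variable names are illustrative only.

```python
from itertools import product

POPULATION = 100  # fixed population size from the run details


def parameter_grid(population=POPULATION):
    """Enumerate (reproduction, crossover, mutation, sigma) combinations.

    Assumption (not stated in the post): the three operator counts must
    fit inside the population, i.e. their sum may not exceed `population`.
    """
    reproduction = range(0, 17, 4)  # 0, 4, 8, 12, 16
    crossover = range(72, 97, 4)    # 72, 76, ..., 96
    mutation = range(4, 17, 4)      # 4, 8, 12, 16
    sigma = range(5, 16, 5)         # 5, 10, 15
    return [
        (r, c, m, s)
        for r, c, m, s in product(reproduction, crossover, mutation, sigma)
        if r + c + m <= population
    ]


if __name__ == "__main__":
    grid = parameter_grid()
    print(len(grid))  # 210 under the assumed constraint
```

Both best runs reported in the Results, (12, 72, 4, 5) for ARA and (0, 96, 4, 15) for PUEO, fall inside this grid.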
Results:
Each run combination was run 10 times, and we tracked the average number of generations it took for the distance metric to reach 0.05, which corresponds to a true fitness score of 0.95. The best runs for each design were as follows:
- ARA
  - Parameters
    - 12 Reproduction
    - 72 Crossover
    - 4 Mutation
    - 5 Sigma
  - Average gens to benchmark
  - Standard Deviation
- PUEO
  - Parameters
    - 0 Reproduction
    - 96 Crossover
    - 4 Mutation
    - 15 Sigma
  - Average gens to benchmark
  - Standard Deviation
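For reference, the "average gens to benchmark" metric used above can be sketched as follows. This is a minimal illustration, not the analysis code behind the spreadsheet. It assumes each test produces a list of per-generation distance values, that generations are counted from 1, and that tests which never reach the benchmark are left out of the average; none of these details are stated in this post, and the function names are hypothetical.

```python
from statistics import mean, stdev

THRESHOLD = 0.05  # distance-metric benchmark from the post


def gens_to_benchmark(distances, threshold=THRESHOLD):
    """Return the first generation (1-based) whose distance <= threshold.

    Returns None if the test never reaches the benchmark. How such tests
    were actually handled is not stated in the post.
    """
    for gen, dist in enumerate(distances, start=1):
        if dist <= threshold:
            return gen
    return None


def summarize(tests, threshold=THRESHOLD):
    """Mean and sample standard deviation over tests that hit the benchmark."""
    hits = [
        g for g in (gens_to_benchmark(t, threshold) for t in tests)
        if g is not None
    ]
    if len(hits) < 2:
        raise ValueError("need at least two tests that reach the benchmark")
    return mean(hits), stdev(hits)
```

Dropping tests that never reach the benchmark biases the average downward, so a different convention (for example, counting them at the 50-generation cap) may be more appropriate.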
These two best run types have little in common: they share only the mutation value of 4. Looking at the complete data set, there is also no strong trend around either of the best runs. The problem became more prominent when we re-ran the two best run types for 100 tests rather than 10: they returned averages of 44.4 generations for PUEO and 46.9 generations for ARA. For comparison, the average across all run types was 43.1 generations for PUEO and 42.8 generations for ARA, so on re-test neither best run beat the overall average. This suggests that all of these run types are inherently inconsistent and regress to the mean given enough tests. Therefore, I do not believe we can draw any conclusions from this run.
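The regression to the mean described above can be illustrated with a small simulation. It is a sketch under strong assumptions, not a model of the GA: every one of the 210 run types is assumed to draw its generations-to-benchmark from the same normal distribution, so no run type is truly better than another. The distribution's mean and standard deviation (43.0 and 8.0) are hypothetical placeholders, not values taken from the spreadsheet, and the 50-generation cap is ignored.

```python
import random
from statistics import mean


def best_of_grid(n_configs=210, n_trials=10, n_retest=100,
                 true_mean=43.0, sd=8.0, seed=0):
    """Simulate picking the best of many statistically identical run types.

    Every run type draws from the SAME normal distribution, so any
    apparent winner is produced by noise alone. `true_mean` and `sd`
    are hypothetical values chosen only for illustration.
    """
    rng = random.Random(seed)
    averages = [
        mean(rng.gauss(true_mean, sd) for _ in range(n_trials))
        for _ in range(n_configs)
    ]
    best_initial = min(averages)    # the "best run" found by the search
    grand_average = mean(averages)  # average over all run types
    # Re-test the "winner": same distribution, fresh draws.
    retest = mean(rng.gauss(true_mean, sd) for _ in range(n_retest))
    return best_initial, grand_average, retest


if __name__ == "__main__":
    best, grand, retest = best_of_grid()
    print(f"best 10-test average:    {best:.1f}")
    print(f"average over all types:  {grand:.1f}")
    print(f"winner re-tested x100:   {retest:.1f}")
```

In this toy setting the best 10-test average sits well below the overall average purely by chance, and the 100-test re-run lands back near the overall average. That is the pattern to expect if the parameters have no real effect at this noise level.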
Moving Forward:
Our ongoing hypothesis is that the size of the error could be causing inconsistency in how quickly our population can grow over time. Therefore, our next step is to repeat this experiment, but with zero error included, and see if we achieve results that show consistent behavior. If we can find more consistency in runs with no error, then we can more deeply explore the effects that error can have on GA growth. If there continues to be a lack of consistency, then we can look to other factors, such as population size, and try to find the root of the inconsistency.