Abstract

Offshore wind energy is a promising option for emission-free power generation because its potential can satisfy the entire US energy demand. However, deep offshore wind energy remains mostly untapped due to the high levelized cost of energy (LCOE) of floating offshore wind turbines (FOWTs). To address this challenge, individual blade pitch control (IPC) is necessary to reduce fatigue caused by the nonlinear dynamics involving unbalanced, nonstationary wind/wave loading. In this paper, a machine learning control (MLC) method is proposed that uses a genetic program to selectively evolve promising control law candidates from simulated FOWT sensor data. The proposed method has three distinguishing MLC features: an efficient selection of design load cases (DLCs) to accelerate the evolution, DLC-based elitism to identify promising candidates, and the use of internal state feedback to learn effective controller properties across all the target DLCs. These features are used to reduce evaluation time on the complex nonlinear FOWT model. The feasibility of the method is demonstrated by reducing fatigue and ultimate loading by 41% under the representative DLCs provided by the Aerodynamic Turbines with Load Attenuation Systems (ATLAS) competition hosted by the Advanced Research Projects Agency–Energy (ARPA-E). Unlike methods founded upon black-box models [e.g., artificial neural networks (ANNs)], the proposed method provides interpretable results; thus, it can reveal important design features to guide future controller design. For example, the best individual derived during the feasibility study is a nonlinear proportional controller acting on a sigmoid-like platform pitch signal, which can be a starting point for more advanced IPC design for FOWTs.
The proposed method will improve the cost-effectiveness of FOWTs and can be applied to similar nonlinear control problems, such as unmanned aerial vehicle (UAV) control.

Introduction

Offshore wind energy is a promising emissions-free energy source for satisfying increasing energy demands nationwide, and its total technical potential could supply the entire US demand for electricity. However, according to the Advanced Research Projects Agency–Energy (ARPA-E), harvesting offshore wind energy has been challenging because most of the energy lies in water deeper than 60 m (ARPA-E 2019a). To take advantage of offshore wind energy, floating offshore wind turbines (FOWTs) have been developed by integrating onshore wind turbine tower technology with floating foundations. Nevertheless, the high levelized cost of energy (LCOE) of FOWTs, which is about three times higher than that of onshore wind turbines (Stehly et al. 2018), is a major barrier to their broader adoption. Therefore, significant improvement in the cost-effectiveness (i.e., efficiency and reliability) of FOWTs is the key to unleashing the potential of clean and abundant offshore wind energy.

Turbine blade pitch control plays an important role in improving the cost-effectiveness of wind turbines by reducing fatigue loading without compromising power generation (Njiri and Söffker 2016). Turbine blade pitch control approaches can be broadly classified into two categories: collective pitch control (CPC) and individual pitch control (IPC). CPC is the conventional paradigm, which controls the pitch of all the turbine blades with a single pitch command value (Plumley et al. 2014). CPC is typically used for the speed regulation set by a central controller, and it is effective when the wind loading is uniform across the entire rotor (Wheeler and Garcia-Sanz 2017).
However, in reality, wind loading across the rotor is unbalanced and constantly changing, which contributes to a significant reduction of turbine blade lifetime (Lu et al. 2015).

IPC is a relatively recent paradigm that controls individual blade pitches with unique pitch commands to address such unbalanced loadings (Plumley et al. 2014). IPC is known to improve speed regulation by supplementing CPC, as well as to increase blade longevity by reducing unbalanced loads (Wheeler and Garcia-Sanz 2017). Currently, the Coleman transform–based IPC strategy is widely adopted; it is known to be effective under the assumption that the turbine blade dynamics (or their analytical model) can be reasonably approximated via linearization (Lio et al. 2017). Existing works using the Coleman transform include those of Wheeler and Garcia-Sanz (2017), Mirzaei et al. (2013) (coupled with model predictive control), and Selvam et al. (2009) (coupled with linear-quadratic-Gaussian control).

However, FOWTs have floating foundations with substantial heave and pitch movement; thus, linearization may not be accurate across their operating range. In other words, FOWTs involve complex nonlinear dynamics across the wave, wind, and turbine blade components, which leads to important technological difficulties in their control. Additionally, the controller should be robust enough to ensure the operation of FOWTs under different wind and wave loading profiles. Therefore, there is a strong need for a novel IPC strategy that can properly handle the complex nonlinear dynamics of FOWTs.

Machine learning methods, such as those based on evolutionary algorithms (EAs) and artificial neural networks (ANNs), have been developed to address similar complex nonlinear problems. For example, machine learning methods have been used in fields such as autonomous vehicle control (Tuncali et al. 2018), building thermal control (Seyedzadeh et al.
2018), vibration reduction control (Li and Zhao 2019), and blade pitch control (Chen et al. 2020). The success of machine learning in control has yielded a research field named machine learning control (MLC) (Duriez et al. 2017), where machine learning methods are applied to optimally solve control problems, especially for complex nonlinear systems.

Motivated by the recent success of MLC in the control of nonlinear dynamics and turbulence (Duriez et al. 2017), this paper proposes an MLC method based on EA to develop IPC controllers for FOWTs. Specifically, building on the existing MLC method for turbine pitch control (Kane 2020), the proposed method selects design load cases (DLCs) more efficiently and preserves the mathematical characteristics of promising individuals, which accelerates the evolution of control law candidates with a minimal number of simulations. Furthermore, an internal state feedback loop with an integrator was added to individuals, giving them access to proportional-integral (PI) control. The proposed method also improves the parallelization of control law candidate generation and evaluation by using the MATLAB Parallel Computing Toolbox.

The main contributions of the proposed method are as follows. First, because the proposed method is data-driven, it can develop an effective IPC controller that accommodates all the target DLCs and turbine component properties based on the sensor data from simulation environments. This contributes to improving the cost-effectiveness of FOWTs regardless of geographical location and environmental conditions, unleashing the potential of offshore wind energy. In addition, because the proposed method does not require an analytic model of the system (e.g., no derivatives need to be derived), it can be readily extended without a detailed understanding of the dynamics of the target system.
For example, the proposed method can be applied to domains where complex nonlinear dynamics are a significant challenge, such as unmanned aerial vehicle (UAV) control.

Furthermore, because the proposed method provides interpretable results, it also adds value for those who understand the target system. For example, human engineers can interpret the resultant control law to derive additional knowledge, which can guide the design of similar controllers. This is an important advantage over other methods founded upon black-box models (e.g., ANN).

At the same time, learning the nonlinear dynamics of the target system with a data-driven model requires computer simulations of multiple complex, interconnected components. Because EA follows an iterative process that requires thousands of computer simulations, it is extremely time-consuming to converge to a result. Therefore, the proposed method focuses on reducing the computational burden of the data-driven approach to learn an effective IPC controller from simulated sensor data.

Background

Aerodynamic Turbines with Load Attenuation Systems Competition

In the spring of 2019, ARPA-E hosted the Aerodynamic Turbines with Load Attenuation Systems (ATLAS) competition (ARPA-E 2019b) to challenge contestants to design an IPC controller that reduces the LCOE of a provided wind turbine model under 12 representative DLCs. The ATLAS competition provided a simulation environment in MATLAB that integrates a Simulink controller model founded upon OpenFAST version 1.0.0, an open-source wind turbine simulation tool developed by the National Renewable Energy Laboratory (NREL) (NREL 2021). Contestants were provided with the necessary OpenFAST input files for the NREL 5-MW reference wind turbine (Jonkman et al. 2009) and the OC3 Hywind floating spar platform (Jonkman 2010).
The environment creates a closed-loop simulation where Simulink computes the controller and pitch actuator outputs, sends them to OpenFAST, and receives the OpenFAST time-series sensor data as the input for the closed-loop controller. In the rest of this section, important components of the ATLAS competition environment are presented: the target DLCs, the baseline CPC controller, and the cost function. For more details on the ATLAS competition environment, readers are referred to the official documentation (ARPA-E 2019b).

Target Design Load Cases

The ATLAS competition provided contestants with 12 representative DLCs, which were based on common loading requirements. These DLCs provided a broad sweep of conditions for testing while still allowing for shorter simulation times than exhaustive testing of the DLCs from standards [e.g., standards from the International Electrotechnical Commission, such as IEC 61400 (IEC 2019), have about 27 DLCs]. Important features of the DLCs are given in Table 1. These include normal operation conditions (DLCs 1–5 and 9–11), extreme coherent gust with direction change (ECD) (DLC 6), extreme wind shear (EWS) (DLC 7), extreme operating gust (EOG) (DLC 8), and step increases that are known to excite the floating platform pitch motion (DLC 12). These 12 DLCs have mean wind speeds of 13.4 m/s or higher, which is above the 5-MW turbine rated speed of 11.4 m/s, thus focusing on the pitch-controlled zone of turbine operation. Other factors such as yaw error (DLCs 1 and 3) and reduced blade airfoil performance (DLCs 9–11) were also investigated.

Table 1. Important features of the design load cases in this paper

| DLC index | Mean wind speed (m/s) | Yaw error | Inflow wind type | Airfoil quality |
|---|---|---|---|---|
| 1 | 13.4 | Yes | Turbulent | Good |
| 2 | 13.4 | No | Turbulent | Good |
| 3 | 19.4 | Yes | Turbulent | Good |
| 4 | 19.4 | No | Turbulent | Good |
| 5 | 23.4 | No | Turbulent | Good |
| 6 | 13.4 | No | Extreme coherent gust with direction change (ECD) | Good |
| 7 | 13.4 | No | Extreme wind shear (EWS) | Good |
| 8 | 13.4 | No | Extreme operating gust (EOG) | Good |
| 9 | 13.4 | No | Turbulent | Dirty |
| 10 | 19.4 | No | Turbulent | Dirty |
| 11 | 23.4 | No | Turbulent | Dirty |
| 12 | 15.4 | No | Step increase | Good |

Baseline CPC Controller

The ATLAS environment model comes with a baseline CPC controller. This controller is designed to maintain the generator operating speed at above-rated wind speeds, but not to reduce fatigue and ultimate loading. The baseline CPC controller is a PI controller, which uses the generator speed as the error input and the current blade pitch angle as a scheduled gain. Additionally, the controller has an antiwindup loop on the output to prevent saturation of the integrator. The output signal (a blade pitch angle) is sent to all three turbine blades for collective control.

Cost Function

The ATLAS competition was judged based on the contestants’ ability to minimize a provided cost function (CF) [Eq. (1)], evaluated with OpenFAST (NREL 2021). The cost function is designed to represent the magnitude of fatigue and ultimate loads on the turbine divided by the annual energy production of the turbine (i.e., loading sustained/power produced). This is meant to “represent the economic cost of the main components, i.e., a function of CapEx” (ARPA-E 2019b). In Eq. (1), AEP represents annual energy production and f_i is a function value given by Eq. (2) representing the loading resisted by the components.
This cost function rewards reductions in fatigue-induced responses and ultimate loads in the turbine components, and also rewards increases in the AEP of the turbine compared with the baseline

$$\mathrm{CF} = \frac{\mathrm{AEP}_{\text{baseline}}}{\mathrm{AEP}} \sum_{i=1}^{5} \alpha_i f_i \tag{1}$$

where

$$f_i = \frac{\sum_{k=1}^{10} n_k \, \dfrac{M_{i,k}}{M_{\text{baseline},i,k}}}{\sum_{k} n_k} \tag{2}$$

where α_i = weight factor for each of the five turbine components (blades, rotor, nacelle, tower, and platform); and (AEP_baseline/AEP) = ratio of the baseline AEP to the AEP achieved with the IPC controller, which falls below 1 when the IPC controller improves AEP over using only the baseline CPC controller. This ratio is then multiplied by the sum of component weight (α_i) times function value (f_i) across the five turbine components. The α_i weight values are 0.11, 0.02, 0.11, 0.11, and 0.65 for the blades, hub, nacelle, tower, and platform, respectively. Here, f_i is a representation of the loading on turbine component i at 10 different frequencies, where nine frequencies correspond to fatigue loading, at component natural frequencies and at 1P (0.2 Hz), 2P (0.4 Hz), and 3P (0.6 Hz) frequencies, and one frequency corresponds to ultimate loading. These frequencies and ultimate loads are indexed by k in Eq. (2), where the frequencies are based on the component properties of the NREL 5-MW reference turbine model (Jonkman et al. 2009) provided by the ATLAS competition.

Loading metrics (M_{i,k} and M_{baseline,i,k}) were calculated for each k based on the sensor time-series output signal from the simulation, and represent the fast Fourier transform (FFT) magnitude or maximum amplitude of the signal. These FFT magnitudes were calculated for each DLC, and the maximum magnitude across all 12 DLCs was used to evaluate the overall cost. Additionally, each k has an associated weight (n_k) used to weigh each type of loading.
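As a worked illustration of Eqs. (1) and (2), the following minimal Python sketch evaluates the cost function from precomputed loading metrics. The function names and the example weights n_k are illustrative assumptions; the actual n_k values and frequency indices are defined in the ATLAS documentation (ARPA-E 2019b), and the metrics are assumed to already be the maxima over the 12 DLCs.

```python
from typing import Sequence

# Component weights alpha_i from the text, in the order
# blades, hub, nacelle, tower, platform.
ALPHA = (0.11, 0.02, 0.11, 0.11, 0.65)


def component_load(n: Sequence[float], m: Sequence[float],
                   m_base: Sequence[float]) -> float:
    """Eq. (2): weighted mean over k of the load ratios M_ik / M_baseline_ik."""
    if not (len(n) == len(m) == len(m_base)):
        raise ValueError("n, m and m_base must have the same length")
    weighted = sum(nk * mk / mbk for nk, mk, mbk in zip(n, m, m_base))
    return weighted / sum(n)


def cost_function(aep: float, aep_base: float, f: Sequence[float],
                  alpha: Sequence[float] = ALPHA) -> float:
    """Eq. (1): CF = (AEP_baseline / AEP) * sum_i alpha_i * f_i."""
    if len(f) != len(alpha):
        raise ValueError("f and alpha must have the same length")
    return (aep_base / aep) * sum(a * fi for a, fi in zip(alpha, f))


# Hypothetical weights n_k (not the ATLAS values): halving the load at the
# second index lowers f_i from 1.0 to 0.75.
f_example = component_load(n=[1, 2, 1], m=[2, 2, 2], m_base=[2, 4, 2])

# With unchanged AEP, halving only the platform load gives
# CF = 1 - 0.65 * 0.5 = 0.675.
cf_example = cost_function(aep=1.0, aep_base=1.0, f=[1, 1, 1, 1, 0.5])
```

Because the weights α_i sum to 1, a controller that reproduces the baseline loads and AEP yields CF = 1, so any value below 1 indicates an improvement over the baseline.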
The FFT frequencies for each index k and their associated weights are available from ARPA-E (2019b).

Additionally, a set of six constraints enforces upper bounds on tower/blade clearance, tower top acceleration, rotor rotation speed, and blade pitch rate (speed of blade pitch change) for each of the three blades. If any of the constraints is violated, the cost function yields a value of 1,000, which indicates that the design is infeasible. Such upper bounds are typically found in design guides, although here they were relaxed from typical design values to ease the task for the MLC. In this paper, the upper bounds provided by the ATLAS competition (ARPA-E 2019b) have been adopted.

Machine Learning Control

Unlike traditional control algorithms, which are typically founded upon physical analysis of the system model of an engineering system, MLC offers a more powerful and diverse approach to generating an effective control law. MLC combines the optimization properties of machine learning with a closed-loop control model. MLC attempts to create a control law for systems that are so large and complex that generating a control law by traditional methods is not reasonable. Also, MLC founded upon EA provides interpretable results, unlike methods founded upon black-box models (e.g., ANN), which may exhibit unpredicted responses. Therefore, human engineers can interpret the resultant control law to validate its functionality and derive additional knowledge about the target system. MLC has shown success in fields such as autonomous vehicles (Tuncali et al. 2018), building thermal control (Seyedzadeh et al. 2018), and vibration reduction control (Li and Zhao 2019).

This paper builds on the MLC method based on EA, specifically genetic programming (GP) (Willis et al. 1997), in order to minimize the aforementioned cost function.
EA is a generic population-based metaheuristic optimization algorithm that mimics the process of natural evolution (e.g., mutation and crossover) to find the fittest individual. GP can be considered a special implementation of EA, where each individual is a computer program that can perform a defined task (e.g., a mathematical operation represented as a tree-based structure). A genetic algorithm (GA) is similar to GP, except that each individual of a GA is treated as raw data.

Methods

Founded upon GP-based MLC, the proposed method generates control law candidates (called individuals), evaluates cost function values to rate their performance, and selectively evolves promising individuals for the following generation. The proposed method can derive an efficient IPC controller through evolution under the following assumptions:

• The underlying physical models and simulations (e.g., reference 5-MW wind turbine, OpenFAST) are physically accurate.
• Sensor readings are precise and noise effects are negligible. OpenFAST output does not include sensor noise, and no sensor noise was artificially added to the sensor signals as part of this experiment.
• The way of interpreting the efficiency of FOWTs (e.g., the cost function value for all 12 representative DLCs) is realistic. However, the ATLAS competition environment does not incorporate some important practical design considerations, such as the balance across IPC signals or actuator failure risks.

In the rest of this section, each core component of the proposed method is highlighted.

MLC Process Overview

The main focus of this paper is to develop an effective IPC controller via MLC. Fig. 1 shows a block diagram of the proposed MLC process governing the creation, evaluation, and evolution of individuals. First, simulation information (e.g., inputs of turbine sensors and outputs of blade pitch values) is used to generate an initial population of control laws (i.e., individuals).
Next, OpenFAST is used to evaluate the cost function values of the population under one of the 12 DLCs, and the population is ranked/sorted. By not evaluating all the DLCs in every generation, the proposed method focuses on the more important DLCs for faster evolution of the population. Then, evolution operations (i.e., duplication, mutation, and crossover) are used to create the population for the next generation, which only contains feasible individuals that pass a pre-evaluation. Here, elite individuals (a group of top individuals in the most recently evaluated population) and champion individuals (the number one individual in each DLC) are retained automatically for the next generation. Keeping elite/champion individuals in the population ensures that promising control laws for previously evaluated DLCs are not forgotten when their DLC is not evaluated, such that exploration and exploitation are balanced. Exploration of the 12 different DLCs is encouraged through the DLC switching, and exploitation of successful controllers is encouraged through the elite and champion individuals.

After a predefined number of generations have been evaluated, the best individual is determined by re-evaluating the top 1% of individuals in each generation against all DLCs. This overall cost function value is used to choose the best individual.

Cost Function Value Evaluation with OpenFAST

To guide the evolution of the population, the cost function values of the population are evaluated against one DLC in each generation. Fig. 2 shows a block diagram of the cost function value evaluation process with OpenFAST (NREL 2021). The input of the process is the combination of the DLC to be evaluated and the control law (i.e., individual) to be tested, provided by the MLC process (Fig. 1).
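Returning briefly to the MLC process overview (Fig. 1), the assembly of the next generation, with elites and champions carried over and the remaining slots filled by feasible offspring, can be sketched in Python as follows. This is a simplified illustration and not the OpenMLC2 implementation: the function name, the `evolve` and `is_feasible` callbacks, and the attempt limit are assumptions introduced here.

```python
import random
from typing import Callable, Dict, Hashable, List


def next_generation(ranked: List[Hashable],
                    champions: Dict[int, List[Hashable]],
                    n_elite: int,
                    pop_size: int,
                    evolve: Callable[[List[Hashable], random.Random], Hashable],
                    is_feasible: Callable[[Hashable], bool],
                    rng: random.Random,
                    max_attempts: int = 100_000) -> List[Hashable]:
    """Build the next population.

    ranked:    individuals sorted best-first on the DLC just evaluated
    champions: best individuals recorded so far for each DLC index
    """
    # Elites of the current DLC and champions of every DLC survive unchanged.
    retained = list(ranked[:n_elite])
    for dlc in sorted(champions):
        retained.extend(champions[dlc])
    population: List[Hashable] = []
    for individual in retained:
        if individual not in population:  # skip duplicates
            population.append(individual)
    population = population[:pop_size]

    # Fill the remaining slots with offspring that pass the pre-evaluation.
    attempts = 0
    while len(population) < pop_size:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("could not fill the population with feasible individuals")
        child = evolve(ranked, rng)  # duplication, mutation, or crossover
        if is_feasible(child) and child not in population:
            population.append(child)
    return population
```

Here `evolve` stands in for the duplication, mutation, and crossover operators, and `is_feasible` for the open-loop pre-evaluation; both are placeholders for the corresponding steps of the actual process.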
In this paper, the turbine blade individual pitch control problem involves the following components: (1) the blade pitch controller to be designed, (2) pitch actuators representing the interaction between the OpenFAST environment and the blade pitch controller, and (3) the OpenFAST simulation environment (parallelized) representing the turbine system to be controlled.

The blade pitch controller is the sum of two distinct controllers. The first is the baseline CPC controller, which controls the collective pitch of the blades to keep the generator spinning at rated speed. The baseline CPC controller contributes to maintaining an optimal AEP of the wind turbine, and its parameters are fixed (i.e., they do not change during the MLC process). The other controller is an MLC IPC controller, which adds positive or negative pitch adjustments for each blade. Here, the IPC adjustments start at zero to make sure the controller is feasible. The MLC IPC controller mainly contributes to reducing the structural fatigue applied to the turbine components, and it gradually evolves as the control laws from the MLC process evolve.

The aggregated signal from the baseline CPC and the MLC IPC controllers is sent to the pitch actuator to apply the blade pitch at the next time step, and the OpenFAST model then generates time-series sensor data (e.g., structural component loads, vibrations, and mechanical/electrical systems information) as the feedback signals for the blade pitch controller. Once the simulation is complete, the cost function value is derived from the time-series sensor data to guide the MLC process.

Parallelization of the ATLAS Environment

Typically, an MLC process evolves 500 individuals over 100 generations, which requires at least 50,000 cost function evaluations (Kane 2020). Considering that each session (one controller on one DLC) of OpenFAST simulation takes approximately 6 min (on a 2-GHz processor), it is time-consuming to reach the 100th generation.
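To put these figures in perspective, a back-of-the-envelope estimate of the purely serial run time is given below. It assumes that every evaluation takes the full 6 min and ignores the pre-evaluation and evolution overhead.

```latex
\begin{aligned}
N_{\text{eval}} &= 500 \times 100 = 50{,}000 \\
T_{\text{serial}} &\approx 50{,}000 \times 6\ \text{min}
  = 300{,}000\ \text{min} = 5{,}000\ \text{h} \approx 208\ \text{days}
\end{aligned}
```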
Therefore, the ATLAS competition environment was modified to allow parallel computation of multiple simulations on a multicore machine. The MATLAB Parallel Computing Toolbox (Sharma and Martin 2009) was used for parallelization. To avoid conflicts between the temporary files generated by parallel OpenFAST sessions, a more involved parfor loop was implemented to replace the parsim function (parallel simulation function) provided by the MATLAB toolbox.

MLC Implementation Details

OpenMLC2 and Its Parallelization

The MLC model is based on the OpenMLC2 V0.2.5 framework in MATLAB (Duriez et al. 2017), which provides the framework for creating, evaluating, and evolving individuals. A MATLAB machine learning environment was chosen over other languages because of the ease of integrating it with the ATLAS simulation environment. OpenMLC2 creates individuals using a tree-based GP, which represents each individual (i.e., control law candidate) in a tree structure. Fig. 3 shows a simple example of a tree-based representation of a control law, y = a1·c1 + sin(a2 − c2·tanh(a3)), where each branch represents sensor data and each node represents an algebraic/functional operator. Each individual has three control laws (one for each blade), which are uniquely defined and evolved as the learning process continues.

OpenMLC2 has built-in parallel evaluation options. However, this parallelization was limited to the evaluation of the population, and steps like pre-evaluation and evolution were still performed in series. Evolution and pre-evaluation added up to hours of wall clock time per generation when computed in series, because a new Simulink model must be compiled and opened for each controller that needs to be pre-evaluated. To address the issue, additional parallelization was implemented both in the evolution loop and in the initial population pre-evaluation loop.

The MLC evolution process might generate a lot of random infeasible individuals.
Therefore, individuals were filtered by pre-evaluating functionality and stability with an open-loop model. Here, the open-loop model takes as input the time-series sensor data from the closed-loop simulation of the baseline CPC controller under the first DLC. An individual is considered functional if its pitch signal shows sufficient variation (standard deviation > 1°), and stable if the pitch actuator movement speed stays within asserted limits. Because an open-loop model does not require OpenFAST simulations, this pre-evaluation speeds up the evolution of control laws by skipping infeasible individuals.

After pre-evaluation, controllers that resulted in unstable signals were removed from the population and either rebuilt (in the case of the initial population) or re-evolved using a new set of random evolution parameters. Because this process requires Simulink to be compiled and run, parallelizing the pre-evaluation saves a significant amount of learning time. The additional parallelization is handled using methods similar to those used to parallelize the ATLAS model environment.

DLC Selection

Ideally, all 12 DLCs should be tested because it is not apparent which DLC will control the design. However, evaluating all 12 DLCs for the entire population in every generation would significantly slow down the learning process. To address the issue, individuals were evaluated using a single DLC of the 12 in each generation. When choosing the DLC to be evaluated, the DLC potentially most affecting the design of the control law was prioritized to facilitate the evolution of the control law.
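A minimal Python sketch of this prioritization is given here, using the rule that this subsection goes on to specify: one pass through DLCs 1 to 12, then the DLC whose top 20% of individuals have the highest mean cost. The function names and the dictionary-based bookkeeping are illustrative assumptions, not the data fields added to OpenMLC2.

```python
from typing import Dict, Sequence


def population_cost(costs: Sequence[float], top_fraction: float = 0.2) -> float:
    """Mean cost of the best `top_fraction` of individuals (lower cost is better)."""
    ranked = sorted(costs)
    n_top = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:n_top]) / n_top


def select_dlc(generation: int, avg_cost: Dict[int, float], n_dlcs: int = 12) -> int:
    """Pick the DLC to evaluate in a given (1-indexed) generation.

    avg_cost maps each DLC index to the most recent population_cost
    recorded when that DLC was evaluated.
    """
    if generation <= n_dlcs:
        return generation  # initial sweep: generation g evaluates DLC g
    # Afterwards, focus on the DLC the population currently struggles with most.
    return max(avg_cost, key=avg_cost.get)


# Example with made-up costs: after the initial sweep, DLC 6 has the
# highest averaged cost and is therefore evaluated next.
chosen = select_dlc(13, {1: 0.9, 6: 1.4, 9: 1.1})
```

After each generation, the entry of `avg_cost` for the evaluated DLC would be refreshed with `population_cost` of the new cost values, so that the selection tracks the current state of the population.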
Evaluating a single DLC per generation saves computing time on the complex wind turbine model while still allowing the MLC algorithm to learn effective controller properties from each DLC over multiple generations.

For the first 12 generations, DLCs 1 through 12 were tested in order to establish a baseline average population cost for each DLC. This average population cost is the average of the cost values of the top 20% of individuals.

After the initial 12 generations, the proposed method chooses the DLC with the highest averaged cost. Thus, the learning process keeps track of which DLCs the population is doing well on and which DLCs it is struggling with. To implement this functionality, additional data fields were added to the OpenMLC2 framework to keep track of this information. Additionally, a data field was added to individuals to save a cost value for each DLC rather than just a single cost value. As a result, individuals carried over to the next generation via duplication or elitism do not have to be re-evaluated in Simulink if the same DLC is used in the following generation, which saves computing time.

Champions of Each DLC

In early tests of the MLC process, the population would often be evaluated on one DLC (say, DLC 6) for several generations, only to produce an unstable response to the wind/wave profile of a new DLC (say, DLC 9). To reduce the cost on DLC 9, the genetic program would then forget the patterns it had learned from DLC 6 in order to survive on the new DLC. In several instances, when DLC 6 was revisited, the controllers’ learning curve fluctuated because the population had meanwhile been optimized on other DLCs in ways that were not useful on DLC 6.
Under the ATLAS competition environment, this forgetting was most noticeable with the four DLCs involving specific wind loading conditions (DLC 6: ECD, DLC 7: EWS, DLC 8: EOG, and DLC 12: Step) rather than the other eight DLCs involving more general operating conditions in Table 1.

To combat this, the top individuals of each DLC were saved and given immunity from being removed from the population during evolution. These champion individuals are automatically duplicated into the following generation until another individual becomes a new champion on the same DLC. Each of the 12 DLCs has its own separate list of champion individuals, and the list is updated each time the DLC is selected for the evaluation of a generation. These champion individuals provide a memory of which control laws worked well on each DLC. In other words, they can serve as parents for other generations even when their DLC is not being tested, allowing their functions to remain relevant over the long gaps between the generations in which a specific DLC is evaluated. This helps the proposed method to learn a control law that works well across all the target DLCs.

Internal State Feedback Option within Individuals

State feedback is a common control method used to reduce the steady-state error that proportional control tends to have. An individual in OpenMLC2 is limited to creating control laws based on known sets of input signals and controller output signals, as shown in Fig. 3. OpenMLC2 does not explicitly distinguish between controller outputs intended for the system actuators and outputs intended for a controller internal state feedback loop. As part of this paper, individual creation was modified to create both actuator output equations (signals sent to the pitch actuators θ1 through θ3) and internal state feedback output equations (signals sent through an integrator and then back into the control block as input signals X1–Xm).
Thus, an individual has outputs that can be passed to both the blade pitch actuators and the internal state feedback loop. The feedback loop is designed to work with an integrator so that a PI controller can potentially be implemented within an individual’s control law.

This modification adds a predefined number of state signals to both the input signal list and the output signal list. Unlike the actuator signals, the state outputs are sent through an integrator and then sent back to their corresponding inputs as state inputs. The integrated state signals have the same probability of being added to an equation tree as the input signals from the model, and a state equation can use state signals as inputs as well. As part of the pre-evaluation process, the stability of each state is checked numerically by imposing upper and lower bounds on the open-loop state output. This check does not directly test for state stability; rather, it makes sure that the integrated internal state feedback output does not exceed a magnitude threshold. Furthermore, an individual is forced to create equations for these state variables; however, it is not forced to use the integrated signal in any of the output equations. This means that state laws are allowed to evolve independently of the controller and its evaluation. The limitations of this method are discussed in the “Results” section.

Because state outputs are not used in every control law, the individual identification algorithm had to be updated to properly identify duplicate individuals. OpenMLC2 identifies duplicate individuals by hashing all control laws and comparing the hashes. A constraint was introduced such that the hashing algorithm only hashes the components of the control laws that are used. Thus, if a state variable input does not appear in the control law, the state equation is not hashed as part of the identification hash.
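The idea of hashing only the used components can be illustrated with a short Python sketch. It is not the OpenMLC2 hashing routine: control laws are represented here as plain strings, the state inputs are assumed to be named X1, X2, ..., the sensor names S1, S2, ... are placeholders, and following state references transitively (a used state equation that itself reads another state) is an assumption made for this example.

```python
import hashlib
import re
from typing import List, Sequence, Set

STATE_REF = re.compile(r"\bX(\d+)\b")  # matches state inputs such as X1, X2


def used_states(actuator_laws: Sequence[str], state_laws: Sequence[str]) -> Set[int]:
    """Return the 1-based indices of state equations reachable from the actuator laws."""
    used: Set[int] = set()
    frontier: List[str] = list(actuator_laws)
    while frontier:
        expr = frontier.pop()
        for match in STATE_REF.findall(expr):
            idx = int(match)
            if 1 <= idx <= len(state_laws) and idx not in used:
                used.add(idx)
                frontier.append(state_laws[idx - 1])  # a state law may read other states
    return used


def identity_hash(actuator_laws: Sequence[str], state_laws: Sequence[str]) -> str:
    """Hash the actuator laws plus only those state laws that are actually used."""
    used = used_states(actuator_laws, state_laws)
    parts = list(actuator_laws)
    parts += [f"X{i}={state_laws[i - 1]}" for i in sorted(used)]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Two individuals with identical pitch laws; only X1 is used, so the
# differing (unused) equations for X2 and X3 do not affect the hash.
laws = ["tanh(S5)", "S5*X1", "sin(S2)"]
same = identity_hash(laws, ["S3", "S4+X3", "cos(S1)"]) == identity_hash(laws, ["S3", "ln(S9)", "S7"])
```

In this example the two individuals hash identically and would be treated as duplicates, whereas changing the equation for X1 would change the hash.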
Restricting the hash in this way prevents individuals with the same blade pitch control laws but different unused state equations from being hashed as unique individuals, and thus the duplicate individual is removed from the population.

Results

Experimental Settings

The MLC program was run for 100 generations using the parameters given in Table 2. OpenFAST provides 110 time-series sensor signals to Simulink, and among them, 31 sensors were selected as potential inputs for the controller based on engineering judgment. These sensors were selected because they either act as direct inputs to the cost function, have the potential to give other information about the state of loads and energy production of the turbine, or are common sensors that real-world IPC or CPC controllers use. For example, OpenFAST provides moments on the blades at each quarter point, but if we assume that Mode shape 1 controls, we only need the moment at the root of the blade. The list of the 31 sensors is given in Table 3. Before being sent to the IPC controller, these signals are normalized to zero mean and a standard deviation of 1. For this standardization, the mean and standard deviation of each signal were derived from the benchmark CPC controller simulation under DLC 1 wind loading.

Table 2. MLC simulation parameters for the experiment

| Category | Value |
|---|---|
| Population size | 1,000 |
| Number of total generations | 100 |
| Number of champions per each DLC | 2 |
| Number of elite individuals replicated (does not include champions) | 8 |
| Number of input sensors | 31 |
| Number of actuator outputs | 3 |
| Number of internal state feedback inputs/outputs | 3 |
| Probability of duplication | 10% |
| Probability of mutation | 40% |
| Probability of crossover | 50% |
| Available operators | +, −, ×, /, sin(), cos(), ln(), and tanh() |

Table 3. List of sensor data for the experiment, provided by OpenFAST

| Sensor name (provided by OpenFAST) | Description |
|---|---|
| RotSpeed | Rotational speed of the rotor (revolutions per minute) |
| Azimuth | Azimuth angle of Blade 1 (degrees) |
| BldPitch# | Current blade pitch of Blades 1, 2, and 3 (3 total) (degrees) |
| RootMyC#/RootMzC# | Moment in the flapwise (Y) and edgewise (Z) direction at the root of each blade (6 total) (kN-m) |
| RotTorq | Torque at the axle of the rotor (low-speed shaft) (kN-m) |
| TwrBsMyt | Base moment about the Y-axis (left of downwind direction) (kN-m) |
| GenPwr | Power output of the generator (W) |
| NcIMUTAx/y/z | Acceleration of the nacelle in the downwind (x), crosswind (y), and vertical (z) directions (m/s²) |
| PtfmYaw/Pitch/Roll | Yaw/pitch/roll of the platform (degrees) |
| PtfmSurge/Sway/Heave | Surge, sway, or heave of the platform (m) |
| Wind1Velx/y/z | Wind velocity in the downwind, crosswind, and vertical directions (m/s) |
| NacYaw | Yaw angle of the nacelle (degrees) |
| T1/2/3 | Tension in the three mooring ties (kN) |
| Wave1Elev | Elevation of the wave (m) |

The learning process took place on a Google Compute Engine cloud virtual machine with 96 virtual processors at 2.0 GHz, 360 GB of memory, and 500 GB of disk storage. Training the MLC for 100 generations took roughly 10 days of total wall clock time and generated approximately 1 TB of output data.

Analysis of Evolution

Cost Function Convergence

Fig. 4 shows the convergence of the cost function value throughout the learning process. The stem and leaf plots show the distribution of all individuals in each generation evaluated on that generation’s DLC. The line shows the lowest cost of the top 10 individuals when re-evaluated against all 12 DLCs (there were no successful controllers in Generations 43 and 44, resulting in the gap in the line). Fig.
4 indicates that the most successful breakthroughs in learning occurred in Generations 50 through 60.

Within Generations 50–60, the majority of top individuals began to adopt a Blade 3 output equal to the platform pitch signal (the signal is normalized and has zero mean before being sent to the controller). Because the cost function weighs heavily on reduction in the platform pitch signal (αplatform=0.65), this provided the foundation for the other two individual pitch control signals, for Blades 2 and 1, to find patterns that complement this signal. In addition, because the pitch control signals have zero mean, the input signal already acts as an error signal: the mean pitch for DLC 1 is added to the signal by the baseline CPC controller before the controller accepts the input signal.

DLC Selection

Fig. 5 shows a summary of which DLC was running during each generation. Fig. 5(a) shows that DLC 6 (inflow wind type: ECD) was evaluated in more than a quarter of the generations. The gust and direction changes that occur in DLC 6 proved to be a difficult and unique challenge for the MLC to overcome and balance with the other load cases. DLCs 2, 7, and 12 were each evaluated only twice (once at the beginning and once in Generations 74, 68, and 77, respectively). This is due to the good performance of the initial controllers when these DLCs were evaluated in the first 12 generations (Fig. 4).

Internal State Feedback Loop Utilization

Fig. 6 shows the utilization of the internal state feedback loop throughout the top 200 individuals. Utilization of this internal state feedback loop would make the individual similar to a nonlinear PI controller. The internal state feedback option was most prevalent in early generations, but by Generation 40, most top individuals were not taking advantage of the internal state feedback inputs. This is due to two limitations in the way the internal state feedback loop was implemented.
The primary reason is that in most cases when a state variable was used, it was a constant value used in the denominator of a division function. Because the state is constant, the magnitude of its integral grows steadily with time, so the denominator overwhelms the numerator and the tree branch converges toward zero. This effectively gave the evolutionary algorithm a way to delete branches of an existing tree when creating a new individual.

The second reason is that, because the states were allowed to evolve independently of the controller (state equation trees could change without impacting the controller output), the controller would gain access to a state with a complex random tree of untested functions. When this happens, the controller often becomes unstable or performs poorly due to such unfamiliar functions. For future work, it is recommended that states be forced to be dependent on at least one model input signal (eliminating constant states altogether) and that state equations be generated when a state variable is added to the individual rather than stored with the individual as unused equations.

Input Signal Utilization

Fig. 7 shows the usage of the 31 selected sensors by the top 200 individuals in the last 50 generations. By this point in the learning process, all of the top individuals had adapted to using the platform pitch signal in some way, mostly in normalized Blade pitch 3 as Blade3Pitch = PltfmPitch. Additionally, many of the controllers were also using the nacelle acceleration (in the downwind direction) in the equations for Blade pitches 1 and 2. Tension in Mooring line 2 also appeared in around 40% of controllers, but in most cases the branch containing it was removed by a state variable in the denominator, as described previously.

Because the cost function places a large weight on reducing platform pitch movements, it makes sense that the platform pitch signal is widely used by top-performing controllers.
Nacelle acceleration has the double benefit of being used as an input to the cost function (for nacelle loading) and also being related to the pitch of the platform (if deformations in the tower are relatively small). These two signals can complement each other to act as an inverted pendulum controller. This behavior is discussed in more detail subsequently.

Analysis of the Best Individual

Results and Cost Function Value

The best overall individual, when re-evaluated on all 12 DLCs, was found in Generation 92 of 100 and showed a cost value of 0.59 compared with the baseline value of 1.0, indicating a 41% reduction in cost (Loading/AEP) over the benchmark. This individual was the second-place individual in Generation 92, making it a champion individual.

Fig. 8 shows the breakdown of the cost function by turbine component and frequency for this individual, as well as the controlling DLC that had the largest FFT magnitude at each component/frequency. The best individual provides a reduction in platform and tower frequency amplitudes and ultimate movement, especially at frequencies of 0.5 Hz. The individual clearly reduces the cost function by reducing the pitch of the platform both in frequency amplitude and ultimate displacement.

However, this comes with a drawback. Because the platform movement is being reduced, additional loading builds up in the tower and hub due to the increased effective stiffness of the base. This most notably affects tower vibrations in the 1.3-Hz range and other components (blades, hub, and nacelle) in the 0.6–1.1-Hz range, as shown in Fig. 8. This results in a component cost for nacelle/hub loading of 1.08 times that of the baseline.

There was a negligible difference in blade loading. Tower loading was reduced by a factor of 0.80, and platform loading was reduced by a factor of 0.38.
The cost function weights a reduction in platform movement (αplatform=0.65) more heavily than nacelle and hub loading (αNacelle=0.11 and αHub=0.02); thus, the overall cost is still reduced due to the larger reduction in pitch movement. According to the ATLAS cost function, the increased loading in the nacelle is therefore an acceptable trade-off for the reduction in pitch movement.

The effects of this platform pitch reduction are shown in Fig. 9, which compares the platform pitch signal of the baseline with that of the best individual for DLCs 1, 6, 7, 8, 10, and 12. Overall, this individual resulted in a significant reduction in pitch ultimate loads and platform fatigue loads. A large reduction in vibration amplitude can be seen in Load cases 6, 7, 8, and 12, which are the extreme event wind cases. Under general loading cases like DLCs 1 and 10, the IPC still yields reductions in amplitude, although they are less significant than in the other DLCs.

Control Law Equation Derived via MLC

The best individual behaves similarly to an inverted pendulum controller in that it tries to minimize the error between the platform pitch and the mean operating platform pitch, which results in a reduction of platform pitch magnitudes at the applicable frequencies and ultimate loads. The sign convention shown in Fig. 10 is used in the following equations:

$$\text{BladePitch}_1 = S_{\text{PltfmPitch}} + S_{\text{NclMUTAx}} + \sin\left(S_{\text{NclMUTAx}} - 0.136 + \beta_1\right) \tag{3a}$$

$$\beta_1 = \cos\left(2 \times S_{\text{NclMUTAx}} + \sin\left(\tanh\left(S_{\text{PltfmPitch}}\right)\right) - \beta_2\right) \tag{3b}$$

$$\beta_2 = \tanh\left(\frac{2.932 \times \left(\sin\left(\sin\left(\tanh\left(S_{\text{PltfmPitch}}\right)\right)\right) - S_{\text{T2}}\right)}{\int_0^t -9.104\, dt}\right) \approx 0 \tag{3c}$$

$$\text{BladePitch}_2 = 12.45 \times \left(S_{\text{NclMUTAx}} + \sin\left(\cos\left(\tanh\left(S_{\text{PltfmPitch}}\right) - 0.832\right)\right)\right) \tag{4}$$

(5)

where $S_{\text{signalname}}$ = normalized input signal.

The control laws for this individual are shown both mathematically and graphically in Fig. 11.

The primary impact of this IPC is driven by Eq. (4), which produces pitch commands for Blade 2. The IPC component of the pitch command for Blade 2 is almost an order of magnitude larger than the pitch commands of the other two blades [Eqs.
(3a) and (5)] due to the constant 12.45 at the front. This generally results in an actual pitch difference of roughly double that of Blades 1 and 3 once the actuator speed and the CPC component are factored in, as shown in Fig. 12. In this paper, the cost function allows an imbalance between pitch signals, which worked well in the simulation environment. However, an imbalanced controller is potentially risky in a practical scenario due to the excessive reliance on one of the three actuators. In that case, one can impose additional constraints in the population generation step to prevent imbalanced controllers from being considered.

The $S_{\text{NclMUTAx}}$ sensor output signal is not reduced by this IPC and follows the same distribution (mean = 0 and standard deviation = 1) to which it is initially normalized. The $\sin(\cos(\tanh(S_{\text{PltfmPitch}}) - 0.832))$ portion of Eq. (4) modifies the platform pitch signal as shown in Fig. 13. These nested trigonometric functions behave similarly to a sigmoid function. The platform pitch with the effects of the IPC tends to have a standard deviation of approximately 0.50 and a mean between −1 and 0, depending on the average wind speed of the DLC; higher wind speed results in smaller average pitch angles, an effect of the baseline CPC. The trigonometric functions create limits of approximately −0.2 to 0.8, and normalized pitch signals within the range of −2 to 1 transition between these minimum and maximum values.

These results indicate that there is a resistance to energy extraction when the normalized platform pitch is greater than −1 (a platform pitch angle of 2.66 when not normalized). Because the wind loading is trying to increase the platform pitch angle, this controller will increase the blade pitch to extract less energy from the wind when the platform pitch angle passes the threshold of 2.66 (−1 when normalized). This in turn means less force is being transmitted to the platform, and the pitch angle and tower base moments are reduced.
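The limits and threshold quoted above can be checked numerically. The following is a minimal Python sketch, not the authors' MATLAB code, that evaluates only the nested trigonometric term of Eq. (4) on a normalized platform pitch value. The function name `shaped_pitch` is a hypothetical label introduced here, and the term is read as sin(cos(tanh(S) − 0.832)).

```python
import math


def shaped_pitch(s_pitch):
    """Nested term of Eq. (4): sin(cos(tanh(S_PltfmPitch) - 0.832))."""
    return math.sin(math.cos(math.tanh(s_pitch) - 0.832))


# Sweep the normalized platform pitch over the range discussed in the text.
for s in (-4.0, -2.0, -1.0, 0.0, 1.0, 4.0):
    print(f"S_PltfmPitch = {s:+.1f} -> shaped term = {shaped_pitch(s):+.3f}")
```

Under this reading, the term stays within roughly −0.26 to 0.84 for any input, which agrees with the stated limits of approximately −0.2 to 0.8, and it changes sign slightly above a normalized pitch of −1, which is where the pitch adjustment described above begins to act.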
This pitch adjustment is checked by the baseline CPC, which will decrease the blade pitches once the generator starts spinning below the rated speed. The remaining two IPC blade pitch equations [for Blades 1 and 3, Eqs. (3a) and (5)] behave similarly, but at smaller magnitudes, and have a threshold value closer to a normalized pitch value of zero rather than −1.

Unlike Eqs. (4) and (5), Eq. (3a) has a balanced reliance on both platform pitch and nacelle acceleration. Eq. (3c) involves a state input with a constant value of −9.104. After several seconds of simulation, the value of Eq. (3c) reduces to approximately zero, so Blade pitch 1 effectively involves only Eqs. (3a) and (3b). In Eq. (3a), if we ignore the harmonic portion (which oscillates between 1 and −1), the rest of the equation is simply $S_{\text{PltfmPitch}} + S_{\text{NclMUTAx}}$. This means that when the values of $S_{\text{PltfmPitch}}$ and $S_{\text{NclMUTAx}}$ have the same sign (say, +2 and +1), the resulting blade pitch signal will also have that sign (+) and a larger value due to the addition of the sensor signals: (+2) + (+1) = (+3). Thus, when both platform pitch and nacelle acceleration are in the downwind direction (i.e., both values are positive), the controller sends a large positive value for the pitch change of Blade 1, resulting in less energy being taken from the wind and allowing the platform to pitch back to the mean position. The interpretability of MLC results allows this kind of interpretation for a deeper understanding of the dynamics, which is not possible with approaches based on black-box models (e.g., ANNs).

However, when platform pitch and nacelle acceleration have opposite signs (for example, the platform is pitched downwind but the nacelle is accelerating upwind at approximately the same normalized magnitude), the two inputs cancel and result in an approximately zero output.
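Both observations about Blade pitch 1 can be reproduced with a few lines of arithmetic. The sketch below is a hedged Python illustration rather than the authors' code: it assumes the integral in Eq. (3c) sits in the denominator and evaluates it in closed form (the integral of a constant c from 0 to t is c·t), it treats the bracketed numerator term as a single illustrative value instead of computing it from sensor signals, and the function names are hypothetical.

```python
import math


def beta2(numerator, t, state_rate=-9.104, gain=2.932):
    """Eq. (3c) for t > 0, with the constant-state integral written as state_rate * t.

    `numerator` stands in for sin(sin(tanh(S_PltfmPitch))) - S_T2, which is
    bounded for normalized signals; the value passed in is illustrative only.
    """
    return math.tanh(gain * numerator / (state_rate * t))


def blade1_linear_part(s_pitch, s_acc):
    """Non-harmonic part of Eq. (3a): S_PltfmPitch + S_NclMUTAx."""
    return s_pitch + s_acc


# The denominator grows linearly in magnitude, so beta2 decays toward zero.
for t in (1.0, 10.0, 100.0):
    print(f"t = {t:6.1f} s -> beta2 = {beta2(1.0, t):+.4f}")

same_sign = blade1_linear_part(2.0, 1.0)       # both signals downwind
opposite_sign = blade1_linear_part(2.0, -2.0)  # pitched downwind, accelerating upwind
```

With an illustrative numerator of 1, the magnitude of `beta2` falls below 0.01 by t = 100 s, and the linear part of Eq. (3a) gives 3 for the same-sign case and 0 for the opposite-sign case.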
This means that the individual sends a smaller change of pitch signal for Blade 1 because the acceleration signal indicates that the tower is already traveling toward the mean position.

Discussion and Limitations

The best overall individual was able to identify the two methods of load reduction described previously. The IPC for Blade 2 acts as a nonlinear proportional controller [Eq. (4)] to a sigmoidlike platform pitch signal, and the IPC for Blade 1 uses a combination of platform pitch and nacelle acceleration to minimize pitch and predict future changes in pitch. These two controller design strategies can be starting blocks for the design of more advanced IPC algorithms that can effectively control the complex behavior of FOWTs. The flexibility of the proposed method (i.e., it can mix and match different mathematical operations and sensor data via EA) allows the design of complex novel controllers like Eqs. (3a)–(5). Although reinforcement learning (RL) has shown effectiveness for the wind turbine control problem (Chen et al. 2020; Qin et al. 2021; Sierra-García and Santos 2020), RL methods typically have a finite set of actions, which does not allow the same level of flexibility to design such complex controllers. Therefore, the proposed method is better suited to problems in which novel complex controllers are required.

Also, the best individual was likely created by the MLC algorithm as a result of the selected cost function weighing heavily on reducing platform pitch movement. This heavily reduces the loading of the platform and slightly reduces the loading on the tower, but it results in small increases in loading on the other turbine components.
Additionally, the controller resulted in increased overall loading on the turbine in DLCs 5 and 9, even though the loading on the platform was still reduced significantly over the baseline.

Although the best individual yields the lowest cost function value under the ATLAS competition environment, it may not be the best controller in a practical scenario. For example, many modern IPC designs, such as Coleman transform–based IPC, focus on reducing the 1P (once per revolution, 0.20 Hz for this 5-MW turbine at operating speed) load frequencies, which this controller does not do (Fig. 8). In this context, future work could adjust the weight factors of the cost function to focus on other turbine components and investigate potential ways to minimize loading on these components (blades, hub, and nacelle). Other design considerations, such as balancing the magnitudes of the IPC commands, accounting for sensor noise, and ensuring safety in the event of an actuator failure, can also be incorporated in a more advanced cost function.

Additionally, solutions to the limitations involving internal state feedback and integral stability should be implemented in future work. Requiring that states be dependent on at least one sensor would ensure that constant state values do not slip past the stability filter. Also, state equation trees should be created only when an individual calls for an internal state to be added to the equation. Such a function would prevent a large untested tree from being implemented suddenly into the control equation.

Improvements could be made to the systems of DLC selection and evolution to further improve robustness and reliability.
In particular, options such as additional rounds of evaluation on the champion and elite individuals, the introduction of randomly selected load cases, and additional benefits for champions as parents in evolution could improve the robustness of the top individuals across multiple DLCs.

Conclusion

In this paper, an MLC method was proposed to address complex nonlinear dynamics involving FOWTs to improve their cost-effectiveness. Specifically, the proposed method was used to efficiently evaluate control law candidates in parallel and selectively evolve promising candidates through GP with the concept of champion individuals and DLC selection. By utilizing simulated sensor data rather than an analytical model of the system, the proposed method can overcome a key limitation of existing methods, namely their dependence on linearization of the target system.

The feasibility of the proposed method was demonstrated under the scenario provided by the ATLAS competition (ARPA-E 2019b). Specifically, the developed IPC controller showed a 41% improvement over the baseline CPC controller in terms of the cost function (i.e., maintaining AEP while reducing fatigue and ultimate loads). In addition, the interpretability of the results allowed the derivation of knowledge about the dynamics of the target system. Specifically, it was observed that the Blade 2 pitch controller of the best individual behaves similarly to a nonlinear proportional controller to a sigmoidlike platform pitch signal. This controller design can be a starting point for the design of more advanced IPC algorithms for FOWTs.

The proposed methodology makes several meaningful contributions to both the control of FOWTs and general MLC of complex systems with multiple load cases. First, the proposed method can provide an effective IPC controller to handle the complex nonlinear dynamics involving FOWTs.
This can improve the cost-effectiveness of FOWTs and thus contribute to the broader adoption of FOWTs to utilize emission-free offshore wind energy. Because the method can develop an effective controller conforming to any complex nonlinear cost function from scratch, customizing the cost function to incorporate detailed design considerations (e.g., balancing blade pitch commands and limiting loading to sensitive components) allows the application of the proposed method in practical scenarios.

Second, the proposed method is a meaningful case study of utilizing MLC to address a practical control problem based on measured or simulated data from multiple specified load cases. DLC selection, champion individuals, and PI control were added to the existing MLC framework to accommodate the multiple DLCs that needed to be optimized. The adoption of MLC makes the proposed method adaptive to different configurations of FOWTs given an appropriate amount of data. Lastly, the proposed method can be applied to different application domains involving complex nonlinear dynamics and nonlinear cost functions, such as UAV control.

Data Availability Statement

The MATLAB code and instructions to replicate the proposed method are publicly available at https://github.com/NEU-ABLE-LAB/ATLAS_Offshore. Some or all data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors thank US ARPA-E program director Mario Garcia-Sanz and the organizers of the ATLAS competition for providing this challenge and the simulation environment. The authors also acknowledge the support of the Experiential AI Postdoc Fellowship program from Northeastern University and Roux AI Institute.

References

ARPA-E (Advanced Research Projects Agency–Energy). 2019a. Aerodynamic turbines lighter and afloat with nautical technologies and integrated servo-control (ATLANTIS). Washington, DC: US Department of Energy.

Duriez, T., S. L. Brunton, and B. R. Noack. 2017. Machine learning control: Taming nonlinear dynamics and turbulence. Cham, Switzerland: Springer.

IEC (International Electrotechnical Commission). 2019. Wind energy generation systems: Part 1: Design requirements. IEC 61400-1. Geneva: IEC.

Jonkman, J. 2010. Definition of a floating system for Phase IV of OC3. Golden, CO: National Renewable Energy Laboratory.

Jonkman, J., S. Butterfield, W. Musial, and G. Scott. 2009. Definition of a 5-MW reference wind turbine for offshore system development. Golden, CO: National Renewable Energy Laboratory.

Kane, M. B. 2020. “Machine learning control for floating offshore wind turbine individual blade pitch control.” In Proc., 2020 American Control Conf. (ACC), 237–241. New York: IEEE.

Li, L., and X. Zhao. 2019. “Application of machine learning in optimized distribution of dampers for structural vibration control.” Earthquakes Struct. 16 (6): 679–690. https://doi.org/10.12989/eas.2019.16.6.679.

Lio, W. H., B. L. Jones, Q. Lu, and J. A. Rossiter. 2017. “Fundamental performance similarities between individual pitch control strategies for wind turbines.” Int. J. Control 90 (1): 37–52. https://doi.org/10.1080/00207179.2015.1078912.

Lu, Q., R. Bowyer, and B. L. Jones. 2015. “Analysis and design of Coleman transform-based individual pitch controllers for wind-turbine load reduction.” Wind Energy 18 (8): 1451–1468. https://doi.org/10.1002/we.1769.

Mirzaei, M., M. Soltani, N. K. Poulsen, and H. H. Niemann. 2013. “An MPC approach to individual pitch control of wind turbines using uncertain lidar measurements.” In Proc., 2013 European Control Conf. (ECC), 490–495. New York: IEEE.

Plumley, C., W. Leithead, P. Jamieson, E. Bossanyi, and M. Graham. 2014. “Comparison of individual pitch and smart rotor control strategies for load reduction.” J. Phys. Conf. Ser. 524: 012054. https://doi.org/10.1088/1742-6596/524/1/012054.

Qin, S., Y. Liu, Z. Liu, and M. Sun. 2021. “Data-based reinforcement learning with application to wind turbine pitch control.” In Proc., 2021 6th Int. Conf. on Power and Renewable Energy (ICPRE), 538–542. Piscataway, NJ: IEEE PES.

Selvam, K., S. Kanev, J. W. van Wingerden, T. van Engelen, and M. Verhaegen. 2009. “Feedback–feedforward individual pitch control for wind turbine load reduction.” Int. J. Robust Nonlinear Control 19 (1): 72–91. https://doi.org/10.1002/rnc.1324.

Seyedzadeh, S., F. P. Rahimian, I. Glesk, and M. Roper. 2018. “Machine learning for estimation of building energy consumption and performance: A review.” Visualization Eng. 6 (1): 1–20. https://doi.org/10.1186/s40327-018-0064-7.

Sierra-García, J. E., and M. Santos. 2020. “Exploring reward strategies for wind turbine pitch control by reinforcement learning.” Appl. Sci. 10 (21): 7462. https://doi.org/10.3390/app10217462.

Tuncali, C. E., G. Fainekos, H. Ito, and J. Kapinski. 2018. “Simulation-based adversarial test generation for autonomous vehicles with machine learning components.” In Proc., 2018 IEEE Intelligent Vehicles Symp., 1555–1562. New York: IEEE.

Wheeler, L. H., and M. Garcia-Sanz. 2017. “Wind turbine collective and individual pitch control using quantitative feedback theory.” In Vol. 58271 of Proc., ASME 2017 Dynamic Systems and Control Conf., V001T25A005. New York: ASME.

Willis, M., H. Hiden, P. Marenbach, B. McKay, and G. Montague. 1997. “Genetic programming: An introduction and survey of applications.” In Proc., 2nd Int. Conf. on Genetic Algorithms in Engineering Systems: Innovations and Applications, 314–319. London, UK: Institution of Engineering and Technology.