Machine Learning in Energy

Fayadhoi Ibrahima
December 6, 2015

Submitted as coursework for PH240, Stanford University, Fall 2015


Fig. 1: A wind farm in Southern California. (Source: Wikimedia Commons)

The energy business is at the center of human societies and powers advances in technology and overall human well-being. However, with the global population steadily increasing and projected to reach almost 10 billion by 2050, energy supply has to keep pace with demand. [1] Consequently, decisions about resources and their management have become critical: handled poorly, they can have a huge economic impact or cause energy shortages.

We previously saw that these concerns motivate uncertainty quantification in energy. [2] Uncertainty quantification describes the impact of unknown data on model predictions. However, accurate models of physical phenomena can be complex to formulate or extremely expensive to solve. Worse, these physical models are seldom fully known or understood. This inability to model theoretically the relationship between input parameters and model responses motivates the use of data (e.g. measurements, surveys) to infer this relationship and produce data-driven predictions. We refer to this alternative methodology as statistical learning, or machine learning. [3]

Why Machine Learning in Energy?

Fig. 2: Neural network of a mouse. (Source: Wikimedia Commons)

The spread of data collectors and sensors in the utilities industry has produced an enormous amount of data on energy consumption. Large amounts of data can help us understand, model and predict physical behaviors and human impacts on energy resources, especially when a physical model is unavailable or incomplete. In particular, the data mining and machine learning communities have found a great opportunity in applying big data techniques to the energy business. This interest in data-driven predictions in renewable energy is due in particular to the fact that energy-related data are now more easily and broadly available to the public. For instance, the US government has made available hundreds of energy-related datasets, thus encouraging research and applications in data-driven predictions.

Machine learning has proven to be a useful tool for efficiently monitoring and regulating household energy consumption. For example, smart thermostats can learn from users' habits and optimize the temperature in their homes for efficient energy consumption. [4] In this article, we will instead focus on potential industrial applications of machine learning and summarize why these techniques are promising for optimizing energy production.

Three Examples

Let us illustrate how machine learning has been recently used in the wind, solar and hydrocarbon sectors:

Fig. 3: Representation of a jth artificial neural network used in machine learning. (Source: F. Ibrahima)

Wind: The performance of a wind farm (see Fig. 1 for an example) depends on the total power received by the wind turbines. Estimating this power and the power curve (i.e. the power response as a function of wind speed) is a nontrivial task because the curve is nonlinear and bounded (it has an "S" shape) and depends on numerous factors such as the site location and seasonal variations. Using a data-driven approach, and in particular Artificial Neural Networks (ANNs), Marvuglia and Messineo built an equivalent model of a wind farm under normal operating conditions. This model can then be used for on-line monitoring of the power generation process, for power forecasts, and for anomaly detection (if some measurements deviate significantly from the predicted behavior, some attention might be needed for efficient energy recovery). [5] ANNs are a popular technique in machine learning. Roughly inspired by the propagation of information in actual neural networks (see illustration in Fig. 2), they iteratively transform the input data through weighted sums and nonlinear transfer functions and then apply an activation function to make a prediction (see Fig. 3). ANNs have recently gained traction thanks to the rise of high-performance computing (in particular parallel computing), which makes them affordable to compute.
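To make the structure of an ANN concrete, here is a minimal sketch of the forward pass described above, followed by the kind of anomaly check mentioned for monitoring. It is not the model of Marvuglia and Messineo: the network size, the weights in LAYERS, the sigmoid activation and the tolerance are all hypothetical values chosen by hand so that the output has an S shape, and nothing here is fitted to data.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs, then a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, layers):
    """Propagate inputs through the layers; each layer is a list of (weights, bias)."""
    activations = list(inputs)
    for layer in layers:
        activations = [neuron(activations, w, b) for w, b in layer]
    return activations

# Hypothetical, hand-picked weights (NOT fitted to any data):
# 1 input (wind speed in m/s), 2 hidden neurons, 1 output neuron that
# returns power as a fraction of rated power.
LAYERS = [
    [([0.8], -6.0), ([0.5], -5.0)],  # hidden layer
    [([4.0, 3.0], -3.0)],            # output layer
]

def predicted_power(wind_speed):
    """Model output for a given wind speed (fraction of rated power)."""
    return forward([wind_speed], LAYERS)[0]

def is_anomalous(wind_speed, measured_power, tolerance=0.15):
    """Flag a measurement that deviates from the model by more than `tolerance`."""
    return abs(measured_power - predicted_power(wind_speed)) > tolerance

for v in (2, 6, 10, 14, 18):
    print(f"wind {v:2d} m/s: predicted power fraction {predicted_power(v):.3f}")
print("low output at 14 m/s flagged:", is_anomalous(14, 0.2))
```

In a real application the weights would be learned from measured wind speed and power data, and the tolerance would be chosen from the spread of the residuals under normal operating conditions.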

Solar: Machine learning techniques have also recently been used to increase the accuracy of solar forecasts in order to optimize energy production by solar power stations (see Fig. 4). Indeed, IBM, together with collaborators, has developed a Self-learning weather Model and renewable forecasting Technology (SMT). This technology uses machine learning, big data and analytics to consistently improve the accuracy of solar forecasting by over 30% compared to currently available models. The researchers use machine learning-based model blending that takes into account multiple meteorological models. The idea is the following. Multiple physics-based numerical weather prediction models exist, including the North American Mesoscale Model (NAM), Rapid Refresh (RAP) and High-Resolution Rapid Refresh (HRRR). These models have different resolutions, and each is more accurate in different weather situations. Interestingly, their errors depend on the input parameters in different ways, which can be analyzed using functional analysis of variance (FANOVA). Because it is too complicated to divide the space of input parameters manually, a machine learning approach is used to blend the physical models efficiently (in the study, the researchers show results using random forests, which perform classification or regression using a multitude of decision trees). [6]
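The blending idea can be illustrated with a toy sketch. This is not the SMT system: the two "weather models", their error behavior, the cloud-cover feature and all numeric values are synthetic assumptions, and the learner is a small hand-written ensemble of bootstrapped regression trees with random feature subsets, a simplified stand-in for a random forest. Here the ensemble learns a situation-dependent weight between two forecasts, which is one possible way to set up blending and not necessarily the one used in the study.

```python
import random

def grow(rows, ys, depth, rng):
    """Grow a regression tree; each split considers a random subset of features."""
    leaf = sum(ys) / len(ys)
    if depth == 0 or len(rows) < 10:
        return leaf
    best = None
    for f in rng.sample(range(len(rows[0])), 2):       # random feature subset
        for t in rng.sample([r[f] for r in rows], 8):  # a few candidate thresholds
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t)
    if best is None:
        return leaf
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            grow([rows[i] for i in li], [ys[i] for i in li], depth - 1, rng),
            grow([rows[i] for i in ri], [ys[i] for i in ri], depth - 1, rng))

def predict(tree, row):
    """Walk down one tree until a leaf value is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(rows, ys, n_trees=25, depth=3, seed=0):
    """Train each tree on a bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow([rows[i] for i in idx], [ys[i] for i in idx], depth, rng))
    return forest

def forest_predict(forest, row):
    """Average the predictions of all trees."""
    return sum(predict(tree, row) for tree in forest) / len(forest)

def make_data(n, rng):
    """Synthetic data: model A is biased under cloud, model B under clear sky."""
    data = []
    for _ in range(n):
        cloud = rng.random()
        truth = 1000.0 * (1.0 - 0.7 * cloud)  # made-up irradiance, W/m^2
        fa = truth + (150.0 if cloud >= 0.5 else 0.0) + rng.gauss(0, 20)
        fb = truth + (0.0 if cloud >= 0.5 else -150.0) + rng.gauss(0, 20)
        data.append(((fa, fb, cloud), truth))
    return data

def ideal_weight(fa, fb, truth):
    """Weight w in [0, 1] for which w*fa + (1-w)*fb is closest to the truth."""
    if abs(fa - fb) < 1e-9:
        return 0.5
    return min(1.0, max(0.0, (truth - fb) / (fa - fb)))

rng = random.Random(42)
train_set, test_set = make_data(300, rng), make_data(200, rng)
forest = fit_forest([x for x, _ in train_set],
                    [ideal_weight(x[0], x[1], y) for x, y in train_set])

def blend(x):
    """Blend the two forecasts with the weight predicted for this situation."""
    w = forest_predict(forest, x)
    return w * x[0] + (1.0 - w) * x[1]

def rmse(forecast):
    """Root-mean-square error of a forecast function on the held-out set."""
    return (sum((forecast(x) - y) ** 2 for x, y in test_set) / len(test_set)) ** 0.5

rmse_a, rmse_b, rmse_blend = rmse(lambda x: x[0]), rmse(lambda x: x[1]), rmse(blend)
print(f"RMSE model A: {rmse_a:.1f}  model B: {rmse_b:.1f}  blend: {rmse_blend:.1f}")
```

The point of the sketch is only that the learner discovers from data which model to trust in which situation, without anyone dividing the input space by hand.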

Fig. 4: Limanskaya solar power station in Ukraine. (Source: Wikimedia Commons)

Oil Reservoirs: Machine learning techniques can also be used to optimize oil production under uncertainty. To optimize production in an oil field, operators constantly need to decide which controls to apply to the wells (well pressures, liquid rates, etc.). Given a set of controls, oil production can be estimated by solving a computationally expensive physics-based model. Schematically, the quantity to optimize is the net present value (NPV), which represents the value of recovered oil minus the cost of injected fluids over time. In addition, because of uncertainties in the subsurface, multiple possible realizations of the reservoir input parameters should be considered for a more robust prediction. All in all, the problem is to maximize the expected NPV given the reservoir model by optimizing the well controls. However, for each realization, hundreds of expensive reservoir simulations might be needed to converge to a solution. For this optimization problem, machine learning techniques can be used at two levels to reduce the computational cost. First, to estimate the expected NPV, only a representative subset of realizations is simulated, selected by clustering techniques (e.g. kernel k-medoids clustering). Second, to reduce the cost of a single reservoir simulation, a proxy is built using either Support Vector Regression (SVR), which is essentially a linear regression in a high-dimensional feature space made tractable by the kernel trick, or ANNs. The use of ANNs for oil production optimization started a decade ago. [7]
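The first of these two steps, estimating the expected NPV from a few representative realizations, can be sketched as follows. Everything in the sketch is an assumption made for illustration: toy_simulator is a hypothetical stand-in for an expensive reservoir simulation, each realization is reduced to a single scalar (a permeability multiplier), prices and rates are made up, and plain k-medoids with absolute distance replaces the kernel k-medoids mentioned above.

```python
import random

def npv(oil_rates, injection_rates, oil_price=50.0, injection_cost=5.0, discount=0.10):
    """Discounted value of produced oil minus cost of injected fluid (one entry per year)."""
    return sum((oil_price * q_o - injection_cost * q_i) / (1.0 + discount) ** (t + 1)
               for t, (q_o, q_i) in enumerate(zip(oil_rates, injection_rates)))

def toy_simulator(perm, control, years=5):
    """Hypothetical stand-in for a reservoir simulation: returns the NPV."""
    injection = [control] * years
    # Production saturates with the injection rate and declines over time.
    oil = [perm * 100.0 * control / (control + 100.0) * 0.8 ** t for t in range(years)]
    return npv(oil, injection)

def assign(points, medoids):
    """Attach every point to its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for p in points:
        clusters[min(medoids, key=lambda m: abs(p - m))].append(p)
    return clusters

def k_medoids(points, k, rng, iterations=20):
    """Plain k-medoids on scalars: alternate assignment and medoid update."""
    medoids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = assign(points, medoids)
        updated = [min(c, key=lambda cand: sum(abs(cand - q) for q in c))
                   for c in clusters.values()]
        if set(updated) == set(medoids):
            break
        medoids = updated
    return assign(points, medoids)

def expected_npv(control, perms, weights):
    """Weighted average of the NPV over the given realizations."""
    return sum(w * toy_simulator(p, control) for p, w in zip(perms, weights))

rng = random.Random(0)
realizations = [rng.uniform(0.5, 1.5) for _ in range(100)]  # permeability multipliers
clusters = k_medoids(realizations, k=5, rng=rng)
medoids = list(clusters)
weights = [len(clusters[m]) / len(realizations) for m in medoids]  # cluster sizes

controls = list(range(50, 501, 50))  # candidate injection rates
uniform = [1.0 / len(realizations)] * len(realizations)
for c in controls:
    full = expected_npv(c, realizations, uniform)  # 100 simulations
    reduced = expected_npv(c, medoids, weights)    # 5 simulations
    print(f"control {c:3d}: full {full:9.1f}  reduced {reduced:9.1f}")
```

Each medoid is weighted by the size of its cluster, so the reduced estimate needs 5 simulations per candidate control instead of 100. The second step, replacing the simulator itself with an SVR or ANN proxy, is not shown here.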


Through three examples, we illustrated how machine learning is a promising tool that can help optimize energy production and consumption. On the one hand, we are now building hardware that can collect data, integrate them, make predictions, inform users and customers, and make adjustments. Indeed, as part of the Internet of Things, smart sensors and smart meters are on the rise. They usually lead to greater energy savings than their conventional counterparts. [4] On the other hand, we are using similar techniques and concepts to optimize energy production at the industrial level. These advances are made possible by the tremendous amount of energy-related data available nowadays. These data carry statistically significant information about how sources of energy are distributed, produced and consumed; machine learning techniques can help unveil the hidden laws of practical energy management and complement models from physics.

© Fayadhoi Ibrahima. The author grants permission to copy, distribute and display this work in unaltered form, with attribution to the author, for noncommercial purposes only. All other rights, including commercial rights, are reserved to the author.


[1] "World Population Prospects, The 2015 Revision: Key Findings and Advance Tables," United Nations, Working Paper No. ESA/P/WP.241, 2015.

[2] F. Ibrahima, "Uncertainty Quantification in Energy," Physics 240, Stanford University, Fall 2015.

[3] G. James et al., An Introduction to Statistical Learning (Springer, 2013).

[4] J. Xu, "Energy Savings Enabled By Smart Devices," Physics 240, Stanford University, Fall 2015.

[5] A. Marvuglia and A. Messineo, "Monitoring of Wind Farms' Power Curves Using Machine Learning Techniques," Appl. Energy 98, 574 (2012).

[6] J. Zhang et al., "Baseline and Target Values for Regional and Point PV Power Forecast: Toward Improved Solar Forecasting," Sol. Energy 122, 804 (2015).

[7] G. Zangl, T. Graf and A. Al-Kinani, "Proxy Modeling in Production Optimization," OnePetro SPE-100131, 12 Jun 06.