Data_Sheet_1_Advancing Agricultural Production With Machine Learning Analytics: Yield Determinants for California’s Almond Orchards.pdf
Datasets usually provide raw data for analysis. This raw data often comes in spreadsheet form, but can be any collection of data, on which analysis can be performed.
Agricultural productivity is subject to various stressors, including abiotic and biotic threats, many of which are exacerbated by a changing climate, thereby affecting long-term sustainability. The productivity of tree crops such as almond orchards, is particularly complex. To understand and mitigate these threats requires a collection of multi-layer large data sets, and advanced analytics is also critical to integrate these highly heterogeneous datasets to generate insights about the key constraints on the yields at tree and field scales. Here we used a machine learning approach to investigate the determinants of almond yield variation in California’s almond orchards, based on a unique 10-year dataset of field measurements of light interception and almond yield along with meteorological data. We found that overall the maximum almond yield was highly dependent on light interception, e.g., with each one percent increase in light interception resulting in an increase of 57.9 lbs/acre in the potential yield. Light interception was highest for mature sites with higher long term mean spring incoming solar radiation (SRAD), and lowest for younger orchards when March maximum temperature was lower than 19°C. However, at any given level of light interception, actual yield often falls significantly below full yield potential, driven mostly by tree age, temperature profiles in June and winter, summer mean daily maximum vapor pressure deficit (VPDmax), and SRAD. Utilizing a full random forest model, 82% (±1%) of yield variation could be explained when using a sixfold cross validation, with a RMSE of 480 ± 9 lbs/acre. When excluding light interception from the predictors, overall orchard characteristics (such as age, location, and tree density) and inclusive meteorological variables could still explain 78% of yield variation. The model analysis also showed that warmer winter conditions often limited mature orchards from reaching maximum yield potential and summer VPDmax beyond 40 hPa significantly limited the yield. Our findings through the machine learning approach improved our understanding of the complex interaction between climate, canopy light interception, and almond nut production, and demonstrated a relatively robust predictability of almond yield. This will ultimately benefit data-driven climate adaptation and orchard nutrient management approaches.
Read the peer-reviewed publication