A Data-driven Approach for Mapping Grasslands at a Regional Scale
The goal of this research was to use a data-driven approach to develop a regional-scale grassland mapping protocol, with three objectives. First, identify and characterize the spatial distribution of grassland types and land use across Kansas, as well as the static or dynamic nature of grasslands over time, using multi-year U.S. Department of Agriculture (USDA) Farm Service Agency (FSA) 578 data. Second, evaluate the spectral separability of four hierarchies of grassland types and land use using FSA 578 data, multi-seasonal Landsat 8 spectral bands, Landsat 8 Normalized Difference Vegetation Index (NDVI) data, and Moderate Resolution Imaging Spectroradiometer (MODIS) NDVI time series. Third, determine the optimal data combination, and the appropriate thematic resolution, for mapping grassland type by evaluating the modeling performance of the Random Forest (RF) classifier. A county-level analysis of the multi-year FSA 578 data found that the data were not all-inclusive of total grasslands across Kansas, but were sufficient to illustrate regional trends in grassland type, land use, and field size. Eastern Kansas was found to be more diverse in grassland type and more variable in land use, and to contain a large number of smaller fields. Conversely, western Kansas consisted of larger fields that were primarily grazed native grasslands and land enrolled in the Conservation Reserve Program (CRP). These results indicate a more complex grassland landscape to map in eastern Kansas, while also providing guidance for training sample distributions for image classification. Jeffries-Matusita (JM) distance statistics were calculated for three-date multispectral Landsat 8, three-date Landsat 8 NDVI, and 23-period, 16-day composite Terra MODIS NDVI time series. The results indicate that combining the three datasets maximized the spectral separability of grassland types across all four grassland-type hierarchies.
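As an illustration of the separability statistic named above, the following is a minimal sketch of a JM distance computed for a single band (or a single NDVI period), assuming each class is normally distributed in that band and using the common remote sensing form that ranges from 0 (identical) to 2 (fully separable). The function names and the NDVI samples are hypothetical and are not taken from the study, which may have used a multivariate formulation.

```python
import math
from statistics import mean, variance


def bhattacharyya_1d(a, b):
    """Bhattacharyya distance between two samples, assuming each is
    drawn from a univariate normal distribution (an assumption)."""
    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)  # sample variances
    pooled = (v1 + v2) / 2.0
    # Term driven by the difference in class means.
    mean_term = (m1 - m2) ** 2 / (8.0 * pooled)
    # Term driven by the difference in class variances.
    var_term = 0.5 * math.log(pooled / math.sqrt(v1 * v2))
    return mean_term + var_term


def jm_distance(a, b):
    """Jeffries-Matusita distance in the 0-to-2 form: 2 * (1 - exp(-B))."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_1d(a, b)))


if __name__ == "__main__":
    # Hypothetical spring NDVI samples, for illustration only.
    brome = [0.61, 0.64, 0.66, 0.63, 0.65]
    fescue = [0.62, 0.65, 0.64, 0.66, 0.63]
    native = [0.38, 0.41, 0.35, 0.40, 0.37]
    print("brome vs fescue:", round(jm_distance(brome, fescue), 3))
    print("brome vs native:", round(jm_distance(brome, native), 3))
```

Computing the statistic one band or one period at a time, as in this sketch, is what allows the contribution of individual seasons to be compared.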
A comparison of the three datasets showed that multispectral Landsat 8 data had the highest JM distance statistics, indicating the greatest separability. JM distance statistics calculated by band and by period consistently showed that information from spring and fall was more important than information from summer for separating grassland types. The results showed lower separability for land-use classes within a grassland type than between grassland types. The spectral separability of pairwise comparisons incorporating land use between grassland types varied, indicating that land use does affect spectral separability in some instances. On the other hand, JM distance statistics did not drop substantially when more refined grassland types were aggregated to coarser grassland-type classes (e.g., Level-1: cool- and warm-season), indicating that land use does not negatively affect the spectral separability of functional grassland types. The results indicate low spectral separability between brome and fescue but moderate to high separability between native and CRP, suggesting the use of a Level-1 or Level-2 thematic classification scheme for the study area. Finally, RF models were constructed and evaluated using 2015 FSA 578 data and four remotely sensed datasets in two adjacent Landsat scenes (path/rows). Models were created for each of the four grassland hierarchies. The results showed that out-of-bag (OOB) error increased with grassland hierarchy complexity (the number of thematic classes) and that OOB error was lowest for the combined remotely sensed dataset. Mapping CRP as a separate grassland type resulted in low producer’s accuracy levels, with CRP largely mapped as warm-season grasslands, suggesting the Level-1 classification scheme was appropriate for regional mapping of grassland types. Path/rows 27/33 and 28/33 had OOB overall accuracy levels of 87% and 92%, respectively.
User’s and producer’s accuracy levels indicate that cool-season grasslands were mapped more accurately in path/row 27/33, where that class is more dominant, than in 28/33. Using test data (withheld verification data) unexpectedly increased overall accuracy levels by 4% and 6% over OOB accuracies, which may have resulted from varying data proportions between OOB and test data, suggesting the need for further evaluation.
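The accuracy measures reported above can be sketched from a confusion matrix as follows. This is a minimal illustration, assuming rows hold mapped classes and columns hold reference classes. The matrix values and function names are hypothetical and are not results from the study. The second function shows one way that differing class proportions between two evaluation sets could shift overall accuracy even when per-class producer's accuracies are unchanged, which is one possible reading of the "varying data proportions" explanation.

```python
def accuracy_report(matrix):
    """Return (overall, user's, producer's) accuracy from a square
    confusion matrix with rows = mapped class, columns = reference class."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    # User's accuracy: correct pixels divided by all pixels mapped as the class.
    users = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    # Producer's accuracy: correct pixels divided by all reference pixels of the class.
    producers = [
        matrix[i][i] / sum(matrix[r][i] for r in range(n)) for i in range(n)
    ]
    return correct / total, users, producers


def overall_from_proportions(producers, proportions):
    """Overall accuracy as the reference-proportion-weighted mean of
    per-class producer's accuracies."""
    return sum(p * w for p, w in zip(producers, proportions))


if __name__ == "__main__":
    # Hypothetical two-class matrix: [cool-season, warm-season].
    matrix = [
        [40, 5],   # mapped as cool-season
        [10, 45],  # mapped as warm-season
    ]
    overall, users, producers = accuracy_report(matrix)
    print("overall:", overall)
    print("user's:", users)
    print("producer's:", producers)
    # Same per-class accuracies, different class mix in the evaluation set.
    print("balanced mix:", overall_from_proportions(producers, [0.5, 0.5]))
    print("warm-heavy mix:", overall_from_proportions(producers, [0.2, 0.8]))
```

In this sketch the overall accuracy changes with the class mix alone, so comparing OOB and test accuracies is most informative when the class proportions of the two sets are reported alongside them.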