Assessing Spatiotemporal differences in Shrimp ponds using Remote sensing data and Machine learning algorithms

Shrimp farming and export are the main source of income for the southern coastal districts of the Mekong Delta. Monitoring shrimp ponds helps identify losses caused by natural calamities such as floods, sources of water pollution from chemicals used in shrimp farming, and changes in the cultivated area as demand for shrimp production increases. Satellite imagery, which offers consistent coverage, good spatial resolution, and frequent temporal information, is well suited to monitoring shrimp ponds remotely over large spatial extents. The shrimp ponds of Cai Doi Vam Township, Ca Mau Province, Viet Nam were mapped using DMC-3 (TripleSat) and Jilin-1 high-resolution satellite imagery for 2019 and 2022. The 3 m spatial resolution shrimp pond extent product showed an overall accuracy of 87.5%, with a producer's accuracy of 90.91% (errors of omission = 9.09%) and a user's accuracy of 90.91% (errors of commission = 9.09%) for the shrimp pond class. Of the shrimp ponds mapped in 2019, 66 ha were observed to be dry in 2022, and 39 ha of other ponds had been converted into shrimp ponds by 2022. Continuous monitoring of shrimp ponds supports sustainable aquaculture and provides crucial input for decision-makers planning interventions.


Introduction
Over the past three decades, aquaculture in Asia has produced more than 90% of the world's output and played significant roles in food security, poverty alleviation, employment, and overall economic development in many Asian countries. Aquaculture is a long-standing custom in Asia, although it emerged as a major food production industry only recently, some four or five decades ago [1].
Shrimp farming in Viet Nam has been growing discernibly since the government passed a resolution in 2000 allowing the conversion of less-productive rice land in coastal areas to aquaculture ponds [2]. In terms of shrimp exports, Viet Nam placed third in the world in 2019, with 13.6% of the market share, behind only India (15.7%) and Ecuador (14%) [3]. The shrimp production plan of the Ministry of Agriculture and Rural Development in Viet Nam predicted that the shrimp farming area would reach 750,000 ha in 2022 and that export turnover would exceed USD 4 billion, up 2.56% compared to 2021. Aquaculture, including shrimp ponds, is a large contributor to global food security and rural livelihoods, and can also help preserve sustainable coastal environments [4][5][6][7][8].
In Viet Nam, the Mekong Delta, with 12% of the country's total area, has 67% of the water bodies, including fresh and brackish water bodies other than rivers [9]. However, coastal aquaculture in Ca Mau Province now faces a rapid shift toward increased production and intensive shrimp culture, resulting in poor water quality and frequent disease outbreaks [10].
Various studies in the Mekong Delta focus on sustainable aquaculture development and on combined agriculture and aquaculture farming systems [11,12]. Many sustainability issues concern water quality as affected by chemical usage in shrimp cultivation [13][14][15][16][17]. Certain studies address climate change impacts on the agriculture and aquaculture sectors [18]. Despite these many studies based on statistics and farmer interviews, very few Earth observation (EO)-based monitoring systems have been developed for shrimp ponds [19], and most studies carried out have been related to changes in mangrove areas [20,21]. Monitoring of aquaculture has shifted to relying heavily on remote sensing technology because of its advantages in estimating the area under cultivation and in real-time monitoring of ponds [22][23][24][25][26]. Flood detection, along with the identification of wetlands and water bodies, plays a major role in aquaculture mapping [27,28]. Many studies have generated and used water-based indices [27,29,30] to identify water bodies with a variety of satellite imagery ranging from moderate (MODIS) to high (Sentinel) spatial resolution.
Previously, many aquaculture mapping studies were based on open-source satellite imagery such as Landsat; analyses based on Sentinel imagery, with more spectral bands but lower spatial resolution, were also carried out [31][32][33][34], in addition to time series analysis [35]. Recently, many water body and aquaculture mapping studies, particularly on a global scale, have been conducted using a cloud platform, especially Google Earth Engine [36][37][38][39][40][41]. Since the study area is heavily dominated by water bodies, including ponds and major and minor streams, approaches such as segmentation [42] and deep learning-based methodologies alone cannot classify the variations created by newly developing minor ponds [43][44][45]. To differentiate water bodies, very high spatial resolution data are needed. This study utilized very high-resolution data (~3 m) and adopted machine learning algorithms as well as vectorization for classifying variations in water bodies to identify shrimp ponds.
Many Asian nations have not yet fully adopted good aquaculture governance. The rapid growth in the aquaculture sector has given rise to some of the most difficult sustainability problems, including inefficient resource use, detrimental environmental effects, frequent disease outbreaks, and food safety threats, which in turn limit the sector's potential to grow sustainably in the future. Monitoring shrimp ponds using EO data could be useful in gaining a firm understanding of the current situation of the aquaculture sector to meet the increased demand for aquatic food and sustain aquaculture's much-needed expansion.
The primary objectives of this research are thus the development of a system for the regional-level mapping of coastal pond aquaculture for Cai Doi Vam Township, Phu Tan District, Ca Mau Province, based on high spatial resolution single-date satellite imagery and ground truth data, as well as the assessment of accuracy and the analysis of aquaculture areas. Phu Tan District has a total area of 46,433 ha, all of which is affected by saline intrusion, including Cai Doi Vam Subdistrict [46]. This area is highly dominated by shrimp ponds and other ponds.

Satellite Data
The detection of even minute linear structural elements in small-scale pond aquaculture structures with a size of less than 1 hectare is only achievable with high-resolution imagery. The analysis was based on optical data gathered from a high-resolution geometric (DMC3) sensor imaging with a spatial resolution  Pan-sharpening with image data fusion and image enhancements based on linear stretches were applied, and mosaicking was carried out on the georeferenced datasets. The NIR, red, and green bands were chosen for a false-color composite (FCC), as shown in Fig. 1. The bands and spatial characteristics of the satellite imagery, and how each dataset was used in the study, are shown in Table 1.
Single-date TripleSat and Jilin imagery and temporal Sentinel-2 imagery were used for the study.

Ground survey data
Domestic shrimp farmers' extensive shrimp cultivation-related data including the location of shrimp ponds were collected over Phu Tan  Settlements; (f) land cover categories (for example, trees, shrubs, grasses, water bodies, and hills); and (g) documentation of the landscape using a digital camera. The purpose of this exercise was to identify the different water body classes accurately during classification and to assess the accuracy of the final maps.

Wetlands and Land use mapping
Land use and land cover (LULC), including the shrimp pond class, were mapped using DMC3 data for 2019 and Jilin data for 2022 with the help of unsupervised classification [47,48]. An accuracy assessment was performed with validation data. Spatial analysis was used to create spatial products at a 3 m resolution that recorded changes effectively (Fig. 2).
First, unsupervised classification was carried out targeting the LULC of the study area and the potential shrimp ponds that could be identified. Satellite imagery was classified using ISO CLASS cluster K-means unsupervised classification with a convergence value of 0.99 and 20 iterations, yielding 20 classes, followed by successive generalization. These classes were identified by visual interpretation of Google Earth imagery, which also offers an opportunity to observe temporal changes in the study region. We further used various Sentinel-2-based water indices to create a water bodies mask: NDWI, MNDWI, AWEInsh, AWEIsh, and WRI were computed to identify ponds and other wetlands. This overcomes the limitation imposed by cloud cover in single-date imagery. We set a conservative threshold beyond which areas were excluded from aquaculture development. This led to the exclusion of various classes, such as built-up areas and other LULC, and in rare instances may also have excluded shrimp ponds. Where gaps arose in any class, that class was reclassified, and an initial classification map was prepared for use in the secondary supervised classification.
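The clustering step described above can be illustrated with a minimal, pure-Python sketch of K-means. It stops after 20 iterations or once 99% of pixels keep their cluster between iterations; reading the 0.99 convergence value in this way is an assumption, and the function name `kmeans_pixels` and the toy reflectance values are hypothetical rather than the software or data used in the study.

```python
import random

def kmeans_pixels(pixels, k=20, max_iter=20, convergence=0.99, seed=0):
    """Cluster pixel spectra (tuples of band values) into k classes.

    Stops after max_iter iterations, or earlier once the share of pixels
    whose cluster label did not change reaches `convergence`.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(pixels, k)]
    labels = [-1] * len(pixels)
    for _ in range(max_iter):
        new_labels = []
        for p in pixels:
            # Assign each pixel to the nearest center (squared Euclidean distance).
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            new_labels.append(dists.index(min(dists)))
        unchanged = sum(a == b for a, b in zip(labels, new_labels)) / len(pixels)
        labels = new_labels
        # Move each center to the mean of the pixels assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                centers[j] = [sum(band) / len(members) for band in zip(*members)]
        if unchanged >= convergence:
            break
    return labels, centers

# Toy example with hypothetical (green, red, NIR) reflectances:
# three dark "water" pixels followed by three bright "vegetation" pixels.
pixels = [(0.05, 0.04, 0.02), (0.06, 0.05, 0.03), (0.05, 0.05, 0.02),
          (0.08, 0.06, 0.45), (0.09, 0.05, 0.50), (0.07, 0.06, 0.48)]
labels, centers = kmeans_pixels(pixels, k=2)
```

In practice the 20 resulting clusters would still need to be labeled and merged by an analyst, as in the successive generalization step described in the text.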

Separation of streams using LULC Vectorization
Wetland and water body classes for the 2019 and 2022 cropping years were separated. Vectorization was applied to the water bodies to identify stream networks. Naturally developed stream networks and constructed aquaculture ponds can be distinguished from one another by the compactness of their geometries. A raster masked to water bodies was then used for supervised classification with training points.
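One way to express this geometric distinction is a compactness score computed for each vectorized polygon. The sketch below uses the Polsby-Popper measure, 4πA/P², which is an assumption: the text does not name the shape metric that was applied. The threshold value and the function names are likewise hypothetical.

```python
import math

def polygon_area_perimeter(coords):
    """Area (shoelace formula) and perimeter of a simple polygon.

    coords: list of (x, y) vertices in map units (e.g. metres), not closed.
    """
    area2 = 0.0
    perimeter = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2.0, perimeter

def compactness(coords):
    """Polsby-Popper compactness: 1.0 for a circle, near 0 for thin shapes."""
    area, perimeter = polygon_area_perimeter(coords)
    return 4.0 * math.pi * area / perimeter ** 2

def is_stream(coords, threshold=0.2):
    # The threshold is a hypothetical value that would need tuning on real data.
    return compactness(coords) < threshold

pond = [(0, 0), (60, 0), (60, 40), (0, 40)]        # 60 m x 40 m rectangle
channel = [(0, 0), (1000, 0), (1000, 6), (0, 6)]   # 1 km x 6 m strip
```

A rectangular pond scores far higher than a long, narrow channel, so a single threshold can separate the two in simple cases; branching stream networks would score lower still.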

Supervised classification of ponds to separate shrimp ponds
This study employed a supervised classification approach using Maximum Likelihood Classification. The training samples for other ponds and water bodies were selected from Google Earth images and field survey data. In total, 45 training samples were selected for shrimp ponds and 25 for other ponds. Each sample contained at least 70 pixels to support reliable class statistics.
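For readers unfamiliar with the method, the following is a minimal sketch of a Gaussian maximum likelihood classifier, assuming NumPy is available and equal prior probabilities for all classes. The function names, class names, and synthetic two-band training values are hypothetical and do not reproduce the study's software or data.

```python
import numpy as np

def train_mlc(samples_by_class):
    """Estimate a mean vector and covariance matrix for each class.

    samples_by_class: dict mapping class name -> array (n_pixels, n_bands).
    """
    stats = {}
    for name, samples in samples_by_class.items():
        x = np.asarray(samples, dtype=float)
        cov = np.cov(x, rowvar=False)
        stats[name] = (x.mean(axis=0), np.linalg.inv(cov),
                       np.log(np.linalg.det(cov)))
    return stats

def classify_mlc(pixels, stats):
    """Assign each pixel to the class with the highest Gaussian
    log-likelihood (equal priors assumed)."""
    x = np.asarray(pixels, dtype=float)
    names = list(stats)
    scores = []
    for name in names:
        mean, inv_cov, log_det = stats[name]
        d = x - mean
        # Squared Mahalanobis distance of every pixel to the class mean.
        mahalanobis = np.einsum("ij,jk,ik->i", d, inv_cov, d)
        scores.append(-0.5 * (log_det + mahalanobis))
    best = np.argmax(np.stack(scores), axis=0)
    return [names[i] for i in best]

# Synthetic training pixels (hypothetical two-band reflectances), 70 per class.
rng = np.random.default_rng(0)
training = {
    "shrimp_pond": rng.normal([0.10, 0.05], 0.01, size=(70, 2)),
    "other_pond": rng.normal([0.06, 0.20], 0.01, size=(70, 2)),
}
stats = train_mlc(training)
predicted = classify_mlc([[0.10, 0.05], [0.06, 0.20]], stats)
```

Because the covariance matrix must be estimated for each class, every class needs clearly more training pixels than there are bands, which is one reason for requiring a minimum pixel count per sample.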

Accuracy assessment
A total of 56 stratified, randomly distributed validation samples were used to determine the accuracy of Cai Doi Vam's final shrimp pond map and the overall accuracies [49]. The columns of the error matrix contain the ground survey data points, and the rows represent the results of the classified maps [50]. A frequently used measure is Kappa [51], which represents the agreement between the classified map and the reference ground survey data beyond that expected by chance.
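The measures derived from an error matrix laid out this way (rows = classified map, columns = reference data) can be sketched as follows. The 2 x 2 matrix is a hypothetical toy example, not the error matrix of this study, and the function name `accuracy_report` is illustrative.

```python
def accuracy_report(matrix, labels):
    """Overall, user's and producer's accuracy plus Cohen's kappa.

    matrix[i][j] = number of samples classified as labels[i] whose
    reference (ground survey) class is labels[j].
    """
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    row_tot = [sum(matrix[i]) for i in range(n)]
    col_tot = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diag = [matrix[i][i] for i in range(n)]
    overall = sum(diag) / total
    # User's accuracy = 1 - commission error (computed along rows).
    users = {labels[i]: diag[i] / row_tot[i] for i in range(n)}
    # Producer's accuracy = 1 - omission error (computed along columns).
    producers = {labels[i]: diag[i] / col_tot[i] for i in range(n)}
    # Agreement expected by chance from the row and column totals.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, users, producers, kappa

# Hypothetical toy matrix for two classes.
labels = ["shrimp_pond", "other"]
matrix = [[20, 2],
          [2, 16]]
overall, users, producers, kappa = accuracy_report(matrix, labels)
```

Omission and commission errors follow directly as one minus the producer's and user's accuracy, respectively.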

Spatial distribution of LULC
Sentinel-2-based water indices, which help overcome mixed classification or missed water pixels in single-date imagery, were used to develop binary water masks separating pond water from the surrounding land. However, shrimp ponds do not have clear boundaries in the Sentinel-2-based water indices because of the sensor's coarser spatial resolution. Indices such as MNDWI and AWEI, in particular, use the SWIR band, which is coarser still than the visible bands (Fig. 3).
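For reference, the commonly published forms of the five water indices named in this study can be written as below. This is a sketch under assumptions: the exact index variants, band choices, and thresholds applied in the study are not stated, the zero and one thresholds are only common starting points, and the reflectance values are hypothetical. For Sentinel-2 the inputs would typically be B2 (blue), B3 (green), B4 (red), B8 (NIR), B11 (SWIR1), and B12 (SWIR2).

```python
def ndwi(green, nir):
    """Normalized Difference Water Index."""
    return (green - nir) / (green + nir)

def mndwi(green, swir1):
    """Modified NDWI, which replaces NIR with SWIR."""
    return (green - swir1) / (green + swir1)

def awei_nsh(green, nir, swir1, swir2):
    """Automated Water Extraction Index, no-shadow variant."""
    return 4.0 * (green - swir1) - (0.25 * nir + 2.75 * swir2)

def awei_sh(blue, green, nir, swir1, swir2):
    """Automated Water Extraction Index, shadow variant."""
    return blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2

def wri(green, red, nir, swir1):
    """Water Ratio Index."""
    return (green + red) / (nir + swir1)

def is_water(blue, green, red, nir, swir1, swir2):
    # Conservative mask: a pixel is kept only if every index agrees.
    # Thresholds (0 and 1) are assumed defaults, not the study's values.
    return (ndwi(green, nir) > 0 and mndwi(green, swir1) > 0
            and awei_nsh(green, nir, swir1, swir2) > 0
            and awei_sh(blue, green, nir, swir1, swir2) > 0
            and wri(green, red, nir, swir1) > 1)

# Hypothetical surface reflectances for one water and one vegetation pixel.
water = dict(blue=0.06, green=0.08, red=0.05, nir=0.03, swir1=0.02, swir2=0.01)
vegetation = dict(blue=0.04, green=0.08, red=0.05, nir=0.45, swir1=0.20, swir2=0.10)
```

Because the functions use only arithmetic operators, they also work element-wise on array inputs, provided the SWIR bands have first been resampled to the grid of the other bands.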
After the initial unsupervised classification of the study area's LULC, it was observed that shrimp ponds are well connected to, or situated near, a stream network. Ponds lie very close to built-up areas, and bunds are mostly covered by vegetation. Other ponds are structures with mangroves growing inside them, or abandoned ponds. Preliminary LULC maps for both years are shown in Fig. 4. There is a clear reduction in water bodies and other LULC relative to vegetation. This is due to the growth of mangroves and flora in ponds, as well as the fact that barren land and dry soil are now covered by vegetation. The decrease in water bodies suggests a decrease in the area used for aquaculture; however, a comprehensive change detection analysis is possible only after these water bodies have been distinguished.
However, because water bodies share common characteristics and similar pixel values, the classification produced mixed classes of shrimp ponds, other ponds, and stream networks. To resolve this, the water body classes were separated after unsupervised classification using vectorization, exploiting the distinctive structure of stream networks.

Streams Vectorization
The water bodies class (raster) obtained from the initial classification was converted into vectors. Using this vectorization of water bodies, streams were identified by their structure, as shown in Fig. 5. Identifying the stream network is helpful in separating ponds and also in monitoring shrimp pond waste disposal methods and wastewater quality.

Spatial distribution of Shrimp ponds/fishponds
The shrimp pond maps were prepared at 3 m resolution for 2019 and 2022 (Figs. 6 & 7). Shrimp Ponds, Other Ponds, Streams, and Other LULC were delineated. Targeted classes such as Shrimp Ponds, Other Ponds, and Streams were classified with better accuracy than the other classes because of the spectral resolution of the imagery, which is a tradeoff when classifying very high-resolution imagery. Built-up and other LULC are presented as a single class in the final maps. Other LULC classes include barren land and pond embankments in 2019, whereas in 2022 these are covered by vegetation. The stream network consists only of the major water supply channels and does not include the micro channels feeding individual ponds. Changes in shrimp pond structure are clearly visible within the enlarged area: larger ponds were converted into a number of smaller ponds (in the enlarged view, two long ponds were converted into eight smaller ponds), and the total number of shrimp ponds increased.
The majority of shrimp ponds lie along the Cai Doi Canal. A more thorough view of each class's change can be seen in the final maps of shrimp ponds (Fig. 6 and Fig. 7).

Accuracy assessment
The shrimp pond classification for 2019 achieved an overall accuracy of 87%, with 90% accuracy for the targeted shrimp pond class. Dried clay soil and the bunds around ponds were classified as built-up areas. An error matrix (Table 3) was generated for Cai Doi Vam, providing producer's, user's, and overall accuracies. Table 3 shows the accuracy assessment of the 2022 shrimp pond classification, using the same 56 sample points; the overall classification accuracy was 89.29%. An overall accuracy of 89% was obtained because the cloud cover fell mostly on other LULC, and the vegetation area was mixed with cloud cover. The targeted shrimp ponds nevertheless had more than 90% accuracy, with the misclassified samples assigned to other ponds.

Change detection in shrimp ponds
TripleSat and Jilin data helped identify shrimp ponds with up to 90% accuracy, but achieving this accuracy requires low cloud cover. Change detection between the two years showed that 639 ha of shrimp ponds were left dry in 2022 across the entire Phu Tan District, of which 66 ha were in Cai Doi Vam Subdistrict (Fig. 8). Further, 39 ha of other ponds in 2019 had been converted into shrimp ponds by 2022.
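Area figures of this kind follow from cross-tabulating two co-registered class maps and converting pixel counts to hectares. The sketch below assumes both maps share the 3 m grid stated in the text; the class names, the function name `change_matrix_ha`, and the six-pixel toy maps are hypothetical.

```python
from collections import Counter

PIXEL_SIZE_M = 3.0                            # spatial resolution stated in the text
HA_PER_PIXEL = PIXEL_SIZE_M ** 2 / 10_000.0   # 9 m^2 per pixel = 0.0009 ha

def change_matrix_ha(classes_t1, classes_t2):
    """Cross-tabulate two co-registered class maps (flattened to lists)
    and return the area of each (from, to) transition in hectares."""
    counts = Counter(zip(classes_t1, classes_t2))
    return {pair: n * HA_PER_PIXEL for pair, n in counts.items()}

# Hypothetical six-pixel maps for the two dates.
map_2019 = ["shrimp", "shrimp", "shrimp", "shrimp", "other_pond", "other_pond"]
map_2022 = ["shrimp", "shrimp", "dry", "dry", "shrimp", "other_pond"]
areas = change_matrix_ha(map_2019, map_2022)
```

At this pixel size, an area of 66 ha corresponds to roughly 73,000 pixels, so even small misregistration between the two dates can noticeably affect the transition totals.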
Aquaculture has tremendous potential to aid in global food security, making the prediction of pond production volumes a crucial research goal for the coming years. Quantitative data on aquaculture production at the national and subnational levels are accessible but typically lack precision. In this study, we compiled annual statistics for two years on aquaculture productivity, focusing mostly on shrimp ponds.

Discussion on monitoring shrimp ponds using RS technology
It is challenging to identify individual ponds from the mixed pixels in 30 m or 10 m spatial resolution satellite images, which is why this study focuses on extracting aquaculture ponds from high spatial resolution imagery [44,52,53]. It is also challenging for conventional classification techniques or a single spectral index to classify aquaculture ponds effectively, because these ponds are themselves water bodies, split by embankments into a large network of smaller ponds [54,55]. By combining spectral data, spatial features, and structural features, we were able to build a system that allows smaller aquaculture ponds to be extracted at a regional scale. Water bodies are typically extracted by constructing spectral indices, of which there are many. The best known are NDWI, MNDWI, WRI, and AWEI (including AWEIsh and AWEInsh) [56]. Shrimp ponds mapped using single-date imagery for 2019 and 2022 provide valuable information on changes in shrimp cultivation. We observed an overall decrease in the area under shrimp ponds between these two years. Using remote sensing to monitor changes in the size and number of ponds, together with frequent mapping using EO data, provides reliable data for predicting the production volume of an area ahead of time. Such information is valuable for buyers and processors, who can adjust their export volumes or arrange other procurement channels to cover the loss when a decrease in production is predicted. It is also helpful in policy making toward sustainable production in future years [57]. For example, the government or international organizations can monitor whether coastal areas have been developed illegally for aquaculture and, if so, to what extent.
We also note that change detection studies with this imagery have limitations: DMC3/TripleSat imagery had low cloud coverage, but its spatial resolution is slightly lower than that of Jilin. Further, the overall accuracy for Jilin-1 was about 2 percentage points higher than for DMC3/TripleSat because of its comparatively finer spatial resolution. This high resolution enabled us to extract even smaller shrimp ponds very clearly, compared to what is possible with Sentinel imagery.

Conclusion
Because Jilin and DMC3 satellite data are optical, they are vulnerable to cloud cover, which hampers data availability on cloudy days, especially during the rainy season. Cloud cover also results in more mixed classes. Time-series imagery can help enhance classification accuracy (in classifying vegetation and water bodies). The categorization may also benefit from field-based observations (particularly notes that help distinguish shrimp ponds from other ponds). There is a high possibility of mixed classes when differentiating water bodies because the area is dominated by water. Some water bodies contain vegetation or surface waves, which change reflectance values and lead to misclassification. TripleSat and Jilin were not helpful for built-up and settlement classification because they lack a SWIR band.