Parallel strategy and scale effects of the D8 watershed flow accumulation algorithm under the CUDA architecture

    Parallel strategy of the D8 algorithm under the CUDA framework and its evaluation

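    • Abstract: Flow direction and flow accumulation algorithms underpin hillslope hydrological and hydrodynamic simulation. Implementing a parallel D8 flow accumulation algorithm under the CUDA architecture can substantially speed up simulation, so the parallelization strategy becomes the focus of study for memory access and conflict handling. This paper uses the atomic add function of the CUDA architecture to optimize the parallel strategy of the D8 algorithm, takes the upper reaches, the upper-middle reaches, and the full basin of the Ganjiang River as study areas of different spatial scales, and evaluates drainage network extraction accuracy, parallel speedup, and scale effects at each scale. The results show that: the parallel D8 strategy and the classical algorithm extract nearly identical drainage networks, with relative errors in stream length, basin area, and drainage density below 0.3%; the parallel D8 computation under CUDA takes less time than the serial computation in ArcGIS, which in turn takes less time than the serial computation in Matlab, and the speedup ratio is proportional to the number of thread blocks and grids; when the number of threads is ≤128 and >128, the optimal speedup occurs with grid numbers below 1024 and above 65536, respectively; the speedup ratio decreases as spatial scale increases, with the speedup relative to ArcGIS for the upper-middle reaches and the full basin falling by more than 20% compared with the upper reaches. The parallel D8 strategy can provide a theoretical reference for parallelizing hydrological and hydrodynamic models.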

       

      Abstract: Flow direction algorithms are fundamental to hillslope hydrological and hydrodynamic simulations. Implementing a parallel D8 flow accumulation algorithm under the CUDA architecture can effectively accelerate simulation, which makes the parallelization strategy a key research focus for memory access and conflict handling. This paper optimizes the parallel strategy of the D8 algorithm using the atomicAdd function of the CUDA architecture. We selected watersheds at different spatial scales within the Ganjiang River basin (upper reaches, upper-middle reaches, and full basin) as study areas to assess the accuracy of the extracted drainage networks, the parallel acceleration performance, and the scale effects at each spatial scale. The findings indicate: (1) The drainage networks extracted with the parallel D8 strategy closely match those from the classical algorithm, with relative errors in stream length, basin area, and drainage density all below 0.3%. (2) Computation times for the parallel D8 algorithm under the CUDA architecture were shorter than those of the serial ArcGIS implementation, which were in turn shorter than those of the serial Matlab algorithm. The speedup ratio was proportional to the number of thread blocks and grids. (3) When the number of threads was ≤128, the optimal speedup occurred with grid numbers below 1024; when the number of threads exceeded 128, the optimal speedup was achieved with grid numbers above 65,536. (4) The speedup ratio decreased as spatial scale increased. Speedup ratios relative to ArcGIS were more than 20% lower for the upper-middle reaches and the full basin than for the upper reaches. The parallelization strategy for the D8 algorithm may serve as a theoretical reference for parallel computing in hydrological and hydrodynamic models.
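      To illustrate why an atomic add is needed when D8 flow accumulation is parallelized, the following is a minimal serial sketch in Python. It is not the authors' implementation. The function names (`d8_directions`, `flow_accumulation`), the path-tracing scheme, and the 3x3 elevation grid are assumptions made for illustration only. Each cell traces its flow path downstream and adds one to every cell it passes. Many paths reach the same cell, so a GPU version that assigns one thread per cell would have to perform that increment atomically, for example with CUDA's `atomicAdd`.

```python
import math

# D8 neighbour offsets (row, col): E, SE, S, SW, W, NW, N, NE
OFFSETS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def d8_directions(dem):
    """For each cell, return the (row, col) of its steepest downslope
    neighbour, or None for pits and flat cells (hypothetical helper)."""
    rows, cols = len(dem), len(dem[0])
    downstream = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in OFFSETS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dist = math.hypot(dr, dc)  # 1 for cardinal, sqrt(2) for diagonal
                    slope = (dem[r][c] - dem[nr][nc]) / dist
                    if slope > best_slope:  # strictly downhill only, so no cycles
                        best_slope = slope
                        downstream[r][c] = (nr, nc)
    return downstream


def flow_accumulation(downstream):
    """Count, for each cell, how many upstream cells drain through it."""
    rows, cols = len(downstream), len(downstream[0])
    acc = [[0] * cols for _ in range(rows)]
    # Each (r, c) iteration is independent work; on a GPU it would be one thread.
    for r in range(rows):
        for c in range(cols):
            cell = downstream[r][c]
            while cell is not None:
                # Several flow paths can reach the same cell at once, so a
                # parallel version needs an atomic add here (atomicAdd in CUDA).
                acc[cell[0]][cell[1]] += 1
                cell = downstream[cell[0]][cell[1]]
    return acc


if __name__ == "__main__":
    # Illustrative elevations, not data from the study.
    dem = [[9, 8, 7],
           [8, 6, 5],
           [7, 5, 1]]
    print(flow_accumulation(d8_directions(dem)))
    # prints [[0, 0, 0], [0, 1, 2], [0, 2, 8]]
```

      In this sketch the outlet cell receives eight increments from eight different starting cells. That is the write conflict an atomic add resolves when those paths are traced concurrently.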

       
