Parallel strategy and scale effect of D8 algorithm for watershed flow accumulation under CUDA architecture

Bai Hua (白桦); Fu Zhehao (付哲昊); Liu Zhijie (刘址杰)

doi:10.16232/j.cnki.1001-4179.2026.02.008

Abstract

Flow direction and flow accumulation algorithms are the foundation of hillslope hydrological and hydrodynamic simulation. Implementing a parallel D8 flow accumulation algorithm under the CUDA architecture can effectively speed up such simulations, and the algorithm's parallel strategy has become a key research target for handling memory access and write conflicts during computation. We optimized the parallel strategy of the D8 algorithm using the CUDA atomic add function (atomicAdd) and applied it to basins of different spatial scales in the Ganjiang River Basin (the upper reaches, the upper-middle reaches, and the entire basin). For each scale we evaluated the accuracy of river network extraction, the parallel speedup, and its scale effect. The results show that the parallel D8 strategy extracts river networks with an accuracy close to that of the classical algorithm, with relative errors in stream length, basin area, and drainage density all below 0.3%. Computation time ranked as: parallel D8 under CUDA < ArcGIS hydrological analysis tools < Matlab serial algorithm. The speedup ratio was proportional to the numbers of thread blocks and grids. In particular, when the number of threads was ≤128 and >128, the optimal speedup occurred at grid counts below 1 024 and above 65 536, respectively. The speedup ratio decreased as the spatial scale increased: relative to the upper reaches, the speedup over ArcGIS for the upper-middle reaches and for the entire Ganjiang River Basin declined by more than 20%. The parallel strategy of the D8 algorithm can provide a theoretical reference for parallelizing hydrological and hydrodynamic models.
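The abstract does not give the implementation, so the following is only a minimal sketch of the general idea, not the authors' code. It assumes a common in-degree (frontier) formulation of D8 flow accumulation: every cell whose upstream neighbors are finished adds its count to its downstream cell. In a CUDA kernel many threads may write to the same downstream cell at once, which is where `atomicAdd` would be needed; here NumPy's unbuffered `np.add.at` stands in for that atomic scatter-add. The function names `d8_receiver` and `flow_accumulation`, the sample DEM, and the choice to treat flat cells and pits as outlets are all assumptions made for illustration.

```python
import numpy as np

# Offsets (row, col) of the 8 neighbors considered by D8
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_receiver(dem):
    """For each cell, return the flat index of its steepest-descent
    neighbor, or -1 if no neighbor is strictly lower (outlet or pit)."""
    rows, cols = dem.shape
    receiver = np.full(rows * cols, -1, dtype=np.int64)
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in OFFSETS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbors are sqrt(2) cell widths away
                    slope = (dem[r, c] - dem[nr, nc]) / np.hypot(dr, dc)
                    if slope > best_slope:
                        best_slope = slope
                        receiver[r * cols + c] = nr * cols + nc
    return receiver


def flow_accumulation(receiver):
    """Count, for each cell, the cells draining through it (itself included)."""
    n = receiver.size
    acc = np.ones(n, dtype=np.int64)           # each cell contributes itself
    indeg = np.zeros(n, dtype=np.int64)        # number of direct upstream cells
    np.add.at(indeg, receiver[receiver >= 0], 1)

    frontier = np.flatnonzero(indeg == 0)      # cells with no pending upstream input
    while frontier.size:
        src = frontier[receiver[frontier] >= 0]
        dst = receiver[src]
        # Several sources may share one destination; np.add.at accumulates
        # every contribution, playing the role of atomicAdd(&acc[dst], acc[src]).
        np.add.at(acc, dst, acc[src])
        np.subtract.at(indeg, dst, 1)
        cand = np.unique(dst)
        frontier = cand[indeg[cand] == 0]      # downstream cells that are now complete
    return acc


# Hypothetical 3x3 DEM draining toward the bottom-right corner
dem = np.array([[9.0, 8.0, 7.0],
                [8.0, 6.0, 5.0],
                [7.0, 5.0, 1.0]])
acc = flow_accumulation(d8_receiver(dem)).reshape(dem.shape)
print(acc)
```

Each pass of the `while` loop corresponds roughly to one kernel launch in a GPU version, with one thread per frontier cell. A plain `acc[dst] += acc[src]` would silently drop contributions when `dst` contains repeated indices, which is the same write conflict the atomic add is meant to resolve.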
