Description: We develop a communication-reduced multi-time-step (CRMT) algorithm for the lattice Boltzmann method (LBM) on block-structured adaptive mesh refinement (AMR) grids. The algorithm is based on temporal blocking and improves computational efficiency by replacing a communication bottleneck with additional computation. The proposed method is implemented in CityLBM, an extreme-scale airflow simulation code, and its impact on scalability is tested on the GPU-based supercomputers TSUBAME and Reedbush. With the CRMT algorithm, the communication cost is reduced by ∼64%, and both weak and strong scaling are improved up to ∼200 GPUs. The obtained performance indicates that real-time airflow simulation of an approximately 2 km square area with a wind speed of ∼5 m/s is feasible at 1 m resolution.
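The trade-off behind temporal blocking can be illustrated with a minimal sketch. This is not the CityLBM implementation: it assumes a 1D periodic three-point averaging stencil instead of an LBM kernel, a uniform grid instead of AMR, and plain Python lists instead of GPU and MPI buffers. The function names `stencil`, `reference`, and `blocked` are hypothetical. Each subdomain receives a halo of `k` cells once, then advances `k` time steps locally while the valid region shrinks by one cell per side per step.

```python
def stencil(left, center, right):
    # Three-point averaging stencil (a stand-in for an LBM update; assumption).
    return 0.25 * left + 0.5 * center + 0.25 * right


def reference(u, steps):
    # Single-domain reference: periodic boundaries, no decomposition.
    n = len(u)
    for _ in range(steps):
        u = [stencil(u[(i - 1) % n], u[i], u[(i + 1) % n]) for i in range(n)]
    return u


def blocked(u, steps, parts, k):
    """Advance `steps` time steps with one halo exchange per k steps.

    Returns (result, number of exchange phases, number of cell updates).
    """
    n = len(u)
    size = n // parts
    assert n % parts == 0 and steps % k == 0 and 1 <= k <= size
    owned = [u[p * size:(p + 1) * size] for p in range(parts)]
    exchanges = 0
    updates = 0
    for _block in range(steps // k):
        # Communication phase: every subdomain gets k halo cells per side.
        padded = []
        for p in range(parts):
            left = owned[(p - 1) % parts][-k:]
            right = owned[(p + 1) % parts][:k]
            padded.append(left + owned[p] + right)
        exchanges += 1
        # Computation phase: k local steps without further communication.
        # Each step drops one cell per side, so exactly `size` cells remain.
        for p in range(parts):
            a = padded[p]
            for _step in range(k):
                a = [stencil(a[i - 1], a[i], a[i + 1])
                     for i in range(1, len(a) - 1)]
                updates += len(a)
            owned[p] = a
    return [x for part in owned for x in part], exchanges, updates
```

In this sketch, running 8 time steps with `k = 4` needs 2 exchange phases instead of the 8 needed with `k = 1`, at the price of redundant updates in the halo region; the results match the single-domain reference in both cases.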