SCIENTIA SINICA Informationis, Volume 51 , Issue 9 : 1490(2021) https://doi.org/10.1360/SSI-2020-0303

## A novel method to identify influential stations based on dynamic passenger flows

• Accepted Jan 14, 2021
• Published Sep 17, 2021

### References

[1] Phang S Y. Urban rail transit PPPs: Survey and risk assessment of recent strategies. Transp Policy, 2007, 14: 214-231

[2] Smyth R, Mishra V, Qian X. The Environment and Well-Being in Urban China. Ecol Economics, 2008, 68: 547-555

[3] de Jong M, Mu R, Stead D. Introducing public-private partnerships for metropolitan subways in China: what is the evidence?. J Transp Geography, 2010, 18: 301-313

[4] Du Z, Tang J, Qi Y. Identifying critical nodes in metro network considering topological potential: A case study in Shenzhen city-China. Physica A-Statistical Mech its Appl, 2020, 539: 122926

[5] Sun L, Huang Y, Chen Y. Vulnerability assessment of urban rail transit based on multi-static weighted method in Beijing, China. Transportation Res Part A-Policy Practice, 2018, 108: 12-24

[6] Wang J, Kong X, Rahim A. IS2Fun: Identification of Subway Station Functions Using Massive Urban Data. IEEE Access, 2017, 5: 27103-27113

[7] Yang Y, Liu Y, Zhou M. Robustness assessment of urban rail transit based on complex network theory: A case study of the Beijing Subway. Saf Sci, 2015, 79: 149-162

[8] Liu Y, Tan Y. Complexity Modeling and Stability Analysis of Urban Subway Network: Wuhan City Case Study. Procedia - Social Behavioral Sci, 2013, 96: 1611-1621

[9] Wu X, Dong H, Tse C K. Analysis of metro network performance from a complex network perspective. Physica A-Statistical Mech its Appl, 2018, 492: 553-563

[10] Zhao L, Li H, Li M. Location selection of intra-city distribution hubs in the metro-integrated logistics system. Tunnelling Underground Space Tech, 2018, 80: 246-256

[11] Xia F, Wang J, Kong X. Ranking Station Importance With Human Mobility Patterns Using Subway Network Datasets. IEEE Trans Intell Transp Syst, 2020, 21: 2840-2852

[12] Li X, Guo J, Gao C. Network-based transportation system analysis: A case study in a mountain city. Chaos Solitons Fractals, 2018, 107: 256-265

[13] Zhong C, Manley E, Müller Arisona S. Measuring variability of mobility patterns from multiday smart-card data. J Comput Sci, 2015, 9: 125-130

[14] Sienkiewicz J, Hołyst J A. Statistical analysis of 22 public transport networks in Poland. Phys Rev E, 2005, 72: 046127

[15] Xu M, Wu J, Liu M. Discovery of Critical Nodes in Road Networks Through Mining From Vehicle Trajectories. IEEE Trans Intell Transp Syst, 2019, 20: 583-593

[16] Li X, Zhou M, Wu X. A novel method to identify multiple influential nodes in complex networks. Sci Sin-Inf, 2019, 49: 1333-1342

[17] Ren X L, Lü L Y. Review of ranking nodes in complex networks (in Chinese). Chin Sci Bull, 2014, 59: 1175-1197

[18] Gao C, Su Z, Liu J. Even central users do not always drive information diffusion. Commun ACM, 2019, 62: 61-67

[19] Newman M E J. A measure of betweenness centrality based on random walks. Social Networks, 2005, 27: 39-54

[20] Freeman L C. Centrality in social networks conceptual clarification. Social Networks, 1978, 1: 215-239

[21] Gao C, Fan Y, Jiang S H. Dynamic robustness analysis of a two-layer rail transit network model. IEEE Trans Intell Transp Syst, 2021

[22] Du Y, Gao C, Hu Y. A new method of identifying influential nodes in complex networks based on TOPSIS. Physica A-Statistical Mech its Appl, 2014, 399: 57-69

[23] Guiaşu S. Weighted entropy. Rep Math Phys, 1971, 2: 165-179

[24] Kullback S, Leibler R A. On Information and Sufficiency. Ann Math Statist, 1951, 22: 79-86

[25] Fei L, Deng Y. A new method to identify influential nodes based on relative entropy. Chaos Solitons Fractals, 2017, 104: 257-267

[26] Zhao X, Liu F, Wang J. Evaluating Influential Nodes in Social Networks by Local Centrality with a Coefficient. IJGI, 2017, 6: 35

[27] Identifying Influential Nodes in Large-Scale Directed Networks: The Role of Clustering. PLoS ONE, 2013, 8: e77455

[28] Wang X F, Xu J. Cascading failures in coupled map lattices. Phys Rev E, 2004, 70: 056113

[29] Zhang Z H, Song Y, Xia L. A Novel Load Capacity Model with a Tunable Proportion of Load Redistribution against Cascading Failures. Security Communication Networks, 2018, 2018(6): 1-7

[30] Fan Y, Zhang F, Jiang S. Dynamic Robustness Analysis for Subway Network With Spatiotemporal Characteristic of Passenger Flow. IEEE Access, 2020, 8: 45544-45555

[31] Crucitti P, Latora V, Marchiori M. Model for cascading failures in complex networks. Phys Rev E, 2004, 69: 045104

[32] Wang X F, Li X, Chen G R. Network Science: An Introduction. Beijing: Tsinghua University Press, 2006. 270-275

[33] Ruan Y R, Lao S Y, Wang J D, et al. Node importance measurement based on neighborhood similarity in complex network. Acta Phys Sin, 2017, 66: 038902

[34] Bonett D G, Wright T A. Sample size requirements for estimating Pearson, Kendall and Spearman correlations. Psychometrika, 2000, 65: 23-28

[35] Chen D, Lü L, Shang M S. Identifying influential nodes in complex networks. Physica A-Statistical Mech its Appl, 2012, 391: 1777-1787

[36] Kendall M G. A New Measure of Rank Correlation. Biometrika, 1938, 30: 81-93

[37] Kumar S, Panda B S. Identifying influential nodes in Social Networks: Neighborhood Coreness based voting approach. Physica A-Statistical Mech its Appl, 2020, 553: 124215

[38] Uncovering Spatiotemporal Characteristics of Human Online Behaviors during Extreme Events. PLoS ONE, 2015, 10: e0138673

[39] Lv Z, Zhao N, Xiong F. A novel measure of identifying influential nodes in complex networks. Physica A-Statistical Mech its Appl, 2019, 523: 488-497

• Figure 1

(Color online) (a) The evolution process of NLN. NLN is an abstraction of a subway network, whose structure depends on the topology of the subway network. The load of nodes comes from the statistical analysis of passenger travel data. (b) The schematic diagram of changes in the load of nodes, representing the time-varying changes of passenger flows in stations

• Figure 2

(Color online) Evaluation of the critical node under the TFC criterion. First, from the network topology in (a), we obtain the values of three centrality criteria for each node in (b). The entropy weighting method then determines the weight of each centrality criterion in (c). The effect of the network topology on each node is calculated by (2) and displayed in (d). In parallel, the load of nodes in (f) is initialized from the passenger flows of stations in the empirical data shown in (e), and its influence on each node is calculated by (3) in (g). Finally, the critical node in (h) is determined by (1), which combines the network topology and the load of nodes. This example shows that the TFC criterion can capture dynamic changes of the influential station
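
The entropy weighting step mentioned in this caption can be illustrated with a minimal sketch. It assumes the textbook entropy weight method (column-normalize each criterion, compute its Shannon entropy, weight by one minus entropy); the exact formulation behind (2) is not reproduced in this section, so the function name `entropy_weights` and the normalization choices are assumptions rather than the authors' implementation.

```python
import math

def entropy_weights(matrix):
    """Textbook entropy weight method (assumed variant).

    matrix[i][j] is the non-negative value of criterion j for node i.
    Returns one weight per criterion; the weights sum to 1.
    """
    n = len(matrix)        # number of nodes, must be >= 2
    m = len(matrix[0])     # number of criteria
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total == 0:
            entropy = 1.0  # an all-zero criterion carries no information
        else:
            shares = [x / total for x in column]
            entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    if norm == 0:
        return [1.0 / m] * m   # every criterion is uniform: fall back to equal weights
    return [d / norm for d in divergences]

# DC, BC and CC values of the five stations listed in Table 3.
centralities = [
    [0.021, 0.157, 0.105],
    [0.013, 0.239, 0.102],
    [0.208, 0.186, 0.098],
    [0.028, 0.232, 0.093],
    [0.139, 0.211, 0.106],
]
weights = entropy_weights(centralities)
```

A criterion whose values are nearly identical across nodes has entropy close to 1 and therefore receives a weight close to 0, which is the usual motivation for this weighting scheme.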

• Figure 3

(Color online) (a) Pearson correlation coefficients between any two days of a week. The horizontal and vertical axes indicate the day (i.e., 1 stands for Monday). The color depicts the value of the correlation coefficient: the darker the color, the greater the coefficient. Any two workdays, or the two weekend days, are very similar, whereas a workday and a weekend day differ dramatically. (b) The passenger flows of stations at different times. Most stations reach their maximum passenger flow at $t_8$ or $t_{18}$. Moreover, passenger flows at the same time differ across stations. The passenger flows therefore have spatiotemporal characteristics
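
The day-by-day comparison in panel (a) can be sketched as follows. The snippet assumes each day is represented by a vector of passenger-flow counts (for example, one value per station and time period); that representation and the toy numbers are illustrative assumptions, not the paper's data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def day_correlation_matrix(daily_flows):
    """Pairwise Pearson coefficients between the flow vectors of all days."""
    return [[pearson(a, b) for b in daily_flows] for a in daily_flows]

# Hypothetical flow vectors: two workdays with commuter peaks, one weekend day.
toy_days = [
    [120, 900, 300, 850, 200],
    [110, 950, 280, 870, 190],
    [150, 300, 600, 350, 400],
]
corr = day_correlation_matrix(toy_days)
```

With data shaped like this, `corr[i][j]` corresponds to one cell of the heat map in panel (a).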

• Figure 4

(Color online) The network average efficiency ($E$) and the relative size of the giant component ($S$) corresponding to various importance criteria during different time periods (i.e., $t_8$, $t_{12}$ and $t_{18}$). $p$ represents the proportion of initial failure nodes. Under the TFC criterion, both measures show a significant downward trend, indicating that the criterion effectively identifies the influential station. The reason is that the TFC criterion examines a station's importance from the perspective of passenger flows, which helps find important stations that are difficult to identify from the network topology alone. (a) $E$ during $t_8$; (b) $E$ during $t_{12}$; (c) $E$ during $t_{18}$; (d) $S$ during $t_8$; (e) $S$ during $t_{12}$; (f) $S$ during $t_{18}$
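
The two robustness measures in this figure can be computed for an unweighted network as in the sketch below. It assumes the standard definitions: $E$ is the mean of inverse shortest-path lengths over all ordered node pairs, and $S$ is the size of the largest connected component divided by the original number of nodes. Normalizing by the original network size after failures is an assumption, and `efficiency_and_giant` is a hypothetical helper name.

```python
from collections import deque

def bfs_distances(adj, source, removed):
    """Hop distances from source to every reachable surviving node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in removed and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def efficiency_and_giant(adj, removed=frozenset()):
    """Return (E, S) after deleting the nodes in `removed`.

    adj maps each node to an iterable of its neighbours (undirected graph).
    Both measures are normalized by the ORIGINAL node count (assumption).
    """
    n = len(adj)
    inverse_sum = 0.0
    largest = 0
    for source in adj:
        if source in removed:
            continue
        dist = bfs_distances(adj, source, removed)
        inverse_sum += sum(1.0 / d for d in dist.values() if d > 0)
        largest = max(largest, len(dist))  # component containing `source`
    efficiency = inverse_sum / (n * (n - 1)) if n > 1 else 0.0
    giant = largest / n if n else 0.0
    return efficiency, giant

# Toy three-station line a - b - c; removing the middle station disconnects it.
line = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
before = efficiency_and_giant(line)
after = efficiency_and_giant(line, removed={"b"})
```

Ranking nodes by an importance criterion, removing the top fraction $p$, and calling this function for increasing $p$ yields curves of the kind plotted in the panels.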

• Figure 5

(Color online) Comparison of various importance criteria regarding the LPF during (a) $t_{8}$, (b) $t_{12}$ and (c) $t_{18}$. $p$ represents the proportion of initial failure nodes. The figure indicates that failures are more serious during $t_8$ or $t_{18}$ than during $t_{12}$, revealing the dynamic change characteristic of the influential station. The TFC criterion also performs better than the compared criteria, because it combines the network topology and the load of nodes

• Figure 6

(Color online) Comparison of various importance criteria in terms of (a) the average network efficiency $(E)$, (b) the relative size of the giant component $(S)$, and (c) the LPF from the perspective of global time. The importance of a station under the TFC criterion is the average of its importance values over all time periods. The results show that the TFC criterion still performs better because it combines the network topology and the passenger flows

• Figure 7

(Color online) Kendall's correlation coefficient ($\tau$) between the TFC criterion and $E$ during (a) $t_8$, (b) $t_{12}$ and (c) $t_{18}$, and $\tau$ between $E$ and (d) SLC, (e) WMI and (f) TRE. Each point in the figure represents a station, and the $x$-axes and $y$-axes stand for the importance criterion and the network average efficiency, respectively. The results illustrate that the TFC criterion can effectively identify the influential station owing to its consideration of passenger flows in stations. Its correlation with the network average efficiency changes across time periods, which indicates that the TFC criterion can reveal the dynamic change characteristic of the influential station
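
For readers who want to reproduce a rank correlation of this kind, a minimal pure-Python sketch of Kendall's $\tau$ follows. It implements the tie-corrected $\tau_b$ variant; which variant the paper uses is not stated in this section, so that choice is an assumption, and the sample scores are hypothetical.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length score lists (O(n^2) pair scan)."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue                # tied in both lists: ignored by tau-b
            elif dx == 0:
                ties_x += 1             # tied only in x
            elif dy == 0:
                ties_y += 1             # tied only in y
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x) *
                      (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")

# Hypothetical importance scores and the efficiency left after each
# station's removal; a strongly negative tau means that removing
# higher-ranked stations leaves a less efficient network.
importance = [1.78, 1.68, 1.55, 1.29, 1.28]
efficiency_after_removal = [0.21, 0.23, 0.22, 0.26, 0.27]
tau = kendall_tau_b(importance, efficiency_after_removal)
```

The quadratic pair scan is adequate for a few hundred stations; for much larger inputs a library routine with an $O(n \log n)$ algorithm would be preferable.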

• Figure 8

(Color online) Kendall's correlation coefficient ($\tau$) between the LPF and four importance criteria during time periods (a)$\sim$(d) $t_8$, (e)$\sim$(h) $t_{12}$, and (i)$\sim$(l) $t_{18}$. Each point in the figure represents a station, and the $x$-axes and $y$-axes refer to the importance criterion and the LPF, respectively. The figure shows that the TFC criterion achieves the best results during all three time periods, because it takes into account the dual effects of the network topology and the load of nodes

• Figure 9

(Color online) The evolution of station ranking (rank) with time period (time), revealed by the intersection of the top 20 stations in each period for the subway network of Shanghai. The results show that a station's importance changes significantly during periods of sharp passenger flow fluctuations (e.g., $t_8$ and $t_{18}$ on workdays). (a) Workdays; (b) weekend

• Figure 10

(Color online) (a) The absolute value of the increment of the correlation $|\Delta~{E}|$ between the TFC criterion based on different parameters $(\theta)$ and the average network efficiency $(E)$. (b) The correlation between the LPF based on different parameters $(\beta)$ and various importance criteria

• Table 1   Data samples stored in the intelligent transportation card of Shanghai subway

| Record | User ID | Date | In_Time | In_Station ID | Out_Time | Out_Station ID |
|---|---|---|---|---|---|---|
| 1 | 100018830 | 2015-4-13 | 8:19 | $v_{112}$ | 8:52 | $v_{123}$ |
| 2 | 100021809 | 2015-4-14 | 17:24 | $v_{45}$ | 17:57 | $v_{36}$ |
| 3 | 100026571 | 2015-4-15 | 13:02 | $v_{252}$ | 13:56 | $v_{240}$ |

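Records in the format of Table 1 can be aggregated into per-station, per-hour passenger flows as sketched below. The sketch assumes that a time period $t_h$ covers the clock hour $h$ and that both entries and exits count toward a station's flow; neither assumption is confirmed by this section, and the dictionary keys are hypothetical field names.

```python
from collections import Counter

def hourly_station_flows(records):
    """Count entries plus exits for every (station, hour) pair."""
    flows = Counter()
    for rec in records:
        in_hour = int(rec["in_time"].split(":")[0])
        out_hour = int(rec["out_time"].split(":")[0])
        flows[(rec["in_station"], in_hour)] += 1    # passenger enters
        flows[(rec["out_station"], out_hour)] += 1  # passenger exits
    return flows

# The three sample records shown in Table 1.
SAMPLE_RECORDS = [
    {"user": "100018830", "date": "2015-4-13", "in_time": "8:19",
     "in_station": "v112", "out_time": "8:52", "out_station": "v123"},
    {"user": "100021809", "date": "2015-4-14", "in_time": "17:24",
     "in_station": "v45", "out_time": "17:57", "out_station": "v36"},
    {"user": "100026571", "date": "2015-4-15", "in_time": "13:02",
     "in_station": "v252", "out_time": "13:56", "out_station": "v240"},
]
flows = hourly_station_flows(SAMPLE_RECORDS)
```

Because `flows` is a `Counter`, looking up a station and hour with no recorded trips returns 0 rather than raising an error.
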
• Table 2   Kendall's correlation coefficient ($\tau$) between various importance criteria and $S$

| TFC ($t_8$) | TFC ($t_{12}$) | TFC ($t_{18}$) | SLC [35] | WMI [7] | TRE [25] |
|---|---|---|---|---|---|
| $-0.29$ | $-0.30$ | $-0.29$ | 0.18 | $-0.28$ | $-0.24$ |

• Table 3   The top 5 stations obtained by the TFC criterion based on average passenger flows

| Station | Rank | TFC | DC | BC | CC |
|---|---|---|---|---|---|
| $v_{13}$ (People's Square) | 1 | 1.780 | 0.021 | 0.157 | 0.105 |
| $v_{16}$ (Shanghai Railway Station) | 2 | 1.682 | 0.013 | 0.239 | 0.102 |
| $v_{8}$ (Xujiahui) | 3 | 1.555 | 0.208 | 0.186 | 0.098 |
| $v_{88}$ (Century Avenue) | 4 | 1.291 | 0.028 | 0.232 | 0.093 |
| $v_{39}$ (West Nanjing Road) | 5 | 1.287 | 0.139 | 0.211 | 0.106 |

• Table 4   The ranking of the top 13 stations obtained by the TFC criterion when using other importance indicators

| Station | TFC ($t_8$) | SLC [35] | WMI [7] | TRE [25] |
|---|---|---|---|---|
| $v_{13}$ (People's Square) | 1 | 7 | 5 | 3 |
| $v_{16}$ (Shanghai Railway Station) | 2 | 37 | 2 | 4 |
| $v_{8}$ (Xujiahui) | 3 | 1 | 3 | 2 |
| $v_{38}$ (Jing'an Temple) | 4 | 13 | 17 | 13 |
| $v_{88}$ (Century Avenue) | 5 | 17 | 1 | 1 |
| $v_{11}$ (South Shaanxi Road) | 6 | 3 | 8 | 6 |
| $v_{39}$ (West Nanjing Road) | 7 | 2 | 4 | 5 |
| $v_{37}$ (Jiangsu Road) | 8 | 11 | 7 | 7 |
| $v_{40}$ (East Nanjing Road) | 9 | 16 | 18 | 14 |
| $v_{64}$ (Caoyang Road) | 10 | 24 | 6 | 8 |
| $v_{60}$ (Yishan Road) | 11 | 5 | 13 | 15 |
| $v_{65}$ (Zhenping Road) | 12 | 43 | 11 | 10 |
| $v_{41}$ (Lujiazui) | 13 | 72 | 49 | 40 |
