Binary Logistic Regression

The first step in our investigation of the internal data structure was to evaluate how the correlation between interest similarity and retweeting behavior differs across propagation structures.

Figure 1. Conditional effect plots comparing the trend of interest similarity under two diffusion patterns. The viral pattern (left) emphasizes the depth of diffusion, while the broadcast pattern (right) emphasizes its width.

Comparing the plots, we found that the effect of a unit increase in interest similarity was reduced to a greater extent in broadcast diffusion than in viral diffusion. This result prompted us to explore further the dynamic relationship between interest similarity and retweeting probability at different stages of diffusion.

A decay analysis was conducted to verify the trend of decreasing interest similarity detected in the macro-level analysis. The size of the blue area represents the scale of users (or nodes) involved in retweeting.

Figure 2. Decay analysis of the attenuating strength of interest similarity as information spreads in depth
Table 1. Contrasting the main drivers of retweeting behavior at different stages of diffusion

From the decreasing number of nodes and the downward-sloping curve, we inferred that users with highly matched interests tended to cluster at the initial stage of diffusion, while users who did not share that interest homogeneity were concentrated where depth was above 0.5.

We applied a sliding window analysis to partition the change in interest similarity into the depth windows 1-2, 2-3, and 3-4.
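As a rough illustration of the sliding window idea, the sketch below fits a separate one-predictor logistic regression in each overlapping depth window and collects the slope on interest similarity. It is a minimal sketch under stated assumptions, not the procedure used in this study: the record layout `(depth, similarity, retweeted)`, the helper names `fit_logistic` and `sliding_window_coefficients`, and the synthetic data with made-up slopes are all hypothetical, and any additional covariates the original model may include are omitted.

```python
import math
import random


def fit_logistic(x, y, iters=25):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = max(-30.0, min(30.0, b0 + b1 * xi))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                  # entries of the (negated) Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        # Newton step: beta += H^{-1} g for the 2x2 system
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def sliding_window_coefficients(records, windows=((1, 2), (2, 3), (3, 4))):
    """Return {window: slope on similarity} for each inclusive depth window."""
    coefs = {}
    for lo, hi in windows:
        subset = [(s, r) for d, s, r in records if lo <= d <= hi]
        xs = [s for s, _ in subset]
        ys = [r for _, r in subset]
        coefs[(lo, hi)] = fit_logistic(xs, ys)[1]
    return coefs


# Synthetic records (fabricated for illustration, not the study's data).
# The slopes below are made-up values chosen to decay with depth.
random.seed(0)
TRUE_SLOPE = {1: 3.0, 2: 2.0, 3: 1.0, 4: 0.5}
records = []
for depth, slope in TRUE_SLOPE.items():
    for _ in range(3000):
        s = random.random()  # similarity drawn uniformly from [0, 1)
        p = 1.0 / (1.0 + math.exp(-(-1.0 + slope * s)))
        records.append((depth, s, 1 if random.random() < p else 0))

coefs = sliding_window_coefficients(records)
for window, b1 in coefs.items():
    print(window, round(b1, 2))
```

Because adjacent windows share a depth level, each estimate blends two neighboring depths, which smooths the coefficient curve at the cost of making consecutive estimates correlated.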

Figure 3. A sliding window analysis of the varying effect of interest similarity at different depths

The logistic regression coefficient of interest similarity decreased monotonically as the information spread in either depth or width. This indicated that the marginal impact of interest similarity on retweeting behavior weakened as depth increased.
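To make the link between the coefficient and the marginal impact explicit, the standard binary logistic model can be written for each depth window. The notation here is an assumption introduced for illustration: $y$ is the retweet indicator, $s$ is interest similarity, $w$ indexes the depth window, and any other covariates in the fitted model are omitted.

```latex
\[
  p = \Pr(y = 1 \mid s) = \sigma\!\left(\beta_0^{(w)} + \beta_1^{(w)} s\right),
  \qquad
  \sigma(z) = \frac{1}{1 + e^{-z}}
\]
\[
  \frac{\partial p}{\partial s} = \beta_1^{(w)}\, p\,(1 - p),
  \qquad
  \frac{\mathrm{odds}(s + 1)}{\mathrm{odds}(s)} = e^{\beta_1^{(w)}}
\]
```

Under this form, a smaller $\beta_1^{(w)}$ means that a unit increase in similarity multiplies the odds of retweeting by a smaller factor, which is the sense in which the marginal impact weakens on the log-odds scale.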

Table 2. Summary of the change in the influence of interest similarity based on quantified measures

To confirm the boundary between the growth stage and the plateau stage of interest similarity, we visualized the conversion effects of interest similarity in the viral and broadcast structures separately in the following section.

The data on conversion lift by interest similarity threshold showed that the two diffusion structures shared the same optimal threshold. However, the increase in conversion lift above the threshold was much higher at depth 3-4 (secondary stage) than at depth 1-2 (primary stage).
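One common way to compute a conversion lift at a similarity threshold is sketched below. The text does not state the exact definition used here, so this is an assumption: lift is taken as the retweet rate among user pairs at or above the threshold divided by the overall retweet rate. The function name `conversion_lift` and the toy data are hypothetical.

```python
def conversion_lift(similarities, retweeted, threshold):
    """Retweet rate at or above the threshold divided by the overall rate.

    Assumed definition for illustration; returns NaN when it is undefined.
    """
    overall = sum(retweeted) / len(retweeted)
    above = [r for s, r in zip(similarities, retweeted) if s >= threshold]
    if not above or overall == 0:
        return float("nan")
    return (sum(above) / len(above)) / overall


# Toy data (fabricated for illustration only, not the study's data).
sims = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
retweeted = [0, 0, 0, 1, 0, 1, 1, 1]

# Scan a few candidate thresholds; a lift above 1 means that pairs above
# the threshold retweet more often than the population as a whole.
lifts = {t: conversion_lift(sims, retweeted, t) for t in (0.0, 0.5, 0.7)}
for t, lift in lifts.items():
    print(f"threshold={t:.1f} lift={lift:.2f}")
```

To compare stages, the same function would be applied separately to the records from depth 1-2 and from depth 3-4, and the lifts above the shared threshold contrasted.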

Figure 4. Two sets of bar charts contrasting the conversion effect of interest similarity at different stages. Blue bars denote primary diffusion, where social relationships matter more for retweeting behavior, while red bars denote secondary diffusion, where individual interest in the content matters more.
Table 3. Comparing and contrasting the effectiveness of interest similarity in driving the retweeting behavior at different depths.

In other words, the effect of highly matched user interests on retweeting behavior was magnified at greater depth. A societal interpretation of this phenomenon is that interest-driven diffusion of information becomes scarcer during the secondary stage of propagation.