Visualization Demonstration
The first graph, “Tenancy Condition Regarding Tenant Status”, breaks tenants down by status. The text database records three broad categories of status:
- First, civilians, both male and female, including privileged civilians. “Privileged” means that their rent is lower than that of ordinary tenants; they make up only a small proportion of the civilian category.
- Second, local officials, both in local government and in the military. Zhou (Region), Jun (Commandery), and Xian (District) are the three levels of the local hierarchy in ancient China, in descending order.
- Third, soldiers.
In terms of rented land area, civilians, especially male civilians, account for the vast majority, while soldiers account for the smallest proportion.
The next two graphs continue this topic.
Figure #2 shows the per capita rented land area. Although the total area rented by soldiers is small, their average rented area is the largest, much larger than that of any other category of tenants.
Figure #3 shows the total rented land area by status; darker colors represent larger areas, with the darkest marking the area rented by male civilians.
In addition, privileged civilians and senior officials at all levels rented a larger average land area than ordinary civilians and junior officials.
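The difference between total and per capita area in Figures #2 and #3 can be illustrated with a minimal sketch. It assumes one row per tenant with a status and a rented area; the column names `status` and `area_mu` and all values are hypothetical and are not taken from the database.

```python
import pandas as pd

# Hypothetical records: one row per tenant, with status and rented area in mu.
# The values are illustrative only and do not come from the database.
records = pd.DataFrame({
    "status": ["male civilian", "male civilian", "male civilian",
               "female civilian", "soldier"],
    "area_mu": [30.0, 50.0, 40.0, 20.0, 90.0],
})

# Total area and number of tenants for each status
summary = records.groupby("status")["area_mu"].agg(total="sum", tenants="count")
# Per capita area = total area / number of tenants
summary["per_capita"] = summary["total"] / summary["tenants"]
print(summary.sort_values("total", ascending=False))
```

In this toy data, male civilians hold the largest total area while the single soldier has the largest per capita area, which is the same kind of pattern the two figures describe.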
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.font_manager as fm
# Read the Excel sheet without treating any row as a header
df = pd.read_excel('test111.xlsx', sheet_name='Sheet1', header=None)
# Place names are in cells A2:A73; tenant statuses are in cells B1:K1
locations = df[0].iloc[1:73]
identities = df.iloc[0, 1:11]
# Select the data block B2:K73
data = df.iloc[1:73, 1:11]
# Use the tenant statuses as column names
data.columns = identities
# Use the place names as the row index
data.set_index(locations, inplace=True)
# Convert to numbers; non-numeric cells become NaN
data = data.apply(pd.to_numeric, errors='coerce')
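# Register a font that supports Chinese so that place names render correctly
font_path = 'Songti.ttc'  # Replace with the path to a font that supports Chinese
fm.fontManager.addfont(font_path)  # Make the font file known to matplotlib
prop = fm.FontProperties(fname=font_path)
sns.set(font=prop.get_name())
# Create the heatmap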
plt.figure(figsize=(20, 16))
sns.heatmap(data, annot=True, cmap='YlGnBu', fmt='.2f')
plt.title('Land Area(mu)')
plt.xlabel('Tenant Status')
plt.ylabel('Place Name')
plt.savefig('heatmap2.png')
plt.show()
Figures #4 and #5 take regional information into account. The x-axis represents tenant status and the y-axis represents place names. Each cell shows the land area rented by tenants of a given status in a given region, with darker colors representing larger areas. The small graph on the right is a zoomed-in view of the upper-left corner of the data, so that the regions with the largest land area can be seen clearly.
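One way to obtain the data for such a zoomed-in view is to slice the first few rows and columns of the table before plotting. The sketch below assumes a small place-by-status table; the place names, the values, and the choice of a 2 by 2 corner are all hypothetical.

```python
import pandas as pd

# Hypothetical place-by-status table (areas in mu); names and values are made up.
table = pd.DataFrame(
    [[1200.0, 300.0, 40.0],
     [800.0, 150.0, 10.0],
     [60.0, 20.0, 5.0],
     [30.0, 0.0, 0.0]],
    index=["Place A", "Place B", "Place C", "Place D"],
    columns=["male civilian", "female civilian", "soldier"],
)

# Keep the first 2 rows and the first 2 columns, i.e. the upper-left corner.
zoom = table.iloc[:2, :2]
print(zoom)
```

The sliced table could then be passed to `sns.heatmap` in the same way as the full table.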
Then we analyzed the rent payment dates recorded in the data.
Figure #6 shows that the payment period starts in September and continues until March of the following year, with most payments concentrated in November and December. This is because most farmers finish harvesting and count their harvest in late October.
According to the records, if tenants encountered drought and had a poor harvest, the rent was reduced accordingly, so officials needed some time to decide on the reduced rent.
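A monthly distribution of this kind can be obtained by counting records per month. The following is a minimal sketch under stated assumptions: the dates are hypothetical placeholders with arbitrary Gregorian years, not real records, and the column name `date` is chosen for illustration.

```python
import pandas as pd

# Hypothetical payment dates; the years are arbitrary placeholders, not real records.
dates = pd.to_datetime([
    "2000-09-15", "2000-11-03", "2000-11-20",
    "2000-12-01", "2000-12-18", "2000-12-25", "2001-03-02",
])
payments = pd.DataFrame({"date": dates})

# Count the records in each calendar month, in chronological order.
monthly = payments.groupby(payments["date"].dt.to_period("M")).size()
print(monthly)
```

Grouping by a monthly period rather than by month number keeps months from consecutive years apart, which matters when the payment period runs from September into March of the following year.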
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# Read the date, 4th-year, and 5th-year columns (A:C) from the Excel file
df = pd.read_excel('日期统计.xlsx', usecols="A:C", skiprows=1, nrows=102, names=["Date", "4th", "5th"])
# Build a new DataFrame from the rows to plot
data = pd.DataFrame()
data["Date"] = df["Date"][1:102] # Dates (column A)
data["4th"] = df["4th"][1:102] # Record counts for the fourth year (column B)
data["5th"] = df["5th"][1:102] # Record counts for the fifth year (column C)
# Melt the "4th" and "5th" columns into long format, one row per date and year
data = pd.melt(data, id_vars=["Date"], value_vars=["4th", "5th"], var_name="Year", value_name="Data")
# Plot a scatterplot; point color encodes the year and point size the record count
sns.scatterplot(x="Date", y="Data", hue="Year", size="Data", data=data)
# Remove the legend title
plt.legend(title="")
plt.title("Scatterplot of records")
plt.ylabel("Number of records")
# Rotate and right-align the x-axis tick labels
plt.xticks(rotation=45, ha='right')
# Place one major tick per month and label it with the abbreviated month name
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))
plt.savefig('scatterplot.png')
# Show Chart
plt.show()
Since the original database covers records from the fourth to the fifth year of Jiahe, we also created Figure #7, a scatter plot of the records from both years, precise to the day. In terms of payment time, the two years are generally similar.
The text in the database also distinguishes two types of land: standard fields and extra-labor fields. According to existing research, the rent of a standard field is fixed for the two years of the renting period, while the rent of an extra-labor field is flexible and may be adjusted by the officials.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Reading Excel files and data processing
df = pd.read_excel('landType.xlsx')
# Extracting "Standard" and "Extra" data
data = df.iloc[1:73, 0:2]
sns.set_style("dark")
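# Create a hexbin joint plot with marginal histograms
g = sns.jointplot(x='Standard', y='Extra', data=data, kind='hex', cmap='YlGnBu', gridsize=40, height=6, ratio=2)
# Switch the hexbin colormap to reversed grayscale (this overrides cmap='YlGnBu' above)
g.ax_joint.collections[0].set_cmap('gray_r')
# Start the color scale just above zero, keeping the original upper limit
g.ax_joint.collections[0].set_clim(0.01, g.ax_joint.collections[0].get_clim()[1])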
g.set_axis_labels('Standard field area (mu)', 'Extra-labor field area (mu)')
plt.savefig('jointplot.png')
plt.show()
Figure #8 is a hexbin joint plot with marginal histograms, showing the relation between the standard field area and the extra-labor field area. In most regions, the standard field area is between 0 and 500 mu, while in some it exceeds 2000 mu. The extra-labor field area reaches at most 200 mu.
Figure #9 shows the proportion and area of land types by region. The standard field accounts for the largest proportion in all regions, which means that in most cases the renting period and the rent are regular.
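The proportion of each land type within a region can be computed by dividing each area by the region's total. The sketch below reuses the column names `Standard` and `Extra` from the code above, but the place names and values are hypothetical.

```python
import pandas as pd

# Hypothetical areas in mu per region; values are illustrative only.
land = pd.DataFrame(
    {"Standard": [500.0, 2000.0, 120.0], "Extra": [100.0, 200.0, 30.0]},
    index=["Place A", "Place B", "Place C"],
)

# Divide each row by its row total to get the share of each land type.
share = land.div(land.sum(axis=1), axis=0)
print(share.round(3))
```

Each row of `share` sums to 1, so the two columns can be drawn directly as a stacked bar per region.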
Finally, the two graphs below present statistics on the area of drought land.
Figure #10 shows the overall proportion of drought land in all regions over the two years, while Figure #11 compares the proportion of drought land in selected regions across the two years.
Drought was common and was relatively severe in some regions, such as those in Figure #11, which is consistent with existing research showing that drought hit Changsha County from the fourth to the fifth year of Jiahe.
However, the severity of the drought varied across regions and periods.
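A year-by-year comparison like the one in Figure #11 can be sketched as a region-by-year table of drought shares. The layout assumed here (one row per region and year, with columns `total_mu` and `drought_mu`) and all values are hypothetical.

```python
import pandas as pd

# Hypothetical totals and drought areas in mu; not taken from the database.
drought = pd.DataFrame({
    "region": ["Place A", "Place A", "Place B", "Place B"],
    "year": ["4th", "5th", "4th", "5th"],
    "total_mu": [400.0, 500.0, 200.0, 250.0],
    "drought_mu": [300.0, 100.0, 50.0, 125.0],
})

# Share of drought land in each region and year
drought["drought_share"] = drought["drought_mu"] / drought["total_mu"]
# One row per region, one column per year
by_year = drought.pivot(index="region", columns="year", values="drought_share")
print(by_year)
```

Reading across a row shows how drought severity changed between the two years in one region; reading down a column compares regions within the same year.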