Text Data Preparation: A Practice in R using the Sheng Xuanhuai Collection