Analytical Framework
This project adopts a multi-layered analytical framework designed to capture different dimensions of cultural change. By combining authorship analysis, semantic topic modelling, and lexical examination of food-related language, it seeks to construct a more comprehensive account of how the magazine’s content evolved over time.
Semantic Modelling with BERTopic
Traditional topic modelling approaches such as Latent Dirichlet Allocation (LDA) treat each document as a bag of words: topics are inferred from word co-occurrence counts, and word order and context are ignored. While effective for certain tasks, such models are limited in their ability to capture semantic relationships, particularly in texts that involve metaphor, mixed languages, or context-dependent meaning.
To address these limitations, this project employs BERTopic, a model that uses sentence-transformer embeddings to represent each document as a high-dimensional semantic vector. BERTopic then clusters these vectors and describes each cluster by its most distinctive terms. In this framework, articles are grouped according to meaning rather than surface-level vocabulary, so conceptually related texts can be clustered together even when they share few keywords.
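The embedding and clustering step could be sketched as follows. This is a minimal illustration rather than the project's actual configuration: the function name fit_topics, the embedding model name, and the min_topic_size value are all assumptions, since the text does not specify them. The assumed model is a multilingual sentence-transformers model that produces 384-dimensional vectors.

```python
from typing import List

# Assumed model name: the project does not name its embedding model. This
# multilingual sentence-transformers model outputs 384-dimensional vectors.
ASSUMED_MODEL = "paraphrase-multilingual-MiniLM-L12-v2"


def fit_topics(docs: List[str], model_name: str = ASSUMED_MODEL,
               min_topic_size: int = 10):
    """Embed articles and cluster them into topics with BERTopic.

    Returns (topic_model, topics, embeddings), where topics[i] is the
    topic id assigned to docs[i].
    """
    if not docs:
        raise ValueError("docs must contain at least one article")

    # Imported lazily so the heavy dependencies load only when needed.
    from bertopic import BERTopic
    from sentence_transformers import SentenceTransformer

    # Step 1: represent every article as a dense semantic vector.
    embedder = SentenceTransformer(model_name)
    embeddings = embedder.encode(docs, show_progress_bar=False)

    # Step 2: cluster the precomputed vectors and extract topic keywords.
    topic_model = BERTopic(embedding_model=embedder,
                           min_topic_size=min_topic_size)
    topics, _ = topic_model.fit_transform(docs, embeddings)
    return topic_model, topics, embeddings
```

After fitting, topic_model.get_topic_info() returns a summary table of the discovered topics; topic id -1 collects articles that BERTopic treats as outliers. Returning the embeddings alongside the model means the same vectors can be reused for later visualisation without re-encoding the corpus.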

Dimensionality Reduction with PCA
Because the semantic embeddings are high-dimensional, an additional step is required to make their patterns interpretable. Principal Component Analysis (PCA) is therefore applied to project the 384-dimensional embedding vectors onto their first two principal components, the two orthogonal directions along which the embeddings vary most. This two-dimensional representation enables the visualisation of thematic distributions and makes it possible to observe how the “centre of gravity” of the corpus, understood here as the mean position of the articles from a given period, shifts across different historical periods.
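The projection and the per-period centroid can be illustrated with a short sketch. It assumes NumPy is available and computes PCA through a singular value decomposition; the function names pca_2d and period_centroids, the period labels, and the random vectors standing in for real embeddings are all hypothetical. A library implementation such as scikit-learn's PCA would serve equally well.

```python
import numpy as np


def pca_2d(embeddings):
    """Project row vectors onto their first two principal components.

    Returns (coords, explained_ratio): coords has shape (n_docs, 2) and
    explained_ratio gives the share of variance captured by each axis.
    """
    X = np.asarray(embeddings, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    coords = X_centered @ Vt[:2].T
    explained_ratio = (S[:2] ** 2) / np.sum(S ** 2)
    return coords, explained_ratio


def period_centroids(coords, periods):
    """Mean 2-D position ("centre of gravity") of the articles per period."""
    periods = np.asarray(periods)
    return {p.item(): coords[periods == p].mean(axis=0)
            for p in np.unique(periods)}


# Placeholder data: random vectors stand in for real article embeddings.
rng = np.random.default_rng(0)
demo_embeddings = rng.normal(size=(60, 384))
demo_embeddings[30:] += 0.5  # simulate a shift in the later period
demo_periods = ["period_a"] * 30 + ["period_b"] * 30

demo_coords, demo_ratio = pca_2d(demo_embeddings)
demo_centroids = period_centroids(demo_coords, demo_periods)
```

Plotting demo_coords coloured by period, with demo_centroids overlaid, would show the kind of shift described above. The explained-variance ratio is worth reporting alongside any such plot, since two components of a 384-dimensional space may capture only a small share of the total variance.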
Food as Cultural Proxy
The decision to focus on food-related vocabulary reflects a broader methodological interest in everyday language as a carrier of cultural meaning. Food references are often context-specific and historically situated, encoding information about class, global influence, social practices, and identity formation.
By isolating and analysing food-related terms within the corpus, the project introduces an alternative lens through which cultural change can be examined. This approach moves beyond conventional emphases on industry discourse and instead highlights the significance of seemingly mundane linguistic patterns.
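One simple way to isolate food-related terms is to match tokens against a curated lexicon and normalise the counts by the size of each period's text. The sketch below assumes this lexicon-based approach, which the text does not confirm; the FOOD_TERMS set, the function name food_term_rates, and the sample articles are hypothetical placeholders.

```python
import re
from collections import Counter

# Hypothetical lexicon: a real one would be curated for the corpus.
FOOD_TERMS = {"tea", "rice", "noodles", "coffee", "bread"}

# Runs of letters only (Unicode-aware); digits and punctuation are dropped.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def food_term_rates(articles):
    """Rate of each food term per 1,000 tokens, grouped by period.

    articles: iterable of (period, text) pairs.
    Returns {period: {term: rate_per_1000_tokens}}.
    """
    token_totals = Counter()
    food_counts = {}
    for period, text in articles:
        tokens = TOKEN_RE.findall(text.lower())
        token_totals[period] += len(tokens)
        counts = food_counts.setdefault(period, Counter())
        counts.update(t for t in tokens if t in FOOD_TERMS)

    rates = {}
    for period, counts in food_counts.items():
        total = token_totals[period]
        if total == 0:
            continue  # skip periods with no usable text
        rates[period] = {term: 1000 * n / total for term, n in counts.items()}
    return rates


sample_articles = [
    ("period_a", "Tea and rice were served."),
    ("period_b", "Coffee, coffee and bread."),
]
sample_rates = food_term_rates(sample_articles)
```

Normalising by token count keeps periods with more surviving text from dominating the comparison. This single-token matcher would miss multi-word dishes and any language written without spaces between words, so multilingual material would need a tokeniser suited to each language.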

Workflow
The analytical process follows a structured pipeline that begins with data collection and extends through multiple stages of transformation and interpretation. After obtaining the digitised corpus, the data undergoes cleaning and preprocessing, including correction of OCR errors and standardisation of metadata. Texts are then tokenised and prepared for embedding, with particular attention paid to multilingual content.
Subsequently, semantic embeddings are generated and clustered using BERTopic, followed by dimensionality reduction through PCA to enable visual analysis. Finally, the results are interpreted in relation to historical and cultural contexts, bridging computational outputs with humanities-based inquiry.
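The cleaning stage at the start of this pipeline might look like the sketch below. It covers only generic OCR repairs (Unicode normalisation, rejoining words split across line breaks, whitespace collapsing) and a simple metadata pass. The function names and the metadata fields title, author, and year are assumptions, as the text does not describe the corpus schema or the specific OCR errors encountered.

```python
import re
import unicodedata


def clean_ocr_text(raw):
    """Apply generic, conservative repairs to OCR output."""
    # Normalise ligatures and width variants (e.g. the "fi" ligature).
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words hyphenated across a line break: "maga-\nzine" -> "magazine".
    # Caveat: a genuine compound broken at the hyphen is merged as well.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse remaining line breaks and repeated spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def standardise_metadata(record):
    """Return a record with cleaned strings and an integer year (or None).

    The keys used here are a hypothetical schema.
    """
    year_match = re.search(r"\d{4}", str(record.get("year", "")))
    return {
        "title": clean_ocr_text(str(record.get("title", ""))),
        "author": clean_ocr_text(str(record.get("author", ""))),
        "year": int(year_match.group()) if year_match else None,
    }


cleaned = clean_ocr_text("The maga-\nzine  \ufb01rst\nappeared")
record = standardise_metadata(
    {"title": " Food  Notes ", "author": "A. Writer", "year": "c. 1957"}
)
```

Corpus-specific corrections, such as recurring character confusions in a particular typeface, would be layered on top of these generic steps once the typical errors in the digitised text have been inspected.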
