Alessandra De Mulder

Up for auction: one historian’s opinion on digital tools

CSG-member Alessandra De Mulder's PhD dissertation examined eighteenth-century material culture through the lens of London auction advertisements, employing digital text analysis methods such as word embeddings to study how objects like tables were described and valued. By combining computational linguistics with traditional historical methods, it traced shifting patterns of taste and consumption across thousands of sources. This is a slightly adapted version of the epilogue of the dissertation. The PhD defense takes place on 27 March 2026.

I am sure that some readers of this epilogue expect an ode to digital humanities, much like this entire dissertation can be read as an ode to the eighteenth century. Even though you will be hard-pressed to find a bigger enthusiast of everything digital than me, this epilogue is not meant to sing digital humanities’ praises. Rather, it is meant as a critical reflection on the uses of digital methods in historical research.[1]

Let me start by stating that I do not think we are all digital historians, an opinion many have voiced before me. [2] Instead, I think we are twenty-first-century historians using computers instead of typewriters and digital inventories instead of index cards; much like historians before us stopped using quills when pens were invented and abandoned manuscript transcription when photocopiers arrived. Nor do I think we should all be digital historians, just as not everybody should be an economic, social, cultural, political… historian. Or, perhaps, as Burrows and Roe suggest in their volume on the digitised eighteenth century, we will soon stop talking about ‘digital humanities’ altogether because these methods will simply be part of what humanities scholars do.[3] Or maybe this would-be craze will quietly fade into the background. I do strongly believe that variety is the spice of life, and that history needs digital and programming historians.

Apparently, this is a hot take. I have lost count of how many times I, or colleagues around me, have been asked questions like ‘Why can’t you just do it the ‘normal’ way?’ or ‘I don’t see how these results could not have been found with traditional methods’…, or have simply been met with an awkward silence. In my opinion, the first question should be banned at any academic gathering that aims to foster a fruitful exchange of ideas. The second is not even a question, but it can usually be answered with: ‘No, because I could not have read tens of thousands of pages manually’. And really, that is all ‘big data’ means in this context: more pages than you could reasonably read yourself. [4] As Jo Guldi describes it, a dangerous ‘no man’s land’ has long sprawled between computational experts and humanities scholars, with few willing to traverse this territory. [5] Certainly, the first wave of historians applying statistics experienced much the same pushback. Those quantitative historians of the 1970s and 1980s were questioned about whether their results could be trusted or replicated, and clearly, advocates of digital methods are still working through similar growing pains. [6] Evidently, some evolutions take a few waves of historians to be accepted as a new reality.

More tools, more questions?

Becoming a digital or programming historian entails educating and familiarising yourself with methods, concepts and ideas that originally come from other fields, and then figuring out how to apply these to history. Not to claim that history is a ‘final boss’ or ‘mother discipline’, but looking at how history has done so with disciplines like economics, sociology and philosophy in the past, I do not see why doing the same with computational linguistics should be such an impossible hurdle or a ridiculous idea. I will admit that all the terminology (word embeddings, vector semantics, skip-grams, and whatnot) can feel overwhelming at first. But here is the thing: the specific jargon is not the point. What matters is understanding that these are tools to help answer questions you are curious about. Curiosity-driven research remains at the heart of what we (should) do, and the methods are simply means to pursue those questions at a scale we otherwise could not manage. Many historians using digital methods do so because they are driven by the never-ending quest to discover wie es eigentlich gewesen. In other words, trying to write history ‘as it was’; a pursuit started by Ranke in the nineteenth century and subsequently redefined as the opposite of what a ‘good’ historian should aspire to write with every historiographical ‘turn’ we saw in the century thereafter. If being a good historian is the endgame, this should not be an either-or scenario but instead a cumulative process. What Guldi calls ‘hybrid knowledge’, namely a genuinely new form of insight produced where historical thinking meets computational methods, requires precisely this.[7] Rens Bod has termed this cumulative approach ‘Humanities 3.0’, which is essentially applying the hermeneutic and critical tradition to the tools and patterns obtained through digital methods.
In other words, digital humanities (Humanities 2.0) can identify patterns for us, but these only become meaningful when we subject them to the traditional interpretative rigour of ‘Humanities 1.0’.[8]
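For readers curious about what a term like ‘skip-gram’ actually hides, here is a minimal, purely illustrative sketch (not taken from the dissertation’s own pipeline): word-embedding models of this family learn from (target, context) word pairs extracted within a small window around each word. The example phrase is invented for illustration.

```python
# Illustrative sketch only: how a skip-gram style model "sees" text.
# Each word is paired with its neighbours inside a small window;
# a word-embedding model then learns from millions of such pairs.

def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs within `window` words of each target."""
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

# A hypothetical advert-like phrase:
tokens = "neat mahogany dining table".split()
print(skipgram_pairs(tokens, window=1))
```

Nothing more mysterious than that lies behind the jargon: the model’s notion of a word’s meaning is built entirely from the company it keeps in such pairs.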

The approach of this dissertation was definitely the latter, and the insights provided by linguists were invaluable, as became clear in the first section of this dissertation. This ‘digital turn’ towards linguistics was, once again, nothing new. Previously, historians have sought to reconcile history with demography, anthropology, economics and sociology, approaches that were likewise mainly quantitatively driven. The same applies to this dissertation, where linguists looked at the most popular adjectives and classified them as either evaluative or objective. To computational linguists, I am indebted for developing relevant methods that work on older, noisy English, and for publishing data and scripts so freely. The beauty of open-source software is that it can be tinkered with and improved by anyone who spots a problem, has a better idea, or simply has different needs for their research. [9] I cannot express enough gratitude for their enthusiasm and help throughout the years.
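To make that classification step concrete, here is a toy sketch of the general idea: count adjectives across advertisement texts and label each as ‘evaluative’ or ‘objective’ using a hand-curated lexicon. The lexicon and the example phrases below are invented for illustration and are not the dissertation’s actual data or code.

```python
from collections import Counter

# Hypothetical lexicon: each adjective is hand-labelled in advance.
LEXICON = {
    "elegant": "evaluative", "neat": "evaluative", "genteel": "evaluative",
    "mahogany": "objective", "large": "objective", "square": "objective",
}

# Invented advert-like phrases standing in for thousands of real ones.
adverts = [
    "a neat mahogany table",
    "an elegant large sideboard",
    "a genteel mahogany desk",
]

# Count only the words present in the lexicon, then report each
# adjective with its frequency and its label, most frequent first.
counts = Counter(w for ad in adverts for w in ad.split() if w in LEXICON)
for adjective, n in counts.most_common():
    print(adjective, n, LEXICON[adjective])
```

At scale, the interesting historical question is not the counting itself but how the balance between evaluative and objective description shifts across time and across object types.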

However, it turned out that said enthusiasm was so infectious that at times I (almost) lost sight of the prize, namely answering historical questions. As with applying statistics to historical data, the data should not be the end goal. Statistics and digital methods are wonderful tools, but they are just that: tools, i.e. things to use to analyse, order, group, select…your data. [10] And let us be honest about what these tools are: they might seem objective because they produce the same results each time, but choices are baked into every algorithm: from how different factors are weighted and which categories are created, to the basic process of how messy text was turned into something a computer could read. [11] There is a world of difference between ‘this is an interesting pattern’ and ‘this pattern proves my argument’. [12]

One thing digital methods genuinely excel at is putting things in perspective. When you are working with thousands of sources, you can finally tell whether what you found is actually typical or just a weird outlier. This matters more than you might think at first. For instance, I initially wanted to study tables because I kept noticing them everywhere and found the sideboard with a built-in wine cooler absolutely fascinating. However, I only really committed to this topic when I could confirm that tables were indeed ubiquitous across the sources and not just present in the handful of advertisements that caught my eye. Without that broader perspective and context that the digital text analysis brought about, I might have considered it a quirky anomaly rather than a meaningful pattern. Digital methods let us figure out what was common, what was rare, and when things started to shift. It turns close reading from ‘here is an interesting thing I found’ into ‘here is an interesting thing I found, and here is how it relates to everything else’. [13] When you have processed the data, that is when the historical work truly starts, namely, by contextualising and contrasting your findings with existing historiography.

Having said that, I do not intend to be too rigid on this front. Much like eighteenth-century natural philosophers grappling with their newfangled microscopes, air pumps, and electrical machines, we require time to get to grips with our digital reality and to absorb developments that we cannot yet anticipate or fully explain. Digital humanities and history need both historians who use methods to answer questions directly, and historians who build tools and experiment, keeping questions in mind but allowing space to tinker and play. [14] Both approaches are equally valuable. That being said, if you take one thing from this epilogue, let it be this: if it does not serve to answer a historical question, throw it in the bin!

There is no Great Disconnect between sources and historians where the ‘machine’ does all the thinking work for you. The danger might even lie in the opposite direction: technical constraints pushing us towards working with metadata about sources rather than the sources themselves. [15] That would be a disaster. In fact, the machine cannot do anything you do not teach it (to teach itself). If anything, it is more work reading a considerable number of sources to figure out which methods work and then prepping the data for analysis. For this dissertation, this process meant handling over 5,000 pages manually, multiple times, especially upon realising that the originally chosen method was not sufficient, and that other avenues were not feasible to explore while the data remained as unstructured as it was.

I often struggled to find the right method, precisely because I did not (yet) have sufficient knowledge to know where to even begin to look. I was, however, incredibly fortunate to have experienced and knowledgeable people around me. Mike Kestemont and Sara Budts possess astonishing amounts of expertise and were endlessly generous with their time. Alongside their guidance, I took in-person coding courses aimed at researchers and trawled through countless YouTube tutorials. Yet even with that preparation, simply copy-pasting existing methods rarely worked: they were not designed for historians, and certainly not for the kind of questions I wanted to ask. What I ended up building was a Frankenstein collage of pre-existing tools and scripts, stitched together with a good deal of hope and a fair amount of trial and error. If I were to start this project today, I would probably still follow a similar path: seek out experts, take structured courses. However, I would also lean more heavily on AI for the coding itself, provided I understood every single line it produced.

I am glad and grateful to have done those trainings and to have interacted with such experts, because that knowledge base gave me both the confidence and the ‘competence’ to try things out. I have often called this ‘begeleid prutsen’ (supervised fumbling) as a method, and have suggested it to many colleagues and students throughout the years. That foundation also made me a better user of AI, because I never ran a single line of code that I did not understand completely. Without that understanding, AI-generated code is a black box, and black boxes are the antithesis of good historical method. The learning curve was steep, frequently frustrating, and occasionally felt like an exercise in futility. But looking back, every stumble taught me something about the data, the tools, or the questions I was trying to answer. Would I recommend it to everyone? Not necessarily. But for anyone whose curiosity is piqued and whose research questions demand it, the investment is worth every hour.

Me after the umpteenth approach did not yield the results I expected, or The Disaster, Francis Wheatley (1747-1801)

Being a digital historian often felt reminiscent of being a medieval scribe, and that was before any historical analysis took place. [16] Yet this manual process paid off. A recent article finds that the manual annotation work we performed for ‘linguistic value construction’ is (slightly) better than what GPT-4 can do. [17] This is because we can perceive the difference between ‘servant’s horse’ as objective (presumably not such a great horse) and ‘gentleman’s desk’ as evaluative (evocative of certain values). Guldi recounts how her students, lacking historical training, concluded that women had become ‘more and more ignorant’ over the nineteenth century when analysing parliamentary debates, rather than recognising that Parliament had become more biased against women. [18] This is precisely why historical ‘instinct’ matters: algorithms cannot perceive such distinctions on their own. And this is where the added value of digital methods lies, namely in the educated guessing that happens when techniques of the digital humanities are combined with historical questions.

In a way, I have always seen writing history as making educated guesses since, barring the invention of time machines, we can never truly grasp what happened to whom and why. On the one hand, this is frustrating: how do you decide who is right or wrong in a debate? A single source analysed by four historians will yield at least five different opinions, and just one forgotten document can undermine an entire school of thought. On the other hand, this forms a great opportunity because anyone can provide an interesting perspective or make a groundbreaking discovery.[19] All one needs to do is train in the historical method, where, through trial and error, one learns to develop a certain, maybe slightly intangible, historical ‘instinct’. This ‘instinct’ is a skill like any other that can be honed for every little cog that makes up historical research, and it is often the result of years of accumulated knowledge and insight that cannot immediately be made explicit. An important stepping stone in this regard is reading primary sources, because that is what makes history come alive. This does not disappear when doing digital history.

The pink elephant in the room: AI and the historian

One cannot discuss digital methodologies in this day and age without addressing artificial intelligence. AI is developing at a – frankly slightly alarming – rate and is widely available as never before. Whilst Generative AI is potentially well on its way to being truly able to take all possible factors into account, the true added value of trained historians might lie in subconsciously forgetting one or more factors, as well as reasoning from a more intuitive place, one that AI does not possess at this point (or hopefully ever). Allow me to elaborate.

In order to become a digital historian, we must hone our digital instincts. The assumption that all historians possess strong computer skills simply because they work with digital tools is fundamentally flawed. Technology in all its facets is changing at such a breakneck speed that any historian, digitally inclined or not, may feel like they are lagging behind. These gaps in fundamental digital literacy contribute to increased time pressure, inefficiency, and frustration in academic work. As a discipline, we must embrace the principle that ‘there are no stupid questions, only stupid answers’ and recognise that learning where to find information is as important as ever. On top of that, we must also consider whom or what to ask: whether a search engine, (Generative) AI, a close colleague, or an expert in another field. For example, I have largely stopped googling and searching Stack Overflow for solutions to coding issues and, instead, started to ask Claude.ai. This is not about admitting weakness, but about developing the critical skill of knowing which tool or resource is appropriate and most efficient for each task. Using AI to write something for you is miles away from using it to code more efficiently. Here is where the shoe pinches: individual historians cannot create this learning culture when their institutions remain stuck in a rut, leaving them little room to adapt quickly and be flexible when new developments occur. [20] We need systemic change, not just individual effort.

Beyond our own research, we as historians have a responsibility to model critical AI engagement in society. First and foremost, we must spread privacy awareness. The adage holds: ‘If something is (nearly) free, you are the product’. In today’s digital economy, consumers themselves have become the commodity: our data, habits, and attention are put in ‘lots’ and sold to the highest bidder for targeted advertising and behavioural prediction. We see this principle in action with dummy artists on Spotify or facial recognition in photo applications. Historians must ask themselves: what happens to our research data when we share it with GenAI platforms? Are we inadvertently feeding proprietary archives or unpublished findings into systems that may redistribute this information or use it for who-knows-what purpose? Furthermore, we cannot ignore the environmental impact of AI usage, as the computational power required for these tools comes at a considerable ecological cost.

When it comes to employing AI in historical research specifically, one core principle should guide our practice: ‘Never let AI do anything you cannot yet do (or understand) yourself’. This is not Luddism but rather a sound methodological practice. Much like we would not accept translations of sources we cannot verify or statistical analyses we cannot interpret, we should not deploy AI tools without understanding their underlying logic and limitations.

The competences we should cultivate include several key elements. First, targeted and effective searching remains a core historical skill, and this foundation is essential in the AI age. Second, historians must learn to recognise AI limitations, even as the boundaries continuously shift with each new model release. Third, and here the eighteenth century offers us guidance, we must adopt a principle of critical verification. As David Hume argued in his Enquiry Concerning Human Understanding (1748), knowledge cannot be attained through abstract reasoning alone, but ‘arises entirely from experience’ when we observe how things consistently work in practice.[21] This principle should guide both our own research practice and function as a cornerstone of methodological rigour, directly aligned with traditional source criticism.

To make these abstract principles concrete, historians should share examples from their own research or that of colleagues, illustrating both good and bad practices. We should be even more explicit about our methods in ways we never had to be before. [22] Ten years ago, everyone understood how research in a library worked; now, with digital tools, we cannot rely on that shared understanding anymore. This reminds me of Jeffrey Ravel’s observation that the relationship between digital scholars and their audiences mirrors early modern theatre: messy, interactive, and sometimes chaotic. Eighteenth-century actors were heckled and performances interrupted. Only much later did audiences ‘learn’ to sit quietly in the dark. Perhaps we are in that unruly early phase of digital scholarship, still figuring out how to engage with each other. [23] For instance, using AI to identify errors or standardise spelling in a database is perfectly acceptable; this is tedious work where human error is likely, and the stakes are low. However, using AI to categorise sources is problematic, as this task is essential for developing source knowledge and historical understanding, i.e. honing our historical instincts. The categorisation process itself teaches us about our material; outsourcing it to AI means outsourcing the thinking, and also the pleasure of learning.

Bridging worlds: the historian as translator?

The digital historian's natural habitat: permanently attempting to do the splits

I call myself a digital historian, but I am hesitant to identify as a programming historian. In comparison with the computational linguists that I encountered, I am certainly not a programmer. Even though, for many historians, it probably seems like I am. A wise woman once defined programmers as ‘people who know how to target their Google searches’; by that definition, all historians are programmers. What I have learned to be, however, is a ‘translator’ between anything digital and anything historical. This is what Guldi terms ‘hybrid knowledge’, as she applauds the ‘cyborg historian’ who seamlessly merges human insight with computational power. [24] I think the metaphor of a translator is more apt: someone who builds bridges between worlds whilst maintaining critical insights from both. I see this as the most important thing that I have learned on my digital humanities journey and consider this a crucial skill that any digital or programming historian should possess. Because ultimately, being a translator means knowing when a digital method can answer a historical question, and, most importantly, when it cannot. Make no mistake, there is plenty of historical research that absolutely does not benefit from implementing digital tools, no matter how much pressure there is to use digital buzzwords in order to secure funding. That discernment, that ability and knowledge to move and build bridges between worlds while keeping the historical question at the centre, is what distinguishes doing digital history from merely using digital tools.


Notes

[1] Since hindsight is 20/20, this was written in the autumn of 2025 and thus is based solely on the tools and developments available at the time of writing.

[2] This idea can be found more elaborately in Ian Milligan, The Transformation of Historical Research in the Digital Age (Cambridge University Press, 2022).

[3] Simon Burrows and Glenn Roe, ‘Introduction: Digitizing Enlightenment’, in Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies (Liverpool University Press, 2020), preface XVII.

[4] Ian Milligan, History in the Age of Abundance?: How the Web Is Transforming Historical Research (McGill-Queen’s University Press, 2019), 56.

[5] Jo Guldi, The Dangerous Art of Text Mining: A Methodology for Digital History (Cambridge University Press, 2023), 7–10, https://doi.org/10.1017/9781009263016.

[6] Shawn Graham et al., Exploring Big Historical Data: The Historian’s Macroscope, Second edition (World Scientific, 2022), 31–32.

[7] Guldi, The Dangerous Art of Text Mining, 13–15.

[8] Rens Bod, ‘Who’s Afraid of Patterns?: The Particular versus the Universal and the Meaning of Humanities 3.0’, BMGN - Low Countries Historical Review 128, no. 4 (2013): 176, https://doi.org/10.18352/bmgn-lchr.9351.

[9] Graham et al., Exploring Big Historical Data, 39–40.

[10] For more examples on digital methods as tools, see Milligan, History in the Age of Abundance?, 59; Marnix Beyen, ‘Thick Description beyond the Digital Space’, Humanities 5, no. 1 (2016): 2, https://doi.org/10.3390/h5010002; Marnix Beyen, ‘A Higher Form of Hermeneutics? The Digital Humanities in Political Historiography’, BMGN - Low Countries Historical Review 128, no. 4 (2013): 169, https://doi.org/10.18352/bmgn-lchr.9349.

[11] Milligan, History in the Age of Abundance?, 59.

[12] Graham et al., Exploring Big Historical Data, 34.

[13] An impressive example can be found in this article: Marly Terwisscha Van Scheltinga et al., ‘(Fe)Male Voices on Stage: Finding Patterns in Lottery Rhymes of the Late Medieval and Early Modern Low Countries with and without AI’, BMGN - Low Countries Historical Review 139, no. 1 (2024): 4–28, https://doi.org/10.51769/bmgn-lchr.13872.

[14] This thought is inspired by this provoking blog post: Tom Scheinfeldt, ‘Where’s the Beef? Does Digital Humanities Have to Answer Questions?’, Found History, 12 May 2010, https://foundhistory.org/2010/05/wheres-the-beef-does-digital-humanities-have-to-answer-questions/.

[15] Milligan, History in the Age of Abundance?, 141.

[16] This sentiment was also shared by Thomas Smits and Melvin Wevers, ‘The Agency of Computer Vision Models as Optical Instruments’, Visual Communication, 2021, 342, 345, https://doi.org/10.1177/1470357221992097; Graham et al., Exploring Big Historical Data, 49.

[17] Andres Karjus, ‘Machine-Assisted Quantitizing Designs: Augmenting Humanities and Social Sciences with Artificial Intelligence’, Humanities and Social Sciences Communications 12, no. 1 (2025): 1–18, https://doi.org/10.1057/s41599-025-04503-w, discussing Alessandra De Mulder, Lauren Fonteyn, and Mike Kestemont, ‘Linguistic Value Construction in 18th-Century London Auction Advertisements: A Quantitative Approach’, Proceedings of the Computational Humanities Research Conference 2022 3 (2022): 92–113.

[18] Guldi, The Dangerous Art of Text Mining, 13–14.

[19] Let me caveat this: Although the discipline is no longer defined solely by the voices of elite nineteenth-century European men, historical scholarship still has considerable ground to cover in meaningfully integrating perspectives from the Global South and recognizing those communities as authoritative narrators of their own pasts.

[20] Howard Hotson, ‘Cultures of Knowledge in Transition: Early Modern Letters Online as an Experiment in Collaboration, 2009-2018’, in Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies (Liverpool University Press, 2020), 130.

[21] David Hume, An Enquiry Concerning Human Understanding (1748), Section IV, Part I.

[22] Graham et al., Exploring Big Historical Data, 60.

[23] Jeffrey S. Ravel, ‘The Comédie-Française Registers Project: Questions of Audience’, in Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies (Liverpool University Press, 2020), 150.

[24] Guldi, The Dangerous Art of Text Mining, 35–38.


Bibliography

Beyen, Marnix. ‘A Higher Form of Hermeneutics? The Digital Humanities in Political Historiography’. BMGN - Low Countries Historical Review 128, no. 4 (2013): 164–70. https://doi.org/10.18352/bmgn-lchr.9349.

Beyen, Marnix. ‘Thick Description beyond the Digital Space’. Humanities 5, no. 1 (2016). https://doi.org/10.3390/h5010002.

Bod, Rens. ‘Who’s Afraid of Patterns?: The Particular versus the Universal and the Meaning of Humanities 3.0’. BMGN - Low Countries Historical Review 128, no. 4 (2013): 171–80. https://doi.org/10.18352/bmgn-lchr.9351.

Graham, Shawn, Ian Milligan, Scott Weingart, and Kimberley Martin. Exploring Big Historical Data: The Historian’s Macroscope. Second edition. World Scientific, 2022.

Guldi, Jo. The Dangerous Art of Text Mining: A Methodology for Digital History. Cambridge University Press, 2023. https://doi.org/10.1017/9781009263016.

Hotson, Howard. ‘Cultures of Knowledge in Transition: Early Modern Letters Online as an Experiment in Collaboration, 2009-2018’. In Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies. Liverpool University Press, 2020.

Hume, David. An Enquiry Concerning Human Understanding. 1748.

Karjus, Andres. ‘Machine-Assisted Quantitizing Designs: Augmenting Humanities and Social Sciences with Artificial Intelligence’. Humanities and Social Sciences Communications 12, no. 1 (2025): 1–18. https://doi.org/10.1057/s41599-025-04503-w.

Milligan, Ian. History in the Age of Abundance?: How the Web Is Transforming Historical Research. McGill-Queen’s University Press, 2019.

Milligan, Ian. The Transformation of Historical Research in the Digital Age. Cambridge University Press, 2022.

Ravel, Jeffrey S. ‘The Comédie-Française Registers Project: Questions of Audience’. In Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies. Liverpool University Press, 2020.

Scheinfeldt, Tom. ‘Where’s the Beef? Does Digital Humanities Have to Answer Questions?’ Found History, 12 May 2010. https://foundhistory.org/2010/05/wheres-the-beef-does-digital-humanities-have-to-answer-questions/.

Smits, Thomas, and Melvin Wevers. ‘The Agency of Computer Vision Models as Optical Instruments’. Visual Communication, 2021, 329–49. https://doi.org/10.1177/1470357221992097.

Terwisscha Van Scheltinga, Marly, Sara Budts, and Jeroen Puttevils. ‘(Fe)Male Voices on Stage: Finding Patterns in Lottery Rhymes of the Late Medieval and Early Modern Low Countries with and without AI’. BMGN - Low Countries Historical Review 139, no. 1 (2024): 4–28. https://doi.org/10.51769/bmgn-lchr.13872.