Session Summaries by Altan-Ricci

written by: — October 11, 2024

Data, Metadata and Tropy, 25-09-2024

This course explored how historical practices have evolved in the digital era. The beginning of the course differentiated between “digital history”, which involves leveraging new technologies, and “history in the digital age”, where historians work with digitized sources. A key focus of the course was the crucial role of data management in historical research. Initially rooted in mathematics through Euclid (c. 300 B.C.), the concept of “data” broadened in the 18th century. Today, “data” refers to interpreted elements actively shaped by researchers. The digital revolution has led to an abundance of types of data that were once scarce. The course emphasized that digital transformation affects every stage of historical research, from source discovery and analysis to interpretation, dissemination, and preservation. In this context, “data” becomes “research data”. Historians engage with diverse sources, ranging from texts to physical artifacts, which are structured through “metadata”. Central to the course, “metadata”, defined simply as “data about data”, is essential for organizing and interpreting historical materials. We examined various examples of metadata, from digital heritage collections to social media posts and online photo archives, highlighting its significance for historians. We also learned about the role of data repositories in storing and preserving different types of data, ensuring future accessibility for research. In the final section of the course, we were introduced to “Tropy”, a software tool designed for managing and archiving digitized historical sources. It allows historians to explore innovative ways of organizing and interpreting digital materials, offering valuable support in their research endeavors. 255 words
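To make “data about data” more concrete, here is a minimal sketch of a metadata record for one digitized source. The field names loosely follow common descriptive elements (title, creator, date, type, rights). The `SourceMetadata` class, the `describe` helper, and all sample values are invented for illustration and are not Tropy's actual data model.

```python
from dataclasses import dataclass, asdict


@dataclass
class SourceMetadata:
    """Descriptive fields for one digitized source (hypothetical schema)."""
    title: str
    creator: str
    date: str         # date of the original document, written as YYYY-MM-DD
    source_type: str  # e.g. "letter", "photograph", "map"
    rights: str
    file_name: str    # the digitized image that this record describes


def describe(record: SourceMetadata) -> str:
    """Return a one-line, human-readable summary of a record."""
    return f"{record.title} ({record.source_type}, {record.date}) by {record.creator}"


# Invented example values, not taken from any real collection.
photo = SourceMetadata(
    title="Market square",
    creator="Unknown photographer",
    date="1936-05-01",
    source_type="photograph",
    rights="Public domain",
    file_name="scan_0001.jpg",
)

print(describe(photo))
print(asdict(photo))  # the same record as a plain dictionary, ready for export
```

The image file is the data; everything in the record is metadata that lets a historian find, sort, and cite that image later.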

I personally find it challenging to shift from traditional historical methods to digital approaches. The course feels quite technical and theoretical to me. Reviewing the slides at home helped clarify some concepts, but not all, which left me feeling like I was attending a Computer Science class rather than a history course. Additionally, the pace of the lectures was too fast, and I struggled to fully grasp the professor’s explanations.


abstract: Summary-2
authors: Ricci-0210959012
date: 2024-10-11

Web Archives, 02-10-2024

This course introduced us to web archives. It emphasized interaction and was divided into two parts. In the first part, we formed groups of three or four people, and each group chose one of the seven topics proposed by the professors. In the second part, the groups presented and discussed their topics with the class. The first topic focused on the importance of digital archives and the preservation of websites, highlighting the Internet Archive. The second topic, titled “Luxembourg Web Archive”, which was the subject chosen by Kenan Korac and me, focused on the National Library of Luxembourg (BNL), where we explored the missions, policies, organization, limitations, and collaborations involved in the BNL's preservation of digital heritage. The next topic, titled “Archiving.lu”, involved students exploring the evolution of the “Luxembourg.lu” website using the “Wayback Machine”, the Internet Archive's digital archive of websites. The fourth topic, titled “Collecting Social Media (YouTube) Data”, emphasized how to use YouTube comments as data for research, and students also explored their individual digital footprints. The fifth topic was dedicated to the fluidity of the web and focused on various digital technologies. The next topic revisited individual footprints on the web and how they are archived, such as a link between an individual and a sports club’s website. Finally, the last topic centered on the digital archive of September 11, 2001, where we discussed the historical value of the photos on this specific site and the associated risks. 244 words

I personally found the course very interactive, and much better than the first one, because it was much less theoretical. The fact that we could read and discuss different kinds of web archives in groups and then have a class discussion made it easier to understand the importance of such archives.


abstract: Summary-3
authors: Ricci-0210959012
date: 2024-10-13

Impresso, 09-10-2024

After going through the presentation for this course, I gathered that Impresso enables historians to explore and analyze collections of digitized historical newspapers from Luxembourg and Switzerland (1739-2018) using natural language processing tools and interactive visualizations. It simplifies content searches with semantic filters while ensuring transparency regarding data quality and processing methods. Additionally, Impresso offers various filters and detects OCR errors, which makes searches more accurate and the research process more precise. Since I didn’t receive a response from the Impresso team, I had to conduct my own experiment using Chiara Marcucci’s account. I experimented with the “Ngrams” tool, where I used the names of my city, “Petange” in French and “Petingen” in German. The word, or token in NLP, appears 21,083 times in French and 40,897 times in German, across a total of 39,413 articles. What stood out in my experiment was that both words appeared only 168 times in Swiss articles, meaning they were predominantly found in Luxembourgish newspapers. Furthermore, using the “Inspect & Compare” function, I discovered that both words appeared most frequently in the “Luxemburger Wort”, with the French version peaking in 1936 (9.34 ppm) and the German version in 1941 (21.54 ppm), possibly due to the Nazi occupation of Luxembourg during World War II. Moreover, the common results were found most often in the “Escher Tageblatt”, with 42 occurrences. In this way, Impresso allows for the exploration of a substantial corpus of digitized articles, making historical research more organized and significantly improving data management. 250 words
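The “ppm” figures quoted above are relative frequencies. As a minimal sketch, assuming that ppm here means occurrences of the word per million tokens printed in a given year, the calculation looks like this. The yearly counts are made-up placeholders, not Impresso data, and the function name is my own.

```python
def parts_per_million(occurrences: int, total_tokens: int) -> float:
    """Relative frequency of a word, expressed per one million tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return occurrences * 1_000_000 / total_tokens


# Placeholder values: year -> (occurrences of the word, all tokens that year).
yearly_counts = {
    1935: (120, 40_000_000),
    1936: (300, 50_000_000),
    1937: (150, 60_000_000),
}

ppm_by_year = {
    year: parts_per_million(hits, total)
    for year, (hits, total) in yearly_counts.items()
}

# The year in which the word is relatively most frequent.
peak_year = max(ppm_by_year, key=ppm_by_year.get)

for year, ppm in sorted(ppm_by_year.items()):
    print(year, round(ppm, 2), "ppm")
print("peak year:", peak_year)
```

Normalizing by the size of the corpus matters because newspapers did not print the same volume of text every year: in the placeholder data, 1937 has more raw occurrences than 1935 but a lower relative frequency.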

Personally, I found that despite missing the class, exploring and experimenting with Impresso at home helped me understand the functionality of the interface and its importance for historians working with Swiss and Luxembourgish newspapers. It is indeed a practical tool that helps better organize research and find relevant sources for such studies.


abstract: Summary-4
authors: Ricci-0210959012
date: 2024-10-16

Maps, 16-10-2024

After a brief introduction covering the key elements of maps, this course primarily focused on group work using “StoryMaps”. This interactive tool enables storytelling through maps, and our group used it to create a story map based on John Snow’s famous map of the 1854 cholera outbreak in London. Our experience with the tool was very positive, and we quickly found an option that allowed us to present the unfolding of the story by adding different “layers” to Snow’s original map, which depicts the Soho district. Starting with the base map, we added various colored data layers, such as the addresses where deaths occurred, the locations of water pumps, and finally, the spatial mean and standard distance. Alongside each new layer, we included bullet-point information that addresses the questions of the exercise. The story map proved to be highly effective in illustrating Snow’s method for demonstrating that cholera was spread through contaminated water, and that the disease’s “hotspot” was located near the Broadwick Street water pump. Furthermore, by looking at the story map on the Olympic Marathon from 1896 to 2020, we noticed some similarities with our own project. Here too, the story is told through maps, showing the marathon locations over time, along with the runners’ route and the time taken to complete the race. It’s an interesting way to trace the runners’ journey geographically. However, as the group that worked on it rightly pointed out, this type of story map carries certain risks, particularly the potential for anachronism. Thus, we learned that maps can serve as valuable historical sources and be useful in multidisciplinary fields like epidemiology, but they must always be viewed critically. 274 words
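The last layer mentioned above, the spatial mean and standard distance, can be computed by hand. The sketch below assumes flat (projected) coordinates in metres; the points are invented and are not Snow's data, and the function names are my own.

```python
import math


def spatial_mean(points):
    """Mean centre of a set of (x, y) points: the average x and the average y."""
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return mean_x, mean_y


def standard_distance(points):
    """Root-mean-square distance of the points from their mean centre."""
    centre_x, centre_y = spatial_mean(points)
    squared = sum((x - centre_x) ** 2 + (y - centre_y) ** 2 for x, y in points)
    return math.sqrt(squared / len(points))


# Invented death locations on a local grid, in metres.
deaths = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]

print("mean centre:", spatial_mean(deaths))
print("standard distance:", round(standard_distance(deaths), 2))
```

The mean centre shows where the cases concentrate, and the standard distance shows how tightly they cluster around that point, which is what makes a “hotspot” near a pump visible on the map. For latitude and longitude values, the coordinates would first need to be projected.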

Personally, I found this course and the exercise both engaging and insightful. While I already understood the importance of maps as historical sources, I learned that they can be applied in many other contexts as well. Moreover, working with the “StoryMaps” tool was enjoyable and proved to be a versatile way to present a historical event.


abstract: Summary-5
authors: Ricci-0210959012
date: 2024-10-26

Networks and Palladio, 23-10-2024

This course was dedicated to the importance of networks in historical research. Because networks are a flexible concept, we were introduced to them through the example of a wedding guest list, where we encountered our first technical terms such as “Reciprocity”, “Ego-Network”, and “Network Boundaries”. By visualizing the table arrangements based on “Attributes” like age or relationship status, and through examples of conversations, we realized that there can be numerous, or even very few, connections within a table. We learned the significant role of a “Broker”, who can pass information from one table to another, thereby creating a link between the two. We then saw map examples that illustrate networks, such as Facebook friendships, where we observed that China and Russia have their own networks, and that parts of Africa where there are no people or no internet access appear as blank spaces. Hence, maps can once again be interesting, but they can also be overwhelming at times, which is why data visualization is valuable. We then looked at historical research examples, such as the rise of the Medici family to power, Gestapo interrogations in Cologne, and the Jewish support network during World War II. We saw that data visualization allows historians to uncover connections they may have overlooked, pushing them to explore individuals further and recognize their significance, as Prof. Düring did in his PhD with Walter Heymann. To conclude, we were able to create our own network on Palladio, a digital tool for visualizing and analyzing data using various instruments. Kenan and I focused on the relationships forged in New Caledonia during our mobility semester. 269 words
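The technical terms from the wedding example can be tried out on a tiny network. This is a minimal sketch with invented guests and conversations; the function names are my own, and the definition of a broker is deliberately simplified to “anyone who has a tie to a guest at another table”.

```python
# Attribute: which table each (invented) guest sits at.
tables = {"Ana": 1, "Ben": 1, "Cleo": 1, "Dan": 2, "Eva": 2}

# Directed ties: (speaker, listener). All invented.
talks = {
    ("Ana", "Ben"), ("Ben", "Ana"),
    ("Ben", "Cleo"),
    ("Cleo", "Dan"), ("Dan", "Cleo"),
    ("Dan", "Eva"),
}


def reciprocity(edges):
    """Share of directed ties that are returned by the other person."""
    if not edges:
        return 0.0
    mutual = sum(1 for (a, b) in edges if (b, a) in edges)
    return mutual / len(edges)


def ego_network(edges, ego):
    """Ties among one person (the ego) and their direct contacts."""
    contacts = {b for a, b in edges if a == ego} | {a for a, b in edges if b == ego}
    members = contacts | {ego}
    return {(a, b) for a, b in edges if a in members and b in members}


def brokers(edges, seating):
    """Guests with at least one tie that crosses from one table to another."""
    found = set()
    for a, b in edges:
        if seating[a] != seating[b]:
            found.update((a, b))
    return found


print("reciprocity:", round(reciprocity(talks), 2))
print("Cleo's ego network:", sorted(ego_network(talks, "Cleo")))
print("brokers:", sorted(brokers(talks, tables)))
```

In this toy network, Cleo and Dan are the only link between the two tables: without their conversation, nothing said at table 1 could reach table 2.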

Personally, I found this course very entertaining and well-structured. Moving from basic examples to maps and then to data visualizations in historical research was enjoyable and easy to understand. Even though there is still a vast field to explore, using Palladio was also enjoyable, and the course was highly engaging and interesting.


abstract: Summary-6
authors: Ricci-0210959012
date: 2024-11-01

EU Parliament Archives, 30-10-2024

This course focused on a lecture presented by Ludovic Delepine and Marco Amabilino about the digital archives of the European Parliament and the integration of AI in research. Their primary mission is to democratize access to millions of documents, including legislation, resolutions, positions, and negotiations with other institutions, which previously had to be consulted on-site upon request. These archives cover the period from 1952 to 1994, in accordance with the “30-year rule”, and the developers emphasize the importance of trust and reliability. Crucially, the AI used relies solely on documents within its database to provide responses, significantly reducing the risk of “hallucinations”. The presenters then discussed notable contributions in the field of IT, such as Edgar F. Codd’s 1970 theory of managing data in tables and the relationships between them; the role of Google in advancing OCR technology; Karen Spärck Jones’s essential 1972 work on statistical term weighting in document retrieval; and other advancements, including those by Mikolov and recent developments in generative AI, such as ChatGPT, since 2020. They also introduced the dashboard of the EP Archives, which allows researchers to search for documents by metadata, filtering by language, type, and year. Additionally, they presented tools such as “Ask the EP Archives?”, a question-and-answer tool, and a “similarity search”, which retrieves documents with similar keywords based on “confidence” scores, along with other tools we outlined in our homework on the dashboard. During the Q&A session, certain limitations were highlighted, such as concerns about the reliability of AI in archival practices, the spread of misinformation, and challenges related to the multitude of languages, particularly Romanian, with which the AI struggles, often providing responses in English instead. Moreover, the available documents are limited to those of the European Parliament. 289 words
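To get a feel for how a keyword-based “similarity search” can rank documents, here is a minimal sketch that assumes a classic TF-IDF weighting combined with cosine similarity. How the EP Archives tool actually computes its “confidence” scores is not described in this summary, so this is only an illustration of the general idea; the three sample texts and all function names are invented.

```python
import math
from collections import Counter


def tokenize(text):
    """Very rough tokenizer: lowercase the text and split on whitespace."""
    return text.lower().split()


def tf_idf_vectors(documents):
    """Weight each term by its frequency in a document and its rarity overall."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented sample texts, not real archive documents.
docs = [
    "resolution on fisheries policy",
    "resolution on fisheries quotas",
    "report on budget procedure",
]

vectors = tf_idf_vectors(docs)
scores = [cosine(vectors[0], vector) for vector in vectors]

for text, score in zip(docs, scores):
    print(f"{score:.2f}  {text}")
```

The word “on” appears in every text, so its weight drops to zero and it does not influence the ranking. Rarer shared words such as “fisheries” are what make the second text score higher than the third.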

I personally found this presentation to be very interesting and enlightening, though I also found it somewhat challenging to follow due to the abundance of technical terms primarily related to IT and AI, areas with which I am still becoming familiar. However, the presentation was engaging, and I truly appreciated the talk-and-show method employed by the presenters.