Session Summaries by Altan-Ricci

written by: Ricci-0210959012 — October 11, 2024

Data, Metadata and Tropy, 25-09-2024

This course explored how historical practices have evolved in the digital era. The beginning of the course differentiated between “digital history”, which involves leveraging new technologies, and “history in the digital age”, where historians work with digitized sources. A key focus of the course was the crucial role of data management in the process of historical research. Initially rooted in mathematics through Euclid (c. 300 B.C.), the concept of “data” broadened in the 18th century. Today, “data” refers to interpreted elements actively shaped by researchers. The digital revolution has led to an abundance of different types of data that were once scarce. The course emphasized that digital transformation affects every stage of historical research, from source discovery and analysis to interpretation, dissemination, and preservation. In this context, “data” becomes “research data”. Historians engage with diverse sources, ranging from texts to physical artifacts, which are structured through “metadata”. Central to the course, “metadata”, defined simply as “data about data”, is essential for organizing and interpreting historical materials. We examined various examples of metadata, from digital heritage collections to social media posts and online photo archives, highlighting its significance for historians. We also learned about the role of data repositories in storing and preserving different types of data, ensuring future accessibility for research. In the final section of the course, we were introduced to “Tropy”, a software tool designed for managing and archiving digitized historical sources. It allows historians to explore innovative ways of organizing and interpreting digital materials, offering valuable support in their research endeavors. 255 words
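To make the notion of “data about data” more concrete, here is a minimal Python sketch of what a metadata record for a digitized source might look like. The fields are loosely inspired by Dublin Core and the values are invented for illustration; this is not Tropy’s actual template.

```python
# A purely illustrative metadata record for a digitized photograph,
# loosely modeled on Dublin Core fields (not Tropy's exact template).
record = {
    "title": "Market square, unidentified town",
    "creator": "Unknown photographer",
    "date": "1936-05-14",
    "type": "photograph",
    "format": "image/tiff",
    "source": "Private family collection",
    "rights": "In copyright",
}

# Metadata is what makes a collection searchable, e.g. by item type.
def items_of_type(records, item_type):
    return [r for r in records if r.get("type") == item_type]

print(items_of_type([record], "photograph")[0]["title"])
```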

I personally find it challenging to shift from traditional historical methods to digital approaches. The course feels quite technical and theoretical to me. Reviewing the slides at home helped clarify some concepts, but not all, which left me feeling like I was attending a Computer Science class rather than a history course. Additionally, the pace of the lectures was too fast, and I struggled to fully grasp the professor’s explanations.


---
title: Session Summaries by Altan-Ricci
abstract: Summary-2
authors:
  - Ricci-0210959012
date: 2024-10-11
---

Web Archives, 02-10-2024

This course introduced us to web archives. The course emphasized interaction, as it was divided into two parts: in the first, we formed groups of three or four people, and each group chose one of the seven topics proposed by the professors; in the second, the groups presented and discussed their subjects with the class. The first topic focused on the importance of digital archives and the preservation of websites, highlighting the Internet Archive. The second topic, titled “Luxembourg Web Archive” and chosen by Kenan Korac and me, focused on the National Library of Luxembourg, where we explored the missions, policies, organization, limitations, and collaborations regarding the digital heritage preservation of the BNL. The next topic, titled “Archiving.lu”, involved students exploring the evolution of the “Luxembourg.lu” website using the “Wayback Machine”, a digital archive of websites from the Internet Archive. The fourth topic, titled “Collecting Social Media (YouTube) Data”, emphasized how to use YouTube comments as data for research, and students also explored their individual digital footprints. The fifth topic was dedicated to the fluidity of the web and focused on various digital technologies. The next topic revisited individual footprints on the web and how they are archived, such as a mention of an individual on a sports club’s website. Finally, the last topic centered on the digital archive of September 11, 2001, where we discussed the historical value of photos on this specific site and the associated risks. 244 words
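As a small illustration of how the Wayback Machine can be queried programmatically, here is a minimal Python sketch using the Internet Archive’s public “availability” endpoint. The URL and date are examples; the JSON shape is the one documented by archive.org, but treat this as a sketch rather than a polished client.

```python
import json
import urllib.request

def closest_snapshot(url, timestamp):
    """Ask the Wayback Machine for the snapshot of `url` closest to `timestamp` (YYYYMMDD)."""
    api = f"https://archive.org/wayback/available?url={url}&timestamp={timestamp}"
    with urllib.request.urlopen(api) as response:
        data = json.load(response)
    return data.get("archived_snapshots", {}).get("closest")

snap = closest_snapshot("luxembourg.lu", "20010101")
if snap:
    print(snap["timestamp"], snap["url"])  # when the copy was taken and where it lives
```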

I personally found the course very interactive and much better than the first one, because it was far less theoretical. The fact that we could read and discuss different kinds of web archives in groups and then have a class discussion made it easier to understand the importance of such archives.


---
title: Session Summaries by Altan-Ricci
abstract: Summary-3
authors:
  - Ricci-0210959012
date: 2024-10-13
---

Impresso, 09-10-2024

After exploring the presentation of this course, I gathered that Impresso enables historians to explore and analyze collections of digitized historical newspapers from Luxembourg and Switzerland (1739-2018) using natural language processing tools and interactive visualizations. It simplifies content searches with semantic filters while ensuring transparency regarding data quality and processing methods. Additionally, Impresso offers various filters and detects OCR errors, enhancing the accuracy of searches for historians and making the research process more precise. Since I didn’t receive a response from the Impresso team, I had to conduct my own experiment using Chiara Marcucci’s account. I experimented with the “Ngrams” tool, where I used the names of my city, “Petange” in French and “Petingen” in German. The name, or token in NLP terms, appears 21,083 times in French and 40,897 times in German, across a total of 39,413 articles. What stood out in my experiment was that both words appeared only 168 times in Swiss articles, meaning they were predominantly found in Luxembourgish newspapers. Furthermore, using the “Inspect & Compare” function, I discovered that both words appeared most frequently in the “Luxemburger Wort”, with the French version peaking in 1936 (9.34 ppm) and the German version in 1941 (21.54 ppm), possibly due to the Nazi occupation of Luxembourg during World War II. Moreover, the common results were found most often in the “Escher Tageblatt”, with 42 occurrences. In this way, Impresso allows for the exploration of a substantial corpus of digitized articles, making historical research more organized and significantly improving data management. 250 words
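To clarify the “ppm” values reported above, here is a minimal Python sketch of the underlying arithmetic: occurrences per million tokens. The yearly counts below are invented for illustration and are not Impresso data.

```python
def ppm(token_count, total_tokens):
    """Occurrences of a token per million tokens in a corpus slice."""
    return token_count / total_tokens * 1_000_000

# Hypothetical yearly counts for one token in one newspaper (made up).
yearly = {1936: (150, 16_000_000), 1941: (450, 20_900_000)}
for year, (count, total) in sorted(yearly.items()):
    print(year, f"{ppm(count, total):.2f} ppm")
```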

Personally, I found that despite missing the class, exploring and experimenting with Impresso at home helped me understand the functionality of the interface and its importance for historians working with Swiss and Luxembourgish newspapers. It is indeed a practical tool that helps better organize research and find relevant sources for such studies.


---
title: Session Summaries by Altan-Ricci
abstract: Summary-4
authors:
  - Ricci-0210959012
date: 2024-10-16
---

Maps, 16-10-2024

After a brief introduction covering the key elements of maps, this course primarily focused on group work using “StoryMaps”. This interactive tool enables storytelling through maps, and our group used it to create a story map based on John Snow’s famous map of the 1854 cholera outbreak in London. Our experience with the tool was very positive, and we quickly found an option that allowed us to present the unfolding of the story by adding different “layers” to Snow’s original map, which depicts the Soho district. Starting with the base, we added various colored data layers, such as the addresses where deaths occurred, the locations of water pumps, and finally, the spatial mean and standard distance. Alongside each new layer, we included bullet-point information that addressed the questions of the exercise. The story map proved to be highly effective in illustrating Snow’s method for demonstrating that cholera was spread through contaminated water, and that the disease’s “hotspot” was located near the Broadwick Street water pump. Furthermore, by looking at the story map on the Olympic Marathon from 1896 to 2020, we noticed some similarities with our own project. Here too, the story is told through maps, showing the marathon locations over time, along with the runners’ routes and the times taken to complete the race. It’s an interesting way to trace the runners’ journeys geographically. However, as the group that worked on it rightly pointed out, this type of story map carries certain risks, particularly the potential for anachronism. Thus, we learned that maps can serve as valuable historical sources and be useful in multidisciplinary fields like epidemiology, but they must always be viewed critically. 274 words
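For the two statistics we layered onto the map, here is a minimal Python sketch of how a spatial mean (mean centre) and standard distance can be computed from point coordinates; the points are made up for illustration and are not the real death addresses.

```python
import math

def spatial_mean(points):
    """Mean centre: the average of the x and y coordinates."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def standard_distance(points):
    """Root-mean-square distance of the points from their mean centre."""
    cx, cy = spatial_mean(points)
    squared = [(x - cx) ** 2 + (y - cy) ** 2 for x, y in points]
    return math.sqrt(sum(squared) / len(points))

# Made-up planar coordinates standing in for the mapped death addresses.
deaths = [(1.0, 2.0), (2.0, 2.5), (1.5, 1.0), (2.5, 1.5)]
print("mean centre:", spatial_mean(deaths))
print("standard distance:", round(standard_distance(deaths), 3))
```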

Personally, I found this course and the exercise both engaging and insightful. While I already understood the importance of maps as historical sources, I learned that they can be applied in many other contexts as well. Moreover, working with the “StoryMaps” tool was enjoyable and proved to be a versatile way to present a historical event.


---
title: Session Summaries by Altan-Ricci
abstract: Summary-5
authors:
  - Ricci-0210959012
date: 2024-10-26
---

Networks and Palladio, 23-10-2024

This course was dedicated to the importance of networks in historical research. As an introduction, and to show how flexible networks can be, we used the example of a wedding guest list, where we encountered our first technical terms such as “Reciprocity”, “Ego-Network”, and “Network Boundaries”. By visualizing the table arrangements based on “Attributes” like age or relationship status, and through examples of conversations, we realized that there can be numerous, or very few, connections within a table. We learned the significant role of a “Broker”, who can pass information from one table to another, thereby creating a link between the two. Continuing the course, we saw map examples that illustrate networks, such as maps of Facebook friendships, where we observed that China and Russia have their own networks, and that sparsely populated or poorly connected parts of Africa appear as blank spaces. Hence, maps can once again be interesting but can also be overwhelming at times, which is why data visualization is valuable. We then looked at historical research examples, such as the rise of the Medici family to power, Gestapo interrogations in Cologne, and the Jewish support network during World War II. We saw that data visualization allows historians to uncover connections they may have overlooked or not noticed, pushing them to explore individuals further and recognize their significance, as Prof. Düring did in his PhD with Walter Heymann. To conclude, we created our own network in Palladio, a digital tool for visualizing and analyzing data using various instruments; Kenan and I focused on the relationships forged in New Caledonia during our mobility semester. 269 words
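To make the “Broker” idea concrete, here is a minimal Python sketch using the networkx library: two fully connected “tables” joined by a single conversation, where betweenness centrality singles out the guests who bridge the tables. The names and edges are invented for illustration.

```python
import networkx as nx

G = nx.Graph()
# Guests at table 1 talk among themselves...
G.add_edges_from([("Anna", "Ben"), ("Ben", "Clara"), ("Anna", "Clara")])
# ...guests at table 2 talk among themselves...
G.add_edges_from([("Dina", "Emil"), ("Emil", "Fred"), ("Dina", "Fred")])
# ...and Ben also chats with Dina, brokering between the two tables.
G.add_edge("Ben", "Dina")

# Brokers sit on many shortest paths, so betweenness centrality finds them.
centrality = nx.betweenness_centrality(G)
for guest, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(guest, round(score, 3))  # Ben and Dina score highest here
```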

Personally, I found this course very entertaining and well-structured. Moving from basic examples to maps and then to data visualizations in historical research was enjoyable and easy to understand. Even though there is still a vast field to explore, using Palladio was also rewarding, and the course was highly engaging and interesting.


---
title: Session Summaries by Altan-Ricci
abstract: Summary-6
authors:
  - Ricci-0210959012
date: 2024-11-01
---

EU Parliament Archives, 30-10-2024

This course focused on a lecture presented by Ludovic Delepine and Marco Amabilino about the digital archives of the European Parliament and the integration of AI in research. Their primary mission is to democratize access to millions of documents, including legislation, resolutions, positions, and negotiations with other institutions, which previously had to be consulted on-site upon request. These archives cover the period from 1952 to 1994, in accordance with the “30-year rule”, and the developers emphasize the importance of trust and reliability. Crucially, the AI used relies solely on documents within its database to provide responses, significantly reducing the risk of “hallucinations”. The presenters then discussed notable contributions in the field of IT, such as Edgar F. Codd’s establishment of the theory of managing data in tables and their relationships in 1970; the role of Google in advancing OCR technology; Karen Spärck Jones’s essential 1972 work on term weighting, which measures how informative a term is across documents; and other advancements, including those by Mikolov and recent developments in generative AI since 2020, such as ChatGPT. They also introduced the dashboard of the EP Archives, which allows researchers to search for documents by metadata, filtering by language, type, and year. Additionally, they presented tools like “Ask the EP Archives?”, a question-and-answer tool, and a “similarity search”, which retrieves documents with similar keywords based on “confidence” scores, along with other tools we outlined in our homework on the dashboard. During the Q&A session, certain limitations were highlighted, such as concerns about the reliability of AI in archival practices, the spread of misinformation, and challenges related to the multitude of languages, particularly Romanian, with which the AI struggles, often providing responses in English instead. Moreover, the available documents are confined to those of the European Parliament. 289 words
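As a toy illustration of Spärck Jones’s insight, here is a minimal Python sketch of inverse document frequency: a term appearing in few documents scores higher than one appearing everywhere. The documents are invented, and this is the textbook idf form, not the EP Archives’ actual implementation.

```python
import math

# Three toy "documents" standing in for archive texts.
docs = [
    "parliament resolution on budget",
    "parliament debate on budget",
    "fisheries resolution",
]

def idf(term, documents):
    """Inverse document frequency: rare terms get higher weights."""
    df = sum(term in doc.split() for doc in documents)  # document frequency
    return math.log(len(documents) / df)

for term in ("parliament", "fisheries"):
    print(term, round(idf(term, docs), 3))
# "fisheries" (1 of 3 docs) outweighs "parliament" (2 of 3 docs).
```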

I personally found this presentation to be very interesting and enlightening, though I also found it somewhat challenging to follow due to the abundance of technical terms primarily related to IT and AI, areas with which I am still becoming familiar. However, the presentation was engaging, and I truly appreciated the talk-and-show method employed by the presenters.


---
title: Session Summaries by Altan-Ricci
abstract: Summary-7
authors:
  - Ricci-0210959012
date: 2024-11-09
---

DH Theory, 06-11-2024

To introduce this course, we each shared our thoughts on the U.S. presidential election won by Donald Trump. Given his frequent criticism of the media, we discussed how to discern the truth of information. To transition into the course material, we examined the author David Irving, who falsified information about the destruction of Dresden and, most notably, claimed that Hitler was unaware of the Holocaust, later even stating that it never happened. For this, Irving was widely criticized by numerous historians, including Richard Evans and Deborah Lipstadt, who directly countered his Holocaust denial. After years of rigorous work revisiting archival documents for validity and analyzing his speeches, personal archives, and publications, it was finally possible to prove in court that he was deliberately faking evidence and drawing false conclusions, highlighting that, although it should be easy to prove someone wrong, it often isn’t. This led us to the question: how can one prove that a historian has deliberately falsified history and primary source evidence? Building on this question, we applied these ideas to data-driven research, notably the databases we created two weeks earlier in Excel and Palladio, following five key steps of digital history theory: Selection (choosing or excluding data); Modelling (structuring and conceptually representing data); Normalization (standardizing data for consistency and comparability); Linking (establishing connections between data from different sources); and Classification (grouping data into coherent categories). Finally, we reviewed the F.A.I.R. principles, which all historians should follow to make data “findable”, “accessible”, “interoperable”, and “reusable”. 246 words
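To illustrate the “Normalization” step on something like our Excel database, here is a minimal Python sketch that standardizes name and date formats so records from different sources become comparable; the rows and the person are invented for illustration.

```python
from datetime import datetime

# Two records of the same (fictional) person, entered in different styles.
rows = [
    {"name": "Muller, Anna", "date": "06.11.1944"},
    {"name": "anna muller",  "date": "1944-11-06"},
]

def normalize(row):
    # Canonical name form: "Firstname Lastname", title-cased.
    name = row["name"]
    if "," in name:
        last, first = (part.strip() for part in name.split(",", 1))
        name = f"{first} {last}"
    name = name.title()
    # Canonical date form: ISO 8601 (YYYY-MM-DD).
    date = row["date"]
    for fmt in ("%d.%m.%Y", "%Y-%m-%d"):
        try:
            date = datetime.strptime(row["date"], fmt).date().isoformat()
            break
        except ValueError:
            continue
    return {"name": name, "date": date}

print([normalize(r) for r in rows])  # both rows now match and can be linked
```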

Personally, I found this course engaging, as discussing current events isn’t common in an academic setting, and it was enriching, especially as a transition into the topics of falsified historical narratives and digital history. The course was well-organized, and the group work at the end, which allowed us to review our database, was also valuable.


---
title: Session Summaries by Altan-Ricci
abstract: Summary-VoyantTool
authors:
  - Ricci-0210959012
date: 2024-12-04
---

Scalable Reading and Voyant Tools, 04-12-2024

As part of the course on distant reading, our assignment was exercise number 3. This exercise involved transferring the comments from a YouTube video titled “Luxembourg: Poverty in Europe’s wealthiest country” into the Voyant Tools interface. The work was divided into three main parts, each exploring different features and possibilities offered by this tool.

The first part of the exercise required us to explore three specific features of the interface: the “Cirrus” word cloud, the list of terms, and the links view. First, the Cirrus tool projects the most common words as a word cloud, with size and color varying based on frequency. We could also adjust the number of words displayed to better visualize trends. Second, using the list of terms, it was possible to count how many times a specific word appeared in the comments, with positive terms, like “rich”, highlighted in green and negative terms, like “poor”, in red. Third, the links feature highlighted a network of the most frequent words, distinguishing primary terms in blue and their associated terms in orange, with the same adjustment feature available. For instance, the word “people” (in blue) was linked to terms like “government” or “Luxembourg” (in orange). These features allowed us to visualize the range of opinions expressed in the comments and understand the general direction of the video’s main subject.

In the second part, we explored two other features of Voyant Tools: contexts and collocations. First, the contexts tool displayed the sentences or passages where a specific word was used, enabling us, for example, to distinguish positive comments from negative ones and analyze opinions in more detail. Second, the collocations tool showed how often certain words appeared together in the same context. For example, the words “poverty” and “Europe” appeared together nine times in the comments, highlighting a significant connection between these two words. These analyses helped us understand the frequent and less frequent relationships between terms and identify recurring themes in the comments.

Finally, in the third part, we reflected on potential applications of this interface in other domains. We proposed three ideas. First, for archival research, this interface could be used to analyze historical archives. For instance, one could identify how often a specific subject, such as the Luxembourgish communist diplomat René Blum, is mentioned in a corpus of archives and examine the contexts in which he is referenced, like “communism” or the “Soviet Union”. Second, Voyant Tools could be employed to analyze newspaper articles, particularly the words most frequently used by a specific journalist or author in a publication, in order to conduct critical discourse analysis. This kind of critical discourse analysis could also be applied to topics like World War II, the Cold War, or the Crusades, enabling the user to analyze how opposing parties describe such events. Third, this tool could track the evolution of an author’s work over time. For instance, it would be interesting to analyze how the style or themes of a writer who lived in Germany changed before, during, and after World War II. This exercise thus allowed us to explore the many capabilities of Voyant Tools in text analysis and to consider its practical applications in various contexts.
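As a final illustration, here is a minimal Python sketch of the kind of counting behind the collocations view: how often two words occur within a small window of each other. The comments below are invented stand-ins, not the actual YouTube data from the exercise.

```python
from collections import Counter

comments = [
    "poverty in europe is hidden",
    "luxembourg has poverty despite wealth in europe",
    "people blame the government",
]

def collocations(texts, window=3):
    """Count pairs of words that occur within `window` words of each other."""
    pairs = Counter()
    for text in texts:
        words = text.split()
        for i, word in enumerate(words):
            for other in words[i + 1 : i + 1 + window]:
                pairs[tuple(sorted((word, other)))] += 1
    return pairs

print(collocations(comments).most_common(3))
```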