Session Summaries by Chiara-marcucci

written by: — October 11, 2024

[Data and Metadata], [2024-09-25]

The “Data & Metadata for Historians” course, led by Sofia Papastamkou, taught us, and especially me, how historians can engage with historical sources in the digital age. It emphasized the importance of evolving research methods through digital technologies and data management. We learned to distinguish “digital history,” which incorporates digital sources and tools, from merely “digitized sources,” which are often analyzed with techniques like artificial intelligence. A key focus was on understanding “Data”: it is not simply given, but consists of interpreted elements actively constructed by researchers. We also explored different types of “Metadata,” which, as I understood it, are essential for organizing, interpreting, and sharing historical resources. For instance, we examined a timeline created by users on a website that illustrated how metadata provides temporal context by classifying each item under “created, digitized, found by you”. A Flickr example showcased a tweet explaining the metadata it incorporated, highlighting the role of metadata in enhancing our understanding of digital sources. These examples helped me to better understand the meaning of “Data” and of “Metadata”. Finally, we were introduced to “Tropy,” a software tool for managing and archiving digitized historical sources. I had not encountered this tool before, but it could prove invaluable for future projects. Overall, this course has enhanced our understanding of how data and metadata enrich historical research methods, equipping us with practical tools for navigating digital history. While I found the information presented by Ms. Papastamkou on September 25 challenging, the course clarified the distinctions between data and metadata, which I had previously overlooked. Ms. Papastamkou also showed understanding for our difficulties and provided a supportive environment in which we felt acknowledged and unhurried.


title: Session Summaries by Chiara-marcucci abstract: Summary-2 authors:

  • Marcucci-0211287606 date: 2024-10-11 —

[Web Archives], [2024-10-02]

The course “Web Archives” was a refreshing departure from traditional lecture-based learning. Unlike previous sessions, where the professor spoke and the students passively listened, this session was much more interactive and engaging. From the start, we were actively involved in the learning process, divided into groups, and given the freedom to choose our assignments. This gave the experience a more personal and collaborative touch. My group chose Assignment 3, “Archiving Luxembourg.lu”, where we explored the evolution of Luxembourg’s official web portal. Using the Wayback Machine, we traced the site’s history by entering the URL www.luxembourg.lu. It was fascinating to see how the website had changed over time. We examined the information and resources available in various archived versions, comparing the earliest and most recent pages. By analyzing the shifts in content, design, and accessibility, we gained valuable insights into how Luxembourg’s digital presence has evolved, an intriguing discovery for all of us. After conducting our research, we presented our findings in a five-minute presentation. Although we did not have as much time to prepare as we would have liked, the feedback from our professors was very helpful. Their thoughtful questions, not only for our group but for every single group, encouraged us to think more deeply about the subject. Each group’s presentation was unique, offering a wide range of interesting topics and perspectives. I found the last group’s project particularly engaging, as they added insights that made their presentation more relatable. Overall, this hands-on approach made the course far more enjoyable and memorable. Instead of just learning facts, we applied what we learned, making the experience both meaningful and fun.
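
To make the comparison of an early and a recent archived version more concrete, here is a minimal sketch of how the addresses of two Wayback Machine captures could be built. It relies on the public address pattern of the Wayback Machine (prefix, timestamp, original URL), which redirects to the capture closest to the requested date. The function name `wayback_url` and the two dates are my own assumptions for illustration, not the ones we used in class.

```python
from datetime import date

# Public address pattern of the Wayback Machine: prefix / timestamp / original URL.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, when: date) -> str:
    """Build the address of the archived capture closest to the date `when`."""
    timestamp = when.strftime("%Y%m%d")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


# Hypothetical dates, chosen only to illustrate an "early" and a "recent" capture.
early = wayback_url("www.luxembourg.lu", date(2000, 1, 1))
recent = wayback_url("www.luxembourg.lu", date(2024, 10, 2))

print(early)
print(recent)
```

Opening both addresses side by side in a browser is one simple way to compare content and design across the years.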


title: Session Summaries by Chiara-marcucci abstract: Summary-3 authors:

  • Marcucci-0211287606 date: 2024-10-12 —

[Machine learning - Impresso], [2024-10-09]

In our course held on the 9th of October 2024 on machine learning and historical media, specifically on the Impresso project, we explored together with Mr. During how this program links data, individuals, and various disciplines to analyze historical newspapers. These newspapers offer a glimpse into the traditions, values, and everyday experiences of past societies, touching on areas such as clothing, cuisine, and daily habits. Now that they are digitized, they can be searched efficiently with machine learning techniques. During the demo, we explored the Impresso platform to track the frequency of specific terms in old newspapers, using thematic filters, or “lenses”, to visualize word usage across different sections. We compared terms like “Atomkraft” and “Nucléaire”, analyzing their occurrence in various countries over time. Additionally, we were introduced to the concept of “tokens”, small data units such as words, which are used to quantify term frequency. We also discussed OCR (optical character recognition), a concept I had not been familiar with before, although I had often come across it when searching for particular words. Later on, we were divided into pairs for a hands-on activity. Océane and I used the “Ngrams” tool to compare our two hometowns, “Calmus” and “Bascharage”, and found this interactive method more engaging than passive listening. The Impresso project reminded me of the Luxembourgish site eluxemburgensia, which I find an excellent resource for primary sources and newspapers. In the final session, Ms. Papastamkou introduced us to the platform GitHub, but the rushed presentation left many of us confused. Fortunately, a follow-up email provided the necessary clarification, helping us better understand the tool and its setup. In the end, it turned out to be a very interesting and useful platform.
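
The idea of counting tokens to track a term over time can be sketched in a few lines. This is only an illustration under my own assumptions: the four short “articles” are invented sample texts, the tokenizer is a very simple one, and nothing here uses the real Impresso platform or its data.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Split a text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"\w+", text.lower())


def yearly_counts(articles, term: str) -> dict[int, int]:
    """Count how often `term` occurs as a token in each year."""
    term = term.lower()
    counts = Counter()
    for year, text in articles:
        counts[year] += tokenize(text).count(term)
    return dict(counts)


# Invented sample texts (hypothetical, not real newspaper articles).
articles = [
    (1975, "Die Atomkraft spaltet das Land. Atomkraft bleibt umstritten."),
    (1975, "Le nucléaire divise le pays."),
    (1986, "Atomkraft nach Tschernobyl: Atomkraft, nein danke."),
    (1986, "Le débat sur le nucléaire reprend; le nucléaire inquiète."),
]

print(yearly_counts(articles, "Atomkraft"))
print(yearly_counts(articles, "Nucléaire"))
```

Real platforms usually report such counts relative to the total number of tokens per year, so that years with more surviving newspapers do not dominate the curve.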


title: Session Summaries by Chiara-marcucci abstract: Summary-4 authors:

  • Marcucci-0211287606 date: 2024-10-18 —

[StoryMaps], [2024-10-16]

In class on the 16th of October 2024, together with Mrs. Schmid, we were introduced to StoryMaps and learned how to use it through group exploration. My group worked on the Olympic Games project, specifically the History of the Olympic Marathon from 1896 to 2020, where we each explored different periods. It was very exciting to see athletes’ names, birthplaces, race times and routes visually represented on satellite maps. For example, comparing marathon times like 2h20min26sec in 1968 and 2h08min38sec in 2020 showed how the sport has evolved. While the maps were informative, we noticed some limitations, such as the lack of a scale and the limited zoom, which made distances somewhat hard to grasp. Still, it was fascinating to see how athletes improved over time and to understand their journeys through these visual maps. Another project that caught my attention was the John Snow map of the 1854 cholera outbreak. Apart from building a good StoryMap for their project, the group showed how John Snow challenged the “miasma theory”, which blamed air for spreading disease. From what I saw, they used different map layers: one showing “Soho during the outbreak”, where 10% of the population died in a week, another highlighting death tolls, and a third showing water pump locations. Also, from what I understood, their spatial analysis revealed that 68% of deaths occurred near the Broadwick Street water pump, which supported Snow’s theory that water contamination was the real cause of the deaths. To sum up, this project, along with our own, made me appreciate the power of StoryMaps in visualizing complex data. I look forward to exploring this tool more in the future.
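
The difference between the two marathon times mentioned above can be worked out with a short calculation. This is just a small sketch using the two times from our StoryMap; the helper name `to_seconds` is my own.

```python
def to_seconds(hours: int, minutes: int, seconds: int) -> int:
    """Convert a race time to a total number of seconds."""
    return hours * 3600 + minutes * 60 + seconds


time_1968 = to_seconds(2, 20, 26)  # 2h20min26sec
time_2020 = to_seconds(2, 8, 38)   # 2h08min38sec

# Improvement between the two races, in seconds and as minutes/seconds.
gain = time_1968 - time_2020
gain_minutes, gain_seconds = divmod(gain, 60)

# Improvement relative to the 1968 time, in percent.
gain_percent = round(100 * gain / time_1968, 1)

print(f"{gain_minutes} min {gain_seconds} sec faster ({gain_percent}%)")
```

So the 2020 time is 11 minutes and 48 seconds faster than the 1968 time, an improvement of about 8.4%.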


title: Session Summaries by Chiara-marcucci abstract: Summary-5 authors:

  • Marcucci-0211287606 date: 2024-10-25 —

[What have networks ever done for us?], [2024-10-23]

In this session with Mr. During, we explored social networks using a wedding as an example. At weddings, connections are formed based on the guest list, reflecting existing relationships. Seating arrangements are crucial, grouping people by similarities and separating others to avoid conflicts. For instance, older guests sit together, singles at another table, and travelers in their own group. Each table has its own communication patterns, and we calculated how much the singles talk to each other, a measure called “edge weight.” We also covered how rumors spread. The professor illustrated how a guest from another table leaves to share a rumor at the singles table, becoming a “broker” who spreads information between groups. We learned terms like “nodes,” representing people or things, as well as unipartite connections (like at a wedding) and bipartite connections (between two types of nodes, like years after the event). We also discussed “affiliations,” which can be understood as fixed groups, and “interactions,” which involve passing on information such as rumors. This wedding example helped clarify the next steps we needed to take. Océane and I worked on our own Palladio project called “Barbie marriage”. In this project, we used Barbie characters to demonstrate how networks work. We connected the characters (nodes) and illustrated their relationships, such as “likes” and “does not like”. However, we were unable to integrate the affiliations (interests) into our Palladio map. Overall, I think the goal of Palladio is to show us how people are connected and to help us understand how ideas or information move between them. It organizes complex relationships that our memory alone cannot easily handle.
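
The wedding example can also be written down as a tiny network. The following is a minimal sketch under my own assumptions: the guest names, tables and conversations are invented, and the “broker” rule is a simplification (a guest who tells something to a guest at another table), not the formal measure used in network analysis or in Palladio.

```python
from collections import defaultdict

# Affiliations: each guest (node) belongs to one table. All names are invented.
table_of = {
    "Anna": "singles",
    "Ben": "singles",
    "Carla": "singles",
    "Dora": "family",
    "Emil": "family",
}

# Interactions: (speaker, listener) pairs observed during the evening.
conversations = [
    ("Anna", "Ben"),
    ("Ben", "Anna"),
    ("Anna", "Ben"),
    ("Ben", "Carla"),
    ("Dora", "Emil"),
    ("Dora", "Anna"),  # Dora walks over to the singles table with a rumor
]


def edge_weights(conversations):
    """Edge weight = number of conversations between two guests, in either direction."""
    weights = defaultdict(int)
    for speaker, listener in conversations:
        weights[frozenset((speaker, listener))] += 1
    return dict(weights)


def brokers(conversations, table_of):
    """Simplified rule: a broker is a guest who speaks to someone at another table."""
    return {
        speaker
        for speaker, listener in conversations
        if table_of[speaker] != table_of[listener]
    }


weights = edge_weights(conversations)
print(weights[frozenset(("Anna", "Ben"))])  # how much Anna and Ben talk
print(brokers(conversations, table_of))     # who carries information between tables
```

In this invented data, Anna and Ben share the heaviest edge, and Dora is the only guest who links the family table to the singles table.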


title: Session Summaries by Chiara-marcucci abstract: Summary-6 authors:

  • Marcucci-0211287606 date: 2024-11-01 —

[EU Parliament Archives Presentation], [2024-10-30]

Session Summary

On October 30th, 2024, I attended an engaging presentation entitled “Hands on History: EU Parliament Archives”, given by Ludovic Délépine and Marco Amabilino. Their focus was on how artificial intelligence (AI) can revolutionize access to parliamentary documents. We started with a captivating video that briefly raised the question “who is Louise Weiss?” This significant figure in European integration intrigued me, especially since my master’s thesis will explore similar themes. A standout aspect was the Archives Unit Dashboard of the Historical Archives, a tool that we also got to explore from home before the presentation. It is now publicly available, and users can search documents based on their metadata, with options in the dashboard to filter by type, language, and year. It aims to democratize knowledge of European Parliament history and to serve as a valuable tool for researchers and citizens alike. The presenters also introduced technologies like Anthropic Claude and Constitutional AI, designed for multilingual archive exploration. They stressed the need for deeper insight into document content, since metadata alone often lacks detail, and noted that the results will always require critical evaluation by the user. The presentation emphasized the influential data organization techniques of Edgar Codd, a pioneer in data management. Karen Spärck Jones, who worked on term frequency-inverse document frequency, was also mentioned.
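
To understand term frequency-inverse document frequency (TF-IDF) better, I find it helpful to see it as a small calculation. The sketch below uses a common textbook variant (term frequency multiplied by the logarithm of the number of documents divided by the number of documents containing the term). This variant and the three short “documents” are my own assumptions for illustration; I do not know which formula or data the dashboard itself uses.

```python
import math

# Three invented mini-documents (hypothetical, not taken from the archives).
documents = {
    "doc_a": "parliament debate on agriculture policy",
    "doc_b": "parliament question on fisheries",
    "doc_c": "parliament resolution on women rights",
}

# Very simple tokenization: split each text on spaces.
tokens = {name: text.split() for name, text in documents.items()}


def tf(term: str, name: str) -> float:
    """Term frequency: share of the document's tokens that are `term`."""
    words = tokens[name]
    return words.count(term) / len(words)


def idf(term: str) -> float:
    """Inverse document frequency: rarer terms across the collection score higher."""
    containing = sum(1 for words in tokens.values() if term in words)
    if containing == 0:
        return 0.0
    return math.log(len(tokens) / containing)


def tf_idf(term: str, name: str) -> float:
    """Weight of `term` in one document, relative to the whole collection."""
    return tf(term, name) * idf(term)


print(tf_idf("parliament", "doc_a"))   # appears in every document
print(tf_idf("agriculture", "doc_a"))  # appears in only one document
```

A word such as “parliament”, which occurs in every document, gets a weight of zero, while “agriculture” stands out as characteristic of the one document that contains it.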

As I work on my master’s thesis about the role of women in the European Parliament, especially in Luxembourg, I see the EU archive and this ChatGPT-like tool as invaluable resources that will provide essential support for my research. After I asked in the Q&A session how helpful the EU Archives could be for my thesis, one of the presenters offered guidance on accessing additional information in the EU Archives, which I greatly appreciated.

Session Review

The European Parliament’s Archives Unit Dashboard is a useful digital platform that provides access to over 2,000,000 digitized documents from 1952 to 1994, including important records from the European Parliament and other key assemblies. The dashboard aims to democratize knowledge about the Parliament’s history, offering valuable resources for researchers and the public. As users explore the site and its tools, they will find four specialized sections tailored to different research needs, which make their searches more efficient and straightforward. Firstly, the EP Archives Overview Dashboard offers a detailed entry point to the European Parliament’s vast document collection. Using charts, it presents data that allows users to easily explore and understand the collection’s scope, categorized by language, document type, year, and start date. This visual organization helps users grasp the diversity and historical timeline of the records, making it simpler to follow the development of parliamentary documents over time and to access a wide selection of materials from multiple perspectives. Secondly, the EP Archives Content Analysis Dashboard offers researchers access to a significant collection of around 30,000 original European Parliament documents, covering motions as well as written and oral questions from 1958 to 1984. It features advanced tools designed to enhance search efficiency and effectiveness. For example, users can utilize the “Top Words” feature, which presents frequently used terms in a word cloud, or the “Select Top Words” option for direct searches of specific terms. Additionally, the “Dominant Topic” tool organizes documents into eleven main themes, making it easier to find specific content, while the “Related Documents” feature connects thematically or linguistically similar records.
Also, to help visualize topic connections, the “Intertopic Distance Visualization” displays each topic as a circle, with the distances between them indicating thematic similarities. Then, there is the “Alternative Topic Visualization”, which emphasizes word frequency within a specific subject as well as across the entire collection of documents. This tool helps users easily identify terms unique to a particular topic and those that are commonly found throughout the dataset. The third section is the EP Archives Received Request Dashboard, which tracks all requests to the Archives Unit since 2020, revealing global interest in parliamentary records. An interactive map displays each country’s request volume, highlighting which nations request documents most often. Annual statistics further categorize requests by organization type, such as European institutions, civil society groups, research labs, and national chambers. This breakdown shows strong collaboration and international involvement with the European Parliament archives. The final section is the Tool Dashboard, which offers two essential resources for researchers. One of these is the “Document Summarizer”, which enables users to import documents and generate brief summaries in Word or PDF format, with lengths adjustable by word count. The other tool, the “Eurovoc Document Tagger”, classifies documents by assigning Eurovoc tags.

These features really enhance how I can access and use the European Parliament’s archives, making it easier to find, analyze, and request historical documents. By simplifying searches, these tools will help me dive deeper into Europe’s legislative history, making knowledge more accessible and enriching my understanding of the stories that shape our political landscape. This resource could be invaluable for my future master’s thesis, helping me gather the insights I need for my research.

Questions:

• How does using AI in archives raise ethical concerns about bias, transparency, and accountability, and what can archivists do to address these issues?

• What future opportunities do archivists have with AI, especially for managing digital records and helping underrepresented communities access them?

• What factors should be considered when choosing to use secondary sources over primary sources in the European Parliament Archives, and what potential pitfalls should archivists be cautious about in this decision-making process?

• How does AI affect how we decide which records to keep for the long term, and can it really take the place of human judgment in this process?

• How do mistakes made by those transcribing or managing documents in the EU Parliament Archives affect the accuracy and reliability of the historical record, and what impact do these errors have on the overall archival system?