Session Summaries by Stefan Ignjatovic
Data, Metadata and Tropy, 25.09.2024
In the ‘Introduction to Digital History’ course on 25 September, we dealt with the topic of ‘data’: what data is and what different types of data exist. We were told that data are (digital) things that we can, for example, create or collect, and we were introduced to further categories such as ‘research data’ and ‘metadata’. Metadata, explained very simply, is data that describes other data. A video deepened this point: metadata is information that describes sources such as documents or images, and it is important because it helps to organise information and knowledge. After this explanation we looked at examples of metadata to understand it better. As the last point of the course, we saw data repositories, which are used to store, preserve, publish and share data, such as Zenodo, figshare or Harvard Dataverse; repositories make it easier to access and retrieve data. Tropy was also explained to us: it is software used to organise and describe photos of research material. One criticism I have is that I personally missed a short introduction to or demonstration of the Tropy software during the session. The rest of the lesson was well structured, but there was a lot of information at once and I found it difficult to keep up with everything, although I understand that the topic of the course is a complex one.
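To make the idea of ‘data that describes other data’ more concrete for myself, here is a small, invented metadata record of the kind one might attach to a scanned photograph in Tropy or deposit alongside a dataset in a repository. The field names loosely follow Dublin Core conventions and the values are my own illustration, not taken from the course materials.

```python
# A minimal, hypothetical metadata record for a scanned archival photograph.
# The field names loosely follow Dublin Core conventions; the values are
# invented purely for illustration and are not taken from the course.
photo_metadata = {
    "title": "Market square, Luxembourg City",
    "creator": "Unknown photographer",
    "date": "1952-06-14",
    "type": "Photograph (digitised scan)",
    "format": "image/tiff",
    "rights": "Public domain",
    "description": "Black-and-white photograph of the market square, "
                   "taken from the south-west corner.",
}

# The metadata describes the image file; it is not the image itself.
for field, value in photo_metadata.items():
    print(f"{field}: {value}")
```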
Web archives, 02.10.2024
The class of 2 October in the Introduction to Digital History course began with us being divided into groups of three, and each group chose one of seven assignments: Assignment 1 - Stakes of archiving the web, Assignment 2 - Luxembourg Web Archive, an overview, Assignment 3 - Archiving luxembourg.lu, Assignment 4 - Collecting social media (YouTube) data, Assignment 5 - Fluidity of the web, Assignment 6 - Publishing family and personal archives on the web, and Assignment 7 - Crowdsourced born-digital archives. Assignment 5 was not chosen by any group because there were only six groups in the class. I was in a group with Charel and Andre, and we chose Assignment 1, which consisted of reading the article ‘We’re losing our digital history. Can the Internet Archive save it?’ and answering the following questions: ‘What is the main problem the article identifies?’, ‘What is the Internet Archive (IA)?’, ‘What are the solutions IA provides?’ and ‘What are the main impediments IA faces to accomplish the mission it undertook?’. Each group had 30 minutes to complete its assignment and later presented its results to the other groups. Our findings were that the Internet Archive preserves websites, books and other material to prevent them from disappearing and being forgotten: according to the article, 25% of websites created between 2013 and 2023 have already disappeared. We also found that the IA faces several difficulties in realising its project, because cyberattacks and litigation cause it to lose stored websites, music and other material, which must then be restored if it is not completely lost. Since the IA is not financially supported, these problems are exacerbated.
Impresso, 09.10.2024
The Introduction to Digital History course on 9 October 2024 dealt with the digital platform Impresso. In the first part of the course we were given explanations of what Impresso is: a digital platform that gives access to old newspapers and allows them to be examined and analysed. Impresso has digitised various newspapers, such as Swiss titles like the Gazette de Lausanne or Luxembourgish titles like the Luxemburger Wort. We were also given an insight into the various filter functions offered on Impresso and how they work. In the second part of the course we did group work in which each group dealt with one filter function. I was a member of group 8, and we had the filter function ‘Text reuse’. This function searches for a specific term or passage in the newspaper articles and analyses how often it reappears in other newspapers; it can therefore be used to see how present a term was in certain periods and how important it was, but also how a text developed over time. The result of our group work was that we could not find much useful information, because the ‘Text reuse’ filter did not work properly: instead of recognising useful passages on our topic, the ‘League of Nations’, it picked up numbers, special characters or only half of a word. For this reason we did not get very far, but we were able to try out and test the filter function, which was also the aim of the group work.
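Impresso’s actual text reuse detection is of course far more sophisticated, but to get a feel for the underlying idea I sketched how shared passages between two articles could be found by comparing overlapping word sequences. The two article snippets below are invented, and this is only my own illustration of the principle, not Impresso’s method.

```python
# A very rough sketch of the idea behind text reuse detection: two articles
# are compared by the overlapping word n-grams ("shingles") they share.
# Impresso's actual method is far more sophisticated; the article snippets
# here are invented purely for illustration.

def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

article_a = ("The League of Nations met yesterday in Geneva to discuss "
             "the question of disarmament among its member states.")
article_b = ("According to reports, the League of Nations met yesterday in "
             "Geneva to discuss border disputes in Eastern Europe.")

shared = shingles(article_a) & shingles(article_b)
print(f"Shared 5-word passages: {len(shared)}")
for passage in shared:
    print(" ".join(passage))
```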
Storymaps, 16.10.2024
In the course on 16 October 2024, we saw the ‘Storymaps’ programme. After explanations of what maps are, we formed groups and each group dealt with its own topic. My group’s topic was Preserving Society Hill, a map of a neighbourhood in Philadelphia. The task was to prepare a presentation in Storymaps based on the given questions. My group tried to use the programme to see what was possible, but there was not enough time to complete the task, so we were not able to finish our presentation to our liking. Another group had the John Snow map as their topic. At the beginning of their Storymaps presentation they explained who John Snow was. They then placed the first layer on the map, the map of Soho from 1854; the map dealt with the cholera outbreak in London in the same year, and the group explained what the consequences of the outbreak were. In layer 2 the group added the death toll, with larger circles indicating higher numbers of deaths; this map was meant to be the first step in confirming Snow’s theory. Layer 3 contains the water pumps of Soho, which verified the second step of Snow’s theory, namely that the disease spread via water and not via air. In layers 4 and 5 we can see that the water pump in Broadwick Street was the ‘hotspot’ of the outbreak, as the map shows a high death toll within its radius.
Palladio, 23.10.2024
I was not present for the course on 23.10.2024, so I did the task we were given in this course at home and familiarised myself with the Palladio programme on my own. Palladio allows us to visually represent the connections between different people or other things. The first step in the task was to create an Excel table in which we entered all the information and indicated the connections between these people. This spreadsheet is then loaded into Palladio, which processes the information so that we can choose how to view the connections, for example between people, in a diagram. Palladio seems practical, and I could imagine it being very helpful for my master’s thesis, in which I also have to collect several pieces of information and find the connections between them. For this reason I believe that Palladio can help me a lot to find and understand connections in confusing information more easily. I also got on very well with the programme during the task, and in general I find it very practical for organising and understanding large amounts of information. What I sometimes found problematic was that the diagrams visualising the connections were themselves somewhat confusing. However, I think this was because I had too much information in the Excel spreadsheet; I did not try the task with a smaller spreadsheet.
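Palladio itself requires no coding, but to understand what it does with the uploaded spreadsheet I sketched the underlying idea, an edge table turned into a network diagram, in a few lines of Python using the pandas, networkx and matplotlib libraries. The names and relations below are invented, and the sketch is only my own illustration of the principle, not how Palladio actually works internally.

```python
# A sketch of what Palladio does with an uploaded spreadsheet: an edge table
# (who is connected to whom) is turned into a network graph and visualised.
# Requires: pip install pandas networkx matplotlib
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

# Invented example rows, standing in for the Excel table described above.
edges = pd.DataFrame({
    "person":       ["Alice", "Alice", "Bob",   "Clara"],
    "connected_to": ["Bob",   "Clara", "Clara", "David"],
    "relation":     ["correspondence", "colleague", "family", "correspondence"],
})

# Build the network from the table and draw it as a simple diagram.
graph = nx.from_pandas_edgelist(edges, source="person",
                                target="connected_to", edge_attr="relation")
nx.draw_networkx(graph, with_labels=True, node_color="lightsteelblue")
plt.axis("off")
plt.show()
```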
EU Parliament Archives, 30.10.2024
On 30 October 2024, we attended a presentation given by the EU Parliament Archives. Ludovic Delepine and Marco Amabilino showed us what is possible with the digital archives of the European Parliament. The archives contain documents dating back to 1952 and can be consulted by the general public; however, only documents published between 1952 and 1994 can currently be viewed, due to a data protection rule that only allows documents to be made public after 30 years. The EU Parliament Archives also use AI to analyse the documents and to help find the documents one is looking for. At the presentation, however, I had difficulty following the explanations because of the many technical terms, which meant that I did not understand everything that was explained. The archives are nevertheless helpful, because they offer several filter functions that allow you to search for documents by year, by specific terms, in specific languages or from specific countries, which can contribute to your own research. There is also the ‘Ask the EP Archives’ function, which allows us to ask an AI about a specific topic; the AI then searches for documents that match this topic. This function is also helpful for me, as I am taking a course on the history of European integration in which I have to give a presentation, and I think I can find documents in the EU Parliament Archives that can serve as sources for it.
DH Theory: Criticisms; Transparency; Reproducibility/Documentation, 06.11.2024
Unfortunately, I was unable to attend this course and tried to understand its content using the PowerPoint. The topic was DH theory. The first slide asked the question ‘How do we know what is true?’ and the second slide asked ‘Interpretation or Forgery?’. I assume these questions were approached through the example of David Irving: it turned out that Irving spread false information, such as Holocaust denial, and evidence was found to prove that he used fake sources. The slides then stated that we can expose bad historical scholarship on the basis of written records; the question, however, is how we can do this for data-driven research. A method consisting of five steps was then presented. The first step is selection, in which we decide which data we want to analyse. The second step is modelling, in which we decide how to structure and represent the data. The third step is normalisation, in which the data is standardised. The fourth step is linking, in which connections are established between different data and sources. The fifth and last step is classification, in which the data is sorted into certain categories. Using this method, it should be possible to show that others obtain the same data or the same results as we do, if I have understood the PowerPoint on Moodle correctly.
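To make the five steps more concrete for myself, I sketched how they could be applied to a few invented records. This is only my own reading of the method described on the slides, not code from the course, and the records and field names are made up for illustration.

```python
# A toy sketch of how the five steps (selection, modelling, normalisation,
# linking, classification) could be applied to invented records. This is my
# own illustration of the method on the slides, not code from the course.

raw_records = [
    {"name": "Schmit, Anna", "born": "24.3.1938",  "occupation": "writer"},
    {"name": "Weber, Paul",  "born": "1950-01-02", "occupation": "Historian"},
    {"name": "weber, paul",  "born": "02/01/1950", "occupation": "historian"},
]

# 1. Selection: decide which records and fields to analyse.
selected = [r for r in raw_records if r["occupation"].lower() == "historian"]

# 2. Modelling: give the data an explicit structure (here: fixed field names).
modelled = [{"surname": r["name"].split(",")[0].strip().title(),
             "born": r["born"],
             "occupation": r["occupation"]} for r in selected]

# 3. Normalisation: standardise values so they can be compared.
def normalise_date(value: str) -> str:
    """Very naive normalisation of a few date spellings to YYYY-MM-DD."""
    if "-" in value:
        return value
    day, month, year = value.replace(".", "/").split("/")
    return f"{year}-{int(month):02d}-{int(day):02d}"

for record in modelled:
    record["born"] = normalise_date(record["born"])
    record["occupation"] = record["occupation"].lower()

# 4. Linking: records that now share the same values can be connected.
linked = {(r["surname"], r["born"]) for r in modelled}

# 5. Classification: sort the linked entities into categories.
classified = {key: "person" for key in linked}
print(classified)
```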
Scalable Reading and Voyant Tools, 04.12.2024
As part of the course on distant reading, our assignment was exercise number 3. This exercise involved transferring the comments from a YouTube video titled “Luxembourg: Poverty in Europe’s wealthiest country” into the Voyant Tools interface. The work was divided into three main parts, each exploring different features and possibilities offered by this tool. The first part of the exercise required us to explore three specific features of the interface: the Cirrus word cloud, the list of terms, and links. First, the Cirrus tool projects the most common words as a word cloud, with size and colour varying based on frequency; we could also adjust the number of words displayed to better visualise trends. Second, using the list of terms, it was possible to count how many times a specific word appeared in the comments, with positive terms, like “rich”, highlighted in green and negative terms, like “poor”, in red. Third, the links feature highlighted a network of the most frequent words, distinguishing primary terms in blue and their associated terms in orange, with the same adjustment option available. For instance, the word “people” (in blue) was linked to terms like “government” or “Luxembourg” (in orange). These features allowed us to visualise the broad opinions expressed in the comments and understand the general direction of the video’s main subject. In the second part, we explored two other features of Voyant Tools: contexts and collocations. First, the contexts tool displayed the sentences or passages in which a specific word was used, enabling us, for example, to distinguish positive comments from negative ones and analyse opinions in more detail. Second, the collocation tool showed how often certain words appeared together in the same context. For example, the words “poverty” and “Europe” appeared together nine times in the comments, highlighting a significant connection between these two words. These analyses helped us understand the frequent and less frequent relationships between terms and identify recurring themes in the comments. Finally, in the third part, we reflected on potential applications of this interface in other domains and proposed three ideas. First, for archival research, the interface could be used to analyse historical archives; for instance, one could identify how often a specific subject, such as the Luxembourgish communist diplomat René Blum, is mentioned in a corpus of archives and examine the contexts in which it is referenced, like “communism” or the “Soviet Union”. Second, Voyant Tools could be employed to analyse newspaper articles, in particular the words most used by a specific journalist or author in a publication, in order to conduct critical discourse analysis; this kind of analysis could also be applied to topics like World War II, the Cold War or the Crusades, enabling the user to examine how opposed parties describe such events. Third, the tool could track the evolution of an author’s work over time; for instance, it would be interesting to analyse how the style or themes of a writer who lived in Germany changed before, during and after World War II. This exercise thus allowed us to explore the many capabilities of Voyant Tools in text analysis and to consider its practical applications in various contexts.
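Voyant Tools does all of this in the browser, but to understand the two core operations we used, counting word frequencies (Cirrus and the list of terms) and counting collocations, I sketched them on a few invented comments. The example comments, the stopword list and the way collocations are counted are my own assumptions for illustration, not Voyant’s actual implementation.

```python
# A sketch of the two core operations behind the Voyant features we used:
# counting word frequencies (Cirrus / Terms) and counting collocations
# (Collocates). The comments are invented stand-ins for the real YouTube
# comments, and Voyant's own implementation is more elaborate than this.
from collections import Counter
from itertools import combinations

comments = [
    "Luxembourg is rich but poverty still exists",
    "poverty in Europe is a real problem",
    "the government of Luxembourg should help poor people",
    "people in Luxembourg pay high rents",
]

stopwords = {"is", "but", "in", "a", "the", "of", "should", "still"}
tokens_per_comment = [
    [w for w in comment.lower().split() if w not in stopwords]
    for comment in comments
]

# Term frequencies across all comments (what Cirrus shows as a word cloud).
frequencies = Counter(w for tokens in tokens_per_comment for w in tokens)
print(frequencies.most_common(5))

# Collocations: how often two words appear together in the same comment.
collocations = Counter()
for tokens in tokens_per_comment:
    for pair in combinations(sorted(set(tokens)), 2):
        collocations[pair] += 1
print(collocations.most_common(5))
```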
Dissemination (Part I), Case Studies, 11.12.2024
In the course on 11 December 2024, Joëlla van Donkersgoed gave a lecture in which she introduced us to the concept of public history and how history can be made accessible to everyone. She presented a problem of public history, namely that it is often influenced by external factors, such as political ones; this influence makes it clear that history is always written by the victors and that minorities are marginalised and ignored as a result. Joëlla van Donkersgoed also introduced us to the aims of public historiography, one of which is to include perspectives and sources that were previously ignored; another is to involve the public in collective memory. We were then shown some examples, such as HistoreschGesinn, a website that encourages people to participate in historical projects: users can submit their own contributions, the site provides information about events on historical topics, and users can exchange ideas. Another example is HistoreschEsch, a project in which the residents of Esch can, for example, vote on a mural representing their town. I found the course interesting because I am very interested in Luxembourg’s history, and there were things I only learnt once I had familiarised myself with the projects presented in the course. This made me realise the value of such projects in bringing history to the general public.
Dissemination (Part II) & Case Studies ‘Journal of Digital History’; Data papers, 18.12.2024
In the last class of the semester, the topic was ‘Dissemination of Scientific Results II’ and we focussed on scientific publishing. Scientific publishing is the subfield of publishing that distributes academic research and scholarship; the goals pursued are making research visible and distinguishing researchers, which is to be achieved by using books or academic journals as channels for dissemination. We also learnt that scientific publications are expected to be reliable and reusable. In class we learnt about the concept of ‘peer review’, a process in which a paper is evaluated by another expert in the same field; this process is important for scientific publication, as it provides evidence of the value of the work and of its acceptance in academic circles. Academic publishing as a subdomain of the publishing industry has several obstacles to overcome. For individuals, these include the pressure to publish research consistently, the accessibility of papers and the monopolisation of the market: the large publishers can dictate subscription formats, which limits the accessibility of scientific publications and leads to great inequality. This inequality gave rise to the Open Access movement, which aims to provide free access to scientific publications. We also learnt about a new form of scientific publishing, data publishing, which focusses on the description of data sets and their accessibility. In the second part of the lesson, we formed groups and each group worked on a task. My group read the article ‘Dialects of Discord. Changing vocabularies in the Dutch Cruise Missile discussion’ and later presented our findings to the class.