IX Héloïse Workshop – European Network on Digital Academic History

11th to 12th November 2019, Leipzig, Germany



Workshop Chair


The Héloïse Common Research Model – Interlinked Repositories on Digital Academic History

Nov, 11th: Bibliotheca Albertina, Beethovenstraße 6, 04107 Leipzig

10:00 - 11:30 Session 1

Stefan Kühne, Ulrich Johannes Schneider and Thomas Riechert Opening

Jennifer Blanke, Edgard Marx and Thomas Riechert Application of the Heloise Common Research Model (HCRM) within the project "Early Modern Professorial Career Patterns - Methodological research on online databases of academic history" (Leipzig University of Applied Science, University of Leipzig, Herzog August Library Wolfenbüttel, Germany)

In this presentation the authors show how the HCRM is used within the research methodology of the project Early Modern Professorial Career Patterns - Methodological research on online databases of academic history (2017-2021), a joint project between historians and computer scientists. First, the relevant challenges are pointed out from both perspectives: the potential of the available data sets, their variety, and the differing perspectives of the two research fields. Consequently the HCRM, a service-based layered model, has been applied. For each of its three layers (repository layer, application layer and research interface layer) the authors report on their research and show results, such as the Quit Methodology and KBOX.

Stéphane Lamassé, Jean-Philippe Genet and Cédric Dumouza Uncertainty and prosopography: the case of the Studium Parisiense database (Université Paris 1 Panthéon-Sorbonne, France)

12:00 - 13:30 Session 2

Giulia Zornetta and Dennj Solera Another Nodegoat database on academic history? The BO2022 project on the history of the university of Padova. (Università di Padova, Italy)

The university of Padova was founded around 1222 by a group of students from Bologna and is considered the second oldest studium in Italy. Since 2022 will mark the 800th anniversary of its foundation, the university of Padova is promoting many scientific activities to celebrate its history and shape its memory both as an institution and as a scientific and cultural centre. The Department of History, Geography and Classics – DiSSGeA carries out a research project named BO2022. It takes its name from the main historical building of the university, the Palazzo Bo, which still contains both the rectorate and the 16th-century anatomical theatre.

The BO2022 project focused on academic history by following three main lines of research:

Medievalists, early modern and modern historians started to develop the research in January 2019 with the help of experts from other disciplines, such as specialists in data modelling and data analysis. During the first months of 2019, a Nodegoat database was designed to map the academic population of the University of Padova from its foundation in 1222 to the second half of the 20th century.

This paper aims to present the BO2022 project as a case study in academic databases of universities by discussing both the methodology used to design the database and some of its limits. Designing a database always presupposes different kinds of choices, but such a wide chronology poses some significant challenges. Clearly, various kinds of historical sources were produced from the 13th to the 20th century. Due to archival limits, some of them are not easily available, while others are difficult to approach as comparable data sets (e.g. private charters). Moreover, each line of research has different questions to ask, which means that different information may be needed. For example, the database must include an object about religious confession in order to research religious freedom in the 16th and 17th centuries, but this is not relevant for the Middle Ages nor for the period from the 18th century onwards. In addition, such an ambitious project must also match the institutional agenda of the 800th anniversary: the database will be launched on the website of the university of Padova as an open-source tool by 2022.

Consequently, the BO2022 research team decided to focus, at least at the beginning, only on the students who graduated from the university of Padova, that is to say on the historical sources which registered the students’ degrees: the Acta graduum academicorum (15th-17th centuries) and the modern academic registries (19th-20th centuries). Since neither Acta graduum nor matriculation registries have been preserved for the period between 1222 and 1406, a different approach is used for the earlier period, which involved a survey of medieval private charters. This paper examines some of the limits of this choice, especially when trying to detect and analyse the whole mobility of students in the framework of the peregrinatio academica. It also aims to discuss further possibilities of implementation, such as network relations, and to show some of the preliminary results of the research project.

Manuel Llano Regents of knowledge: the social structure of academia in the 17th century Dutch Republic (Utrecht University, Netherlands)

In my contribution I will outline my prosopographical research within the European Research Council Skillnet project, based at Utrecht University (2018-2022). As part of this research group, I am compiling a comprehensive database with basic biographical and professional data of the teaching staff of all institutions of Latinate higher education in the United Provinces during the period 1575-1715. This encompasses three different kinds of institutions: 5 universities, 11 illustrious schools (without ius promovendi), and 115 Latin schools, of which only half were managed by more than one individual simultaneously. This amounts to an estimated total of 1,800 individual appointments: some 115 positions in the universities, 90 in the illustrious schools and 245 in the Latin schools per generational cohort. After introducing the data sources, ontology, and collection agenda and strategy, I will sketch the overarching research questions guiding this effort and show how social sequence analysis and graph theory will help me address them. To illustrate this, I will make use of a case study based on the Latin school staff in the provinces of Gelderland, Zeeland and Utrecht, providing a short analysis of their career patterns, mobility and social standing.

14:30 - 16:00 Session 3

Hermenegildo Fernandes and Armando Norte Lost in translation? The conversion of historical language in economical and computational semantics. Problems, challenges and solutions. (Center for History of the University of Lisbon, Portugal)

The use of Digital Humanities as a working tool for the Social Sciences and Humanities, such as History, namely the History of Universities, has numerous virtues (the large and ever-increasing information storage capacity provided by IT systems; the ease of making information available on a scale never reached before; the interoperability and comparability of data between different analogous or complementary systems; the vocation for the statistical and quantitative treatment of the collected information; etc.). But these advantages do not entirely solve crucial problems related to data collection, analysis, and interpretation. These problems are difficult to solve; they result from the attempt to pour historical data, often fragmentary, casuistic and dealing with very specific terminologies and languages, into a computer language, and they tend to be accentuated by the chronological and spatial distance to the objects of study.

In the case of databases set up to collect and process economic information from medieval universities - such as the database built for the project ŒCONOMIA STUDII. Funding, management, and resources of the Portuguese university - serious difficulties have arisen in terms of effectiveness and operability in pouring historical information into a digital infrastructure. Indeed, the technical resolution of many of these problems was time-consuming, and in some cases, the resolution was not entirely satisfactory, but only acceptable.

Incidentally, the difficulty in the case of the ŒCONOMIA STUDII project is twofold. In addition to the difficulty of translating a vocabulary and an ontology that is eminently historical into a computational language, without significant data loss or bias, there is the additional difficulty of dealing with current economic variables and accounting systems and applying them to earlier historical processes involving categories not always equivalent or comparable.

In this regard, it should be remembered that economics is a relatively new science within the social sciences, which only began to develop from the theoretical point of view from the 18th century onwards, thanks to the contributions of scholars such as Adam Smith, David Ricardo, François Quesnay or David Hume. In fact, there was no coherent theoretical approach to economics during the Middle Ages - except for the Church's criticism of usury and profit - everything else being the result of often diffuse empirical practices, mainly marked by chronological, geographical and spatial differences.

Indeed, the application of current economic concepts to the medieval historical reality has not always shown the necessary and desirable acuity. The nomenclature of certain economic vocabularies and languages, such as “assets” and “liabilities”, “ordinary expenses” and “extraordinary expenses”, “current revenues” and “operating revenues”, is often inadequate in view of the socio-economic framework of the object of study, posing heuristic and hermeneutical problems that are difficult for historians to resolve or understand. Take, for example, teachers' salaries, which do not correspond to their total income: that income also included ecclesiastical benefits, tenure of properties or extra-university activities. Another very different but widely problematic example is the phenomenon of currency conversions. The absence of a fixed monetary standard during the Middle Ages, even for contiguous spaces and times, the frequent currency devaluations, the typical seasonality of the economy, the differing availability of labour, the asymmetric regional development of techniques and the differentiated ownership of the means of production tend to make the historical comparability of prices, wages or measures difficult or impossible.

In sum, the building of the ŒCONOMIA STUDII database from the point of view of categorization, collection, and processing of data faced several difficulties resulting from the articulation of three different languages – the historical language, the economic language, and the computer language. It was necessary to develop an interpretative system that would accommodate these translation issues with the lowest risk. The overall objective can be broken down into four partial objectives: 1) to identify and expose the nomenclature difficulties encountered; 2) to explain, from a technical point of view, the solutions developed and implemented to counteract the identified problems; 3) to establish the analysis problems; 4) to present the strategies outlined to ensure the reliability of the interpretation.

On the other hand, in terms of the interoperability between the systems developed by the Héloïse network partners, it is intended to discuss the possible interest of incorporating economic information into the databases developed so far - which are essentially prosopographic databases – making use of the experience gathered in the ŒCONOMIA STUDII project, which has already developed its own ontology and language that can be shared within the network.

Kaspar Gubler Dynamic Data Ingestion (DDI): Server-side data harmonization in historical research. A centralized approach to networking and providing interoperable research data to answer specific scientific questions (University of Bern, Switzerland)

In Digital Prosopography there are countless databases, such as biographical portals or the classic personal databases. These databases diverge considerably in alignment, structure and query options. The standardisation of data networking is correspondingly low; it concerns almost only the core of personal data (birth, origin, death) and links via identifiers such as VIAF (Virtual International Authority File) or GND (Gemeinsame Normdatei). A comprehensive or combined search, which would be a prerequisite for research on Contextualized Prosopography, is not yet possible; this affects, for example, research on social and geographical origin, education and professional positions, as well as the differentiated visualization of search results. The basic problem lies, on the one hand, in the hardly standardized ontologies of the databases and, on the other hand, in the technically limited query possibilities: often there is no API (Application Programming Interface). This is increasingly recognized as a disadvantage in historical research and hampers international research. The shortcomings identified apply in particular to Contextualized Academic Prosopography. In this field of research, it is well known that people have been impressively mobile in Europe since ancient times and have visited the universities founded in Europe in large numbers since the Middle Ages, exchanging specialist knowledge intensively in their international networks. However, this historical situation is only described in the databases of the individual projects; it could be analysed only through structured and systematic data networking. This would make the scholars as carriers of knowledge visible in their entirety for the first time, for example in their often groundbreaking function as impulse actors of pre-modern society.
It would also become possible to observe the circulation of knowledge and to map knowledge exchange for the European area, above all by incorporating further databases, whereby previously unknown or little-known connections (networks) between scholars and knowledge spaces could become clearer - an archaeology of European knowledge. Such an explorative data analysis could lead to new or overarching questions, or perhaps already answer them. The networking needs of existing databases of historical research are very high, especially in the field of Contextualized Digital Prosopography, which today no longer relies solely on personal data, but digitally reconstructs knowledge biographies and spaces. In one area of this research, Contextualized Academic Prosopography, however, despite international initiatives, it has not yet been possible to harmonise the numerous databases available in Europe and to make them available to research and the public in an overarching search with exploratory objectives. The approach pursued here for networking and harmonizing research data is to make these processes more effective and centralized: a manageable focus group of a few similarly structured databases is formed. The structures of these databases on the local project servers are not changed for now; instead, the data are harmonized on the central server by means of Dynamic Data Ingestion, according to a jointly defined ontology.

The main function of the DDI module is that the user (researcher) can determine which database field of a data source is stored in which field of the central database. Once the researcher has made these assignments (and as long as the data structure of a data source is not changed), data from very different sources can be dynamically collected in a central database following the principle of a 'spider'. The process is dynamic rather than static because the DDI module allows the mapping of the database fields to be adjusted easily and flexibly at any time. This newly created data pool will then be searchable by full text, categories, keywords, time and space. The server also functions as a buffer for all data so that the project databases do not have to be queried live, which is not recommended due to possible downtime and/or higher latency of the external project servers. In principle, the data sources / projects involved only have to fulfil a few requirements. They must publish their data in a documented, structured and permanent form as Linked Open Data via an API, preferably in JSON-LD (JavaScript Object Notation for Linked Data) and according to the FAIR data principles. Furthermore, the participating projects must commit themselves to a common ontology, for example the established standard CIDOC CRM (Conceptual Reference Model), which can be extended for historical research. The presentation will also show to what extent the DDI module can be used with and for the Heloise Common Research Model.
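The field assignment described above can be pictured with a minimal Python sketch. It is an illustration only and not the DDI module itself: the source names, field names, the `FIELD_MAPPINGS` table and the `ingest` function are all hypothetical, and real sources would be fetched through their APIs rather than given inline.

```python
from typing import Any

# Hypothetical field assignments chosen by the researcher:
# source name -> {field in the source database: field in the central database}
FIELD_MAPPINGS: dict[str, dict[str, str]] = {
    "source_a": {"surname": "family_name", "born": "birth_year"},
    "source_b": {"nachname": "family_name", "geburtsjahr": "birth_year"},
}


def ingest(source: str, record: dict[str, Any],
           mappings: dict[str, dict[str, str]] = FIELD_MAPPINGS) -> dict[str, Any]:
    """Copy the mapped fields of one source record into a central record."""
    mapping = mappings[source]
    central = {target: record[name] for name, target in mapping.items() if name in record}
    central["source"] = source  # keep provenance so records stay traceable
    return central


# Central pool acting as a buffer; the project servers are not queried at search time.
pool = [
    ingest("source_a", {"surname": "Doe", "born": 1500, "note": "unmapped"}),
    ingest("source_b", {"nachname": "Muster", "geburtsjahr": 1490}),
]
```

Because the mapping is plain data, adjusting an assignment means editing one entry in the table and re-running the ingestion; the source databases themselves stay untouched.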

Nov, 11th: Museum of the Printing Arts Leipzig, Nonnenstraße 38, 04229 Leipzig

17:00 - 21:00 Culture and Scientific Collaboration Event

Leipzig is a city with a rich tradition of bookmaking and publishing. The Museum of the Printing Arts Leipzig keeps these traditions alive.

The outstanding feature of this museum is that all appliances, tools and machines are not presented as mute testimonies to their time, but as vivid, working demonstrations of a wide range of techniques. Hands-on experimentation plays a major role, making the museum ideal as a platform for courses and workshops.

During the event we will have a guided tour of the museum in English and a practical workshop on using a printing press. The evening ends with a buffet (by Rasselbock Catering) in the workshop hall between the machines.

Nov, 12th: Bibliotheca Albertina, Beethovenstraße 6, 04107 Leipzig

09:00 - 10:30 Session 4

Susanne Arndt Collaborative Terminology Work in Mobility and Transport Research - what do we need? (TIB - Leibniz Information Center for Science and Technology, Germany)

The Specialized Information Service for Mobility and Transport Research (Fachinformationsdienst Mobilitäts- und Verkehrsforschung: FID move) provides scientists from various fields who are concerned with mobility and transport issues with services for various research activities: it offers a map of researchers in all of Europe, provides access to domain-specific research items, supports scientists in the field of research data (management), and gives advice and a platform for making open access contributions. Satisfying the information needs of domain specialists, however, requires deep subject indexing of each resource. This is possible by using domain-specific vocabularies, which do exist but are not as conceptually deep and rich as required. The FID move project therefore follows an approach of vocabulary re-use and interlinking as well as enrichment and development of vocabularies. The resulting multi-resource vocabulary shall not only aid FID move services but shall also be a central terminology service for scientists. As such it requires possibilities for participation and discussion by domain experts who are not necessarily familiar with terminology work or semantic web standards. At the same time, it needs to provide mechanisms for quality control in this community-based terminology workflow. The talk will present several theses stating what is needed to establish collaborative terminology work among libraries and mobility and transport researchers.

André Valdestilhas More complete result set retrieval from large heterogeneous RDF sources (University of Leipzig, Germany)

11:00 - 12:30 Session 5

Natanael Arndt Semantic Web–Collaboration and Tools (Institut für Angewandte Informatik e.V., Germany)

Collaboration and exchange are key factors for the success of the Web. Currently the Web is evolving into a Web of Linked Data, which comes with a necessity and trend for re-decentralization. In this talk I will give a short introduction to the vision of the Semantic Web and will present some tools to support collaboration and exchange of information.

Stefania Zucchini Research and data sharing's strategies in the Onomasticon Database (Università degli Studi di Perugia, Italy)

The talk concerns the choices made in the Onomasticon database regarding the definition and presentation of the data. We will also consider the possibility of sharing our data in a broader project with other research groups.

The talk will begin with a brief presentation of the Onomasticon database, illustrating its chronological and geographical coordinates, overall structure and general purposes. Then, the different kinds of information will be illustrated, with reference to the extremely heterogeneous documentary and bibliographic sources from which they come.

Documentary sources include, for example, the students’ and/or teachers’ matricula, records of faculty salaries, the decisions of the city councils about the university and individual professors (the Perugia Studium was born as a municipal university, managed for a long time by the common citizenship), degree diplomas, registers relating to the current administration of different student colleges, and acts of the corporations of professional jurists and doctors.

In consideration of the strong heterogeneity of data, different also on a graphic level, the research group had to make a series of choices regarding:

In general, it was considered appropriate to use a main onomastic form accompanied by as many aliases as there are attested graphic and linguistic variants. We have opted for standardized Latin for personal names (using the vernacular only when there is no other attestation, not even in historiography); Italian was chosen for the disciplines and teachings as well as for all the free text fields in the DB, and for place names we chose the national languages, together with the GPS coordinates.

The presence of aliases and the decision to insert a limited number of categories of information, together with the definition of specific search keys, were intended to facilitate searching inside the DB and the possibility of data exchange. In conclusion, we opted for a reduction of information, categorized within a deliberately limited ontology, in order to favour the automatic processing of data and consequently an overall reading of the available serial information, from multiple points of view and through different forms (for example with graphs and maps). The ultimate aim is to incorporate the traditional bio-prosopographic approach into a choral dimension.
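The idea of one main onomastic form plus aliases serving as search keys can be pictured with a small Python sketch. It is purely illustrative and does not reflect the actual Onomasticon schema: the `PersonEntry` class, the helper functions and the placeholder names are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class PersonEntry:
    main_form: str                                     # standardized main onomastic form
    aliases: list[str] = field(default_factory=list)   # attested variant spellings


def normalize(name: str) -> str:
    """Case-fold and collapse whitespace so lookups ignore trivial differences."""
    return " ".join(name.casefold().split())


def build_index(entries: list[PersonEntry]) -> dict[str, PersonEntry]:
    """Map the main form and every alias of each entry to that entry."""
    index: dict[str, PersonEntry] = {}
    for entry in entries:
        for form in (entry.main_form, *entry.aliases):
            index[normalize(form)] = entry
    return index


# Placeholder names for illustration only; they are not taken from the database.
entries = [PersonEntry("Johannes de Exemplo", ["Giovanni da Esempio", "Iohannes de Exemplo"])]
index = build_index(entries)
```

A search for any attested variant then resolves to the same entry, which is what makes the alias list useful both for searching and for matching records against other databases.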

Although the space-time coordinates of the DB are very narrow, the data standardization has encountered some obstacles, deriving from the heterogeneity of the sources mentioned above, but also from the difficulty of working on data that are numerically not very significant (especially in the case of persons not otherwise known), generic (e.g. a generic indication of origin, ‘Alamannus’, instead of the place of origin) or still incomplete.

The last part of the talk will be dedicated to these difficulties and to the strategies implemented or designed to overcome them, in the hope of contributing food for thought to the collective work that has long been devoted to establishing a connection among our databases.

13:30 - 15:00 Session 6

Yannis Delmas The Atlas Historique de la Nouvelle Aquitaine (Université de Poitiers, France)

The Repertorium Academicum Pictaviense is expected to participate, with other databases, in the Atlas Historique de la Nouvelle Aquitaine, an ongoing project. We will present the new factoid data structure, derived from CIDOC-CRM. Here the database would be linked to a document repository and geographical mappings.

Stefan Kühne, Ulrich Johannes Schneider and Thomas Riechert Closing Session

Accepted Papers

Nov, 13th: Hochschule für Technik, Wirtschaft und Kultur, Gustav-Freytag-Str. 42, 04277 Leipzig

09:00 - 11:00 Héloïse Advisory Board Meeting