In October and November, 2016, many researchers from InterPARES Trust took the opportunity to read and comment on “Records in Context”, the draft description standard put forward by the Experts Group on Archival Description (EGAD) committee, formed by the ICA Programme Commission in late 2012.
Records in Context (RiC) has been presented by EGAD as a conceptual model for archival description, based on four current ICA descriptive standards, and employing formal information modeling techniques.
The comments that follow are compiled from individual assessments of RiC from among the InterPARES Trust international research teams, and approved by the regional team directors for submission to EGAD.
The comments are organized as follows:
- General comments about the process of development (covering foundation, language, transparency, inclusivity)
- General comments about modeling and methodology (covering methodology, role of archivist, role of ontology, implementation, and users)
- Concluding remarks
- Appendix – Selected comments on specific clauses and references
We would like to begin by recognizing the enormous amount of work done to date, and thanking EGAD and the ICA for undertaking this difficult and important task. We offer these comments in a spirit of constructive collaboration.
To begin, we believe that the work on Records in Context (RiC) was not sufficiently communicated to the archival community during its earlier phases of development. As a consequence, the first contact most archivists have with the standard is with a mature draft, developed by EGAD over the course of two years. Timely and more frequent presentations to the broader archival community would have facilitated development to this point, and it is hoped that further consultation will help the continued development initiated by the publication of this draft of the standard. Furthermore, it is difficult to deliver comprehensive comments within three months on a substantially complete product. Nor is it encouraging to be asked for comments on something that is nearly complete – this is akin to building a house and then asking what should be changed.
2. General comments about the process of development of the model/standard
2.1 Use of other standards
RiC-CM is presented as having incorporated the four existing ICA description standards: ISAD(G), ISAAR(CPF), ISDF, and ISDIAH. However, this initiative started with no analysis of either the actual level of application of the ICA standards in different countries or their major critical issues. Such an analysis is crucial in deciding whether to integrate a particular standard into the new model, and how it should be integrated. For example, ISDIAH is nearly unknown in the world. It would therefore be advisable to evaluate what this fact means for the development of a new model or standard. Similarly, the other three standards are not widely adopted in many countries. Why not? Is this indicative of the use of ICA standards in general? Or is it due to the presence and adoption of some national standard? Or is it a generic difficulty in adopting a standard model? Is there something that can be tweaked in the ICA standards to support their adoption? If so, what would that be? RiC-CM does not consider any of these preliminary questions, as it takes for granted (to its detriment) that the model must build on the four existing standards.
2.2 Inclusivity and transparency of process
RiC-CM has been developed without significant input from Africa and Asia (there is no Asian representative, and a single member from Ivory Coast represents the whole of Africa). There is also an evident imbalance among the countries represented: there are many representatives from Europe, and some countries are represented by more than one participant. For example, there are two representatives from Italy, two from the United Kingdom, two from Australia, two from Spain, and two from the USA. As a result, there are 20 members representing 13 countries. Do the countries represented by two members have two votes when decisions are made?
The criteria for selecting the different members of EGAD have not been published and are not clear. Neither is it clear why representatives from different continents have not been involved on an equal basis. The development process of RiC-CM appears, therefore, to be objectively neither transparent nor fair. This poses a major challenge for RiC-CM to be recognized as a standard, since it lacks some fundamental features of any standardization process.
2.3 Language
In RiC-CM it is stated that “once the model is stable, it will be translated into French and other languages.” We understand that French has been mentioned explicitly because French and English are the working languages adopted by ICA, so this is a default statement. However, RiC-CM aims at becoming a professional standard that should be adopted worldwide, so the focus on French appears neither necessary nor adequate. The most spoken language in the world is Mandarin (more than 900 million people, about 14% of the world population); the second is Spanish (about 400 million people, nearly 6% of the world population). English, Hindi, Arabic, Portuguese, Bengali, Russian, Japanese, and Punjabi follow. French is not among the first ten languages; it comes after German and Korean, and the number of native French speakers is about 1% of the world population. The figures are not precise, but their meaning is very clear.
2.4 Model or standard?
There seems to be some confusion as to whether this is a conceptual model or a standard for archival description. The long account of the value of records at the beginning of the text is nice, but perhaps too long. The role of a standard like this is not to convince people that records are valuable, but to show how to present that value through archival description. In a similar vein, context is important, but the part of the standard dealing with contextual entities may be overstressed. This part of the specification could be made more succinct, as has been done with ‘record’ entities (record, record set, record component).
3. General comments about modeling and methodology
3.1 Methodology
In general, the development of any project, product or service should be carried out according to the well-known Deming cycle (PDCA, i.e., Plan, Do, Check, Act). Given that the ISO records management standards (30300/30301 and hence in principle 15489) adopt the Deming PDCA approach, it is extremely disappointing that the approach has not been adopted here. The initiative carried out by EGAD presents a serious methodological issue, since there has been no Check action before proceeding with the Act phase. The result is that EGAD risks reconciling, integrating, and building on concepts and models that themselves have critical issues, or that are not used. As a consequence, RiC-CM may perhaps look new, but it will embed the same old problems associated with the standards assumed as its basis.
3.2 Role of archivist as an entity
RiC-CM does not consider ‘archivist’ as a core descriptive entity, whereas ‘archivist’ is THE descriptive entity by definition, that is, the entity/subject who describes the object under analysis. In archival description, statements about entities, properties, and relationships are assertions made by archivists, and all such statements should be represented as assertions made by named individuals in specified contexts, not as autonomous or context-free facts.
This is not a problem related to the draft status of the document, that is, related to the fact that EGAD did not have the time to analyze this specific dimension—the document explicitly says that EGAD will work on this aspect as a next step. This is a methodological problem. In fact, RiC-CM appears totally unaware of the international debate – raised by postmodernist voices and then discussed by the broader community – on the role of archivists as mediators, and on the value of their professional action, supporting and guaranteeing the authenticity of records on the one side, shaping the cultural memory hence the identity of communities on the other side. In other words, it seems impossible to define a model of descriptive elements if a model of the archivist’s role is not defined. Instead, EGAD considers this as an action that can be postponed.
It bears stating also that description is not always the work of archivists. Increasingly, in interactive online environments users are asked to contribute and descriptions need to accommodate multiple perspectives. When two or more users (or indeed two or more archivists) examine the same entity, they are likely to view it in different ways and see different relationships between it and other entities. But there appears to be nothing in the standard that would support representation of different and possibly conflicting viewpoints; indeed, EGAD does not reflect the movement toward user participation. Nor does it allow for describing situations of uncertainty; in practice, describers cannot always identify entities, relationships, etc. with total certainty. It is important that the standard takes account of this. These issues need to be addressed from the beginning and cannot be satisfactorily added at a later stage.
3.3 The role of an ontology and development of RiC-O
We believe that the foundations of the conceptual model are seriously flawed due to the fact that the members of EGAD chose to jump to developing a conceptual model, without first developing an ontology or referencing some pre-existing upper level ontology, that is, an explicit formal representation of a domain and the relationships within it (e.g., Bunge or Searle (or both); see, Lemieux, Victoria L. “Toward a ’Third Order’ Archival Interface: Research Notes on Some Theoretical and Practical Implications of Visual Explorations in the Canadian Context of Financial Electronic Records.” Archivaria 78 (2014)). This type of ontology is a necessary precondition to a clearly specified conceptual model with ontology as a technical artefact – i.e., a representation in RDF OWL. Some further reference to the literature on knowledge representation theory, ontology theory, and semantic web would likely help clarify the authors’ understanding and lead to greater clarity in their model.
As a result of having no upper level ontological anchor, it is not clear why certain things are first order ontological ‘entities’ and other things are mere ‘properties’ of entities. The authors also do not clearly differentiate between the record (or record sets), what they represent (i.e., functions, activities, etc.), and what is represented about them (i.e., archival description). This surely must be a foundational distinction, as it is important not to confuse the thing itself with our description of it, even while recognizing that on some level the description of the thing may so fundamentally alter the identity of the thing as to give rise to a new ontological thing. For a discussion on how this can be done see, Lemieux, Victoria, and Lior Limonad. “What ‘good’ looks like: understanding records ontologically in the context of the global financial crisis.” Journal of Information Science 37.1 (2011): 29-39.
We take issue with the suggestion that hierarchies need to be replaced with graph-based representations. Hierarchies, networks, and matrices are all types of graphs that can be expressed mathematically using the same formulation, G=(V,E): a graph comprises a set of vertices (nodes) V and a set of edges (links) E, which may be visually represented as a node-link (network) diagram, a hierarchy (or tree), or a matrix. In other words, hierarchies are already graph-based, and can be transformed visually from hierarchies into networks while preserving their underlying structural semantics as graphs.
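The point can be illustrated with a minimal sketch in Python. The vertex labels (a fonds, two series, three files) and the helper names `is_tree` and `adjacency_matrix` are hypothetical and chosen only for illustration; they are not drawn from RiC-CM. The same pair (V, E) is checked as a hierarchy and then rendered as a matrix.

```python
# A hypothetical archival hierarchy expressed as a graph G = (V, E).
V = {"fonds", "series A", "series B", "file A1", "file A2", "file B1"}
E = {
    ("fonds", "series A"),
    ("fonds", "series B"),
    ("series A", "file A1"),
    ("series A", "file A2"),
    ("series B", "file B1"),
}


def is_tree(vertices, edges, root):
    """Return True if (vertices, edges) is a rooted tree: it has exactly
    |V| - 1 edges and every vertex is reachable from the root."""
    if len(edges) != len(vertices) - 1:
        return False
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return seen == set(vertices)


def adjacency_matrix(vertices, edges):
    """Render the same graph as a 0/1 matrix (rows = parents, columns = children)."""
    order = sorted(vertices)
    index = {v: i for i, v in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for parent, child in edges:
        matrix[index[parent]][index[child]] = 1
    return order, matrix
```

Adding a cross-link such as ("file A1", "file B1") to E leaves the structure a perfectly valid graph but no longer a tree, which suggests that the hierarchy is a special case of the graph rather than an alternative to it.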
Finally, we note that the document does not explicitly mention other semantic models (e.g., PROV-O, which is being used to represent provenance information in the research data community), even though it does mention the need to describe archival material in relation to other systems of description developed in libraries, museums, etc. We wonder if such an apparently ‘isolationist’ stance is good for the archival profession.
3.4 Comments about implementation of the model
We express a fundamental concern that the EGAD group is basing the conceptual model on technologies (graph databases) that are not well known or understood by the implementation communities that will rely on the standard. It has already been very difficult to implement the much better-understood relational database technologies (which have a very large developer pool) for existing standards.
In short we are concerned that the group has not given much thought to how the standard could actually be implemented either as a descriptive or access technology, given the likely resources that are/will be available for that task. To that end we believe that implementation of description, migration of existing data and formatting to finding aids should be suggested, prototyped and tested before the process ends.
3.5 Comments about users of the model
“Finally, RiC is intended to be of interest to the research users of archives, in particular to scholars interested in reusing archival records. Though RiC primarily focuses on description that is based on archival principles and responsibilities, it may be used to support scholarly descriptions of individual records or sets of records that are based on other perspectives and requirements.”
Except for the above lines (found on pp. 2-3), users are not present at all in RiC-CM. Description serves two user groups: the archivists, as managers of the records, and patrons as consumers of the content of the records. To date, archivists have largely used one tool for both purposes (although the accessioning database is often an internal management tool not accessible to the public, but systems like AtoM integrate accession and descriptive systems).
The role of users has increasingly been a subject of investigation in the scientific literature of these past years. New technologies offer new and unimagined possibilities of interaction with finding aids, suggesting the need to reconsider and redefine the role of finding aids on the one side, and the role of users on the other side. Users must be a primary consideration of any project dealing with description. This focus on users should be preliminary to any definition of description elements. Without a thorough analysis and understanding of the targeted audience – that is, the nature and characteristics of the audience – the model would be, inevitably, inaccurate, if not completely wrong.
Also, it must be noted that the focus on users’ roles is one of the major issues in the scientific literature of these past years, so one would expect that – if not for technical reasons – EGAD would consider this dimension to show awareness of the scientific literature and to produce an up-to-date document. This is not something that can be postponed. It should be embedded in the model as a preliminary and foundational step. As an example in this respect, the IFLA FRBR LRM (Library Reference Model) devotes a whole section to users (Chapter 3: Users and User Tasks), where user tasks are clearly identified (Find, Identify, Select, Obtain, Explore). We are not suggesting here that EGAD adopt this same categorization. We merely highlight that this is the approach that should be adopted – to identify users and their roles before modeling classes, properties, relations and such.
4. Concluding remarks
In short, we find that RiC-CM is weak as a model, in that it neither defines the structures it uses (entity, property, relation) nor provides a rationale for their use. A conceptual model should identify and define the fundamental bricks used to build the model. If the difference between an entity and a property is not relevant, introducing and using such bricks is not only useless but also misleading. One may wonder why not use a single category, say, Information Element.
Ultimately, the document fails to adequately address a model for discovery of archival resources, a model that accommodates multiple users and uses. See, for example, Charles Ammi Cutter’s objects and means for a bibliographic catalog. He states in abstract terms the purpose of the catalog (its object[ive]s, a strategy for discovering books). Then he moves to tactics (means) to achieve those purposes. Although EGAD presents RiC as a conceptual model, and therefore technology-agnostic, RDF (which may be a very powerful and useful tool) is the environment that nurtures it. But, without a clear strategy – or tactics – as to how that tool should be used, it is of limited value. Perhaps EGAD has assumed these strategies were commonly accepted and understood by the professional community. However, that has not been our experience.
EGAD and ICA should re-start the development process on a new, transparent and fair basis, publishing the criteria for selecting the countries and their representatives, and making a public call for participation. From that point, much of the work done can be saved, but it has to be the outcome of a fair process, starting with a clean slate.
This is an opportunity for a radical change, and the adoption of a different attitude towards standards development. Standards should be the result of a transparent and inclusive process. The ICA could show the international community that there is a different approach, a different way of leading these processes. This may encourage the many people and groups who are unsatisfied with the status quo methods of designing what is supposed to be a professional standard to speak out, to participate. Some critical voices have already been raised. We are raising our voices too, and we will encourage professional associations and groups to ask for a fair and transparent process aimed at developing a new professional standard.
Thank you for this opportunity to comment. We hope that you find our comments helpful, and that they will be addressed.
5. Appendix: Selected comments on specific clauses or references
Page 11: “Description of the Records contained in a Record Set is further differentiated into two categories: summary description of the contained Records (for example, a date range for the span of time within which the contained Records were created), and the shared properties or relations the Records have that designate them as members of a Record Set (for example, all contained Records document the same Function, or all share the same Documentary Form). […] The summary properties are not properties of the contained Records as such, but an overview of them, reduced to an abstract. The shared properties or relations recorded at the level of the Record Set, however, are legitimately properties or relations of each of the member Records of a Record Set.”
1) Editorial note: here and elsewhere the document uses the term “property” in place of “value”, which is a clear mistake. It is not the properties that are shared, it is their values. Properties are categories—they are shared by definition, if they can be applied to all individuals of a class.
2) The above distinction (summary properties and shared properties) is not completely consistent with the list of properties presented in the subsequent pages. In fact, section 3.4 lists the “Properties of Record Set” (pages 26-28), section 3.5 (pages 28-29) lists the “Properties Summarizing the Members of a Record Set”, and section 3.6 (pages 29-30) lists the “Properties Shared by All Member Records of a Record Set”. It is not clear what the nature of the properties listed in 3.4 is. Such properties are: P22 Authenticity and Integrity Note, P23 Type, P24 Accrual Note, P25 Accrual Status, P26 Arrangement, P27 Classification, P28 History. They seem to be, by all means, “properties summarizing the members of a record set”, yet they are put in a different section, so that formally they are neither summary properties nor shared properties.
3) The above distinction is not very clear. Summary properties are presented as being an overview of all the Records included in the Record Set, whereas shared properties have the same value for both the Record Set and its Records. However, summary properties too may have the same value for both Records and Record Set. For example, P31 Scope and content is defined as a summary property of the Record Set; nonetheless, its value may well be the same for all Records in the Record Set. This would indeed be a peculiar situation, but the general question is: why differentiate between shared and summary properties? Why not simply list the properties? RiC-CM does not say a word on the rationale for such a distinction. Perhaps the intention is to identify those properties whose value is inherited by lower levels. In that case, rather than considering a property as summary or shared per se, it would be much simpler and more useful to eliminate any distinction and create one single property – say, Px Shared – to identify those properties whose value holds for both the Record Set and its Records. This way, for example, P31 Scope and content would describe the scope and content of the Record Set. If its value is the same for all the Records in the Record Set, the boolean property Px Shared may be set to Yes, so that we would all know that such value holds for all Records in the Record Set.
4) This distinction gives rise to a further issue: the lack of summary properties for describing some aspects of the Record Set.
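The single-flag alternative suggested in point 3 can be sketched as follows. This is only an illustration under stated assumptions: the class names, the `shared` field (standing in for the hypothetical Px Shared property) and the `effective` helper are invented for this example and do not appear in RiC-CM.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyValue:
    name: str             # e.g. "Scope and content"
    value: str
    shared: bool = False  # hypothetical "Px Shared": value holds for every member Record


@dataclass
class Record:
    title: str
    own: dict = field(default_factory=dict)  # values stated directly on the Record


@dataclass
class RecordSet:
    title: str
    properties: list
    members: list

    def effective(self, record, name):
        """Value of `name` for a member Record: the Record's own value if
        stated, otherwise a Record Set value flagged as shared, else None."""
        if name in record.own:
            return record.own[name]
        for prop in self.properties:
            if prop.name == name and prop.shared:
                return prop.value
        return None
```

With one list of properties and one flag, a value such as a language can be stated once at the Record Set level and read as applying to each member, while a Record Set value that is not flagged (an overview, in the draft's terms) is never attributed to individual Records.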
Page 31: “Additional Property Specific to Person and Person Assumed Identity”.
Person is the value of a property, not an entity. Therefore, the model establishes that certain properties apply only when another property (P32 Type) assumes a specific value, which is a bit baroque. This comment is related to the one above: it is not clear why Person is modelled as a property rather than as an entity. Modelling Person as an entity seems the straightforward and effective choice; for example, the IFLA FRBR LRM (Library Reference Model) models Person and Collective Agent as subclasses of Agent.
We are not arguing that modeling Person as a property is wrong. We are suggesting that such a choice should have a rationale, and such rationale may perhaps be found in the definitions of the structures used to build the model. In the absence of these definitions, everything looks very vague and ambiguous.
The properties of Record (pages 22-26) are grouped into four categories: Content, Representation, Carrier, Management and use. These categories are neither defined nor introduced in the document, so they should at least be introduced. However, they are not really needed, so they could also simply be deleted.
Definition of “record”
Page 13: “E1 Record: Linguistic, symbolic, or graphic information represented in any persistent form, on any durable carrier, by any method, by an Agent in the course of life or work events and Activities.”
• “Linguistic, symbolic, or graphic”. These categories are not disjoint. Assuming that “linguistic” is used to mean “textual”, any text is written using symbols that can be alphabetic symbols, ideograms, pictograms, etc. In short, any textual representation is a symbolic representation.
• “Linguistic, symbolic, or graphic”. These categories do not cover all possibilities—what about audio records? They convey information that is neither linguistic nor symbolic nor graphic.
• “Linguistic, symbolic, or graphic”. The way it is presented, it seems that a record should belong to one of those categories, whereas it may belong to all of them at the same time.
• “Linguistic information” sounds like “information about linguistic”. Information is just information—something abstract that needs to be expressed in some way to be conveyed. It is the expression/representation that is “linguistic, symbolic or graphic” (to use the categories of RiC-CM). So it should be phrased: “Information represented in a textual, symbolic or graphic way”.
• This definition does not clarify whether the information is the one on the carrier or the one meant to be conveyed. In other words, any digital file (be it a picture, a text, a sound or whatever) may always be considered symbolic information – or rather, information represented in a symbolic way – because at the ground level it is a sequence of bits. However, we may also say that a GIF file is information represented in a graphic way, because further decoding of the file from the bit level produces an image. RiC-CM is a conceptual model, so it should address this issue and clarify explicitly what “representation” means.
• “information represented in any persistent form, on any durable carrier”. It is not clear what the difference between persistent form and durable carrier is. If such a difference exists, it should be explicitly stated. If it does not, either persistent form or durable carrier should be dropped.
• “by any method”. In general, definitions should be kept to the minimal level needed to identify the related thing/concept, since any redundant addition simply generates noise. If we say that “books are things that can be read” we are defining a thing; if we say that “books are things that can be read by anyone” we are not adding anything: in the absence of further attributes/properties, we already assume that books can be read by anyone. As long as it is “a thing that can be read”, that is a book according to the definition. Coming to RiC-CM, “by any method” does not add anything; rather, the user may ask what a method is. Since it is redundant and generates noise, “by any method” should be dropped.
• “in the course of life or work events and Activities”. Same as above: this is redundant and generates noise. Also, the categories of life on one side, and work events and activities on the other side, are not disjoint, so the whole thing sounds inaccurate. More precisely:
• if the Agent is an individual or a group of people, they obviously can act only “in the course of life” so the refinement is useless;
• if the Agent is an organization, it acts “in the course of […] work events and activities”, so the refinement is useless;
• if the Agent is a piece of software (delegate-agent), it acts “in the course of life or work events and activities”, so the refinement is useless.
As a consequence of the above considerations, the definition of record would be something like: “Information represented by an Agent [in some way], in a persistent form”, which is actually the essence of the definition in RiC-CM. This leads to a final consideration: RiC-CM is a model for archival description. There is a huge amount of literature on foundational concepts like information, document, and record. The definition adopted by RiC-CM may well be a step further towards a broader and shared understanding of the concept of record. However, it is surprising that a conceptual model created by archivists does not consider this issue at all, and does not highlight this shift from a traditional definition towards a definition where the boundaries between information, document and records are blurred. It should also be noted that concepts and definitions may indeed be refined in order to make them applicable to a wide variety of contexts, but perhaps it is possible to build on the theoretical reflection of the past years, rather than subtracting more and more from the traditional definitions, to the point that they become nearly void.
Page 31: P36 Gender
The presence of this property in RiC-CM, along with the values it can assume, is very surprising. In fact, while sex refers to a biological dimension, gender refers to a socio-relational dimension including but not limited to sex. It is not very clear how it is possible for a third party to identify the gender of someone else. However, this is not the major issue. The issue is that P36 Gender is a controlled term, and the values it can assume are “male”, “female”, and “unknown”. As a matter of fact, there is something different from male and female, or rather, there are nuances beyond male and female, and they have an identity that cannot be called unknown: it is simply “other” than “male” or “female”. Archivists should be aware of the complexity lying behind the description of such a dimension, so it is a bit surprising to see how little care has been put into it, especially when compared to other models that are not supposed to have the depth of an archival model. For example, the FOAF (Friend-of-a-Friend) model uses the Gender property too, but 1) it is a testing property, 2) there are a lot of caveats, and 3) “the value is typically but not necessarily ‘male’ or ‘female’. Values other than ‘male’ and ‘female’ may be used, but are not enumerated here.” It may be worth looking at their long and nuanced description of Gender: (http://xmlns.com/foaf/spec/#term_gender).
Page 29: “Properties Shared by All Member Records of a Record Set. Editor’s note: The following properties, which have the same definitions as those given for Record, may be used when they are shared by all member Records of a Record Set:
• P6 Content Type
• P10 Encoding Format
• P11 Language Information
• P12 Media Type
• P13 Production Technique
• P14 Medium
• P17 [sic] Conditions of Access
• P18 [sic] Conditions of Use
• P20 [sic] Record State.”
1) RiC-CM states that these properties “may be used when they are shared by all member Records”. However, it is not clear how to handle the case in which these properties are not shared by all members; in fact, there is no other property that can be used. For example, a file may be freely accessible except for one record. In this case, it is not possible to use the property P18 Conditions of access, since this property is not shared by all member records. Where, then, should information on the Conditions of access of the Record Set (i.e., the file) be recorded? The same applies to the language, the media type, and all other so-called shared properties: where should such information be recorded when the values are not shared at the Record level?
2) Vice versa, P8 Quality of information and P16 Physical Characteristics are the only two properties of Record that cannot be used for Record Set. It is not clear why RiC-CM does not allow for a Record Set whose Records all have the same value of P8 or P16, so that P8 and P16 can be applied to the Record Set as a whole.
3) In general, the issues highlighted here once again show that the distinction between summary and shared properties seems to raise more issues than it solves.
Page 21: “Properties of entities”
It is not clear why some things have been identified as entities and some others as properties. For example, why is the name of an entity a property (P3 Name) rather than an entity itself, which seems a less functional solution? Global and local identifiers may be defined as entities and linked through some relation Rx Identifying, so as to use only entities and relations, rather than introducing properties.
We have been told by some members of EGAD that what is defined as a property in the model may be serialized as a class, so the distinction between property and entity is not relevant. If confirmed, this raises a more general issue.
RiC-CM does not provide any explanation of the concepts of entity, property and relation. By contrast, another relevant conceptual model in our domain, CIDOC CRM, devotes quite some space to explaining, among other things, the concepts of class (that is, entity in RiC-CM) and property (that is, relation in RiC-CM). In passing, it is interesting to note that this conceptual model does not make use of properties (in the RiC-CM sense).
5.3 Use of present versus past tense
Page 39: “Almost all relations are expressed in present tense, which permits describing either permanent traits or current situations. However, describing archival context often requires quoting past situations and non-permanent traits. For example, ‘J.F. Kennedy occupies position president of the United States of America’ is not currently true; so this relation should be stated using the past tense: ‘J.F. Kennedy occupied position president of the United States of America’. Also, ‘Record is held by Agent’ refers to a Record in the current custody of an archives, while ‘Record was held by Agent’ means that the Record was held, only at a certain time, in the custody of an Agent.”
This solution requires descriptions to be reviewed continuously. For example, if a record is passed to a different agent, the statement “Record is held by X” must be changed to “Record is held by Y”, and a new statement “Record was held by X” must be added. This is not only a bit baroque, but also unnecessary, since there is another RiC-CM construct that is perfectly adequate to manage these situations. In fact, P68 Date is a property associated with any relation in the model. It identifies “the Date (range or time) when there is a relation between two entities.” Therefore, there is no need to use past tenses, that is, different relations to describe past situations, which creates an overload of relations. All relations should be expressed in the present tense. Their contextualization in time is provided by the property P68 Date.
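A minimal sketch of this alternative follows. The `Relation` class, its `start` and `end` fields (standing in for a P68 Date range) and the two helper functions are assumptions made for illustration only; RiC-CM does not prescribe any such implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Relation:
    subject: str
    predicate: str                # always stated in the present tense
    obj: str
    start: Optional[date] = None  # stands in for the P68 Date range
    end: Optional[date] = None    # None means the relation still holds

    def holds_on(self, day):
        """True if the relation is valid on the given day."""
        after_start = self.start is None or self.start <= day
        before_end = self.end is None or day < self.end
        return after_start and before_end


def transfer_custody(relations, record, new_holder, when):
    """Close the date range of the current custody relation and add a new
    one; no past-tense predicate is created and nothing is renamed."""
    for rel in relations:
        if rel.subject == record and rel.predicate == "is held by" and rel.end is None:
            rel.end = when
    relations.append(Relation(record, "is held by", new_holder, start=when))


def holder_on(relations, record, day):
    """Return the Agent holding the record on a given day, if any."""
    for rel in relations:
        if rel.subject == record and rel.predicate == "is held by" and rel.holds_on(day):
            return rel.obj
    return None
```

In this sketch a change of custody only closes one date range and opens another. The existing statement keeps its wording, and past and current custody are answered by the same query against the dates.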
5.4 Use of controlled vocabulary in describing properties
The entities E5 Occupation, E6 Position, E7 Function, E9 Activity, E10 Mandate, E11 Documentary form all have three properties: Px Type, Py Description, Pz History. The Px Type properties all have the same semantics, yet they are distinct properties, that is, they have different identifiers. The same holds for the Py Description properties and the Pz History properties. In other words, there is a property P43 Description to describe Occupation, a property P46 Description to describe Position, a property P49 Description to describe Function, and so on.
Such an approach may perhaps make sense when a property assumes controlled values, and those values depend upon the entity that is in play. This is the case of Type (P42, P45, P48, etc.), whose datatype is Controlled Term. However, the datatype of both Description and History is Text, so there is no reason to create so many different properties: it is enough to create a property Description and a property History, and establish that they can be applied to the entities Occupation, Position, etc. Of course, this means that the definitions of both Description and History should be broad enough to be applicable to the different entities. This would simplify the model considerably.
The same comment applies to the entities E1 Record and E3 Record Set. The following properties are defined identically, and their datatype is Text (except for P17/P27, whose definitions are nearly identical and whose datatypes are textual/controlled):
• P5 Authenticity & Integrity Note (Record) // P22 Authenticity & Integrity Note (Record Set)
• P7 Content Extent (Record) // P29 Content Extent (Record Set)
• P9 Scope & Content (Record) // P31 Scope & Content (Record Set)
• P15 Physical & Logical Extent (Record) // P30 Physical & Logical Extent (Record Set)
• P17 Classification (Record) // P27 Classification (Record Set)
• P20 History (Record) // P28 History (Record Set)
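The simplification suggested in this section can be sketched as one property definition with a declared list of entities it applies to, instead of one identifier per entity. The `PropertyDef` class, the `applies_to` field and the `assign` helper are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyDef:
    name: str
    datatype: str
    applies_to: frozenset  # entity types the property may describe


# One Description property instead of P43, P46, P49, and so on.
DESCRIPTION = PropertyDef(
    "Description",
    "Text",
    frozenset({"Occupation", "Position", "Function",
               "Activity", "Mandate", "Documentary Form"}),
)

# One Scope & Content property instead of P9 (Record) and P31 (Record Set).
SCOPE_AND_CONTENT = PropertyDef(
    "Scope & Content", "Text", frozenset({"Record", "Record Set"})
)


def assign(entity_type, prop, value):
    """Attach a property value to an entity type, rejecting combinations
    that the property definition does not allow."""
    if entity_type not in prop.applies_to:
        raise ValueError(f"{prop.name} does not apply to {entity_type}")
    return (entity_type, prop.name, value)
```

Properties whose controlled values depend on the entity, such as Type, could still be kept separate under this arrangement.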