Re-expression of requirements using uniform template: Section B

-- DavidGiaretta - 11 Aug 2008 - r73 - proposed changes to clarify Preservation Policies - Preservation Strategic Plans - Preservation Implementation Plans

To fill in the template for a requirement, simply click on the "Edit" tab at the top of the page and type text to replace the "..." where they appear under each requirement. When you have finished, click on "Save" at the bottom of the edit page.

B1. Ingest: acquisition of content

Restructured B1 with sub-requirements

B1.1 Repository identifies properties or information content it will preserve for digital objects.

Supporting Text
The repository must explicitly define which properties (e.g. text layout) or information content (e.g. the units of measurement of data fields) must be preserved over the long term.

This is necessary in order to make it clear to funders, depositors and users what responsibilities the repository is taking on and what aspects are excluded. It is also a necessary step in defining the information which is needed from the information producers or depositors.

-- MarkConrad - 14 Mar 2008 There appears to be a word missing from the sentence above.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Mission statement; submission agreements/deposit agreements/deeds of gift; workflow and Preservation Policy documents, including written definition of properties as agreed in the deposit agreement/deed of gift; written processing procedures; documentation of properties to be preserved.

Discussion
This process begins in general with the repository's mission statement and may be further specified in pre-accessioning agreements with producers or depositors (e.g., producer-archive agreements) and made very specific in deposit or transfer agreements for specific digital objects and their related documentation. For example, one repository may only commit to preserving the textual content of a document and not its exact appearance on a screen. Another may wish to preserve the exact appearance and layout of textual documents, while others may choose to keep the units of the measurement of data fields and to normalize the data during the ingest process. If unique identifiers are associated with digital objects before ingest, they may also be significant properties that need to be preserved.

-- BarbaraSierman - 17 Mar 2008 Does this also include "original metadata", for example the unique identifier? If so, should this not be stated more explicitly in the Supporting Text? Or is the example of the unique identifier better placed in B1.2?

B1.2 Repository clearly specifies the information that needs to be associated with digital material at the time of its deposit (i.e., SIP).

Supporting Text
The repository must explicitly specify what information is needed from the content provider.

This is necessary in order to ensure that adequate information is collected in a timely way in order to produce an AIP, and in particular that the content provider is clear what must be provided. The explicit specification allows the workflow of the repository to be checked and validated.

-- MarkConrad - 14 Mar 2008 I do not understand the last sentence above. How does the specification of associated information allow one to validate the workflow of the repository?

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Transfer requirements; producer-archive agreements; workflow plans to produce the AIP.

Discussion
Note that the depositor may be a harvesting process created by the repository.

For most types of digital objects to be ingested, the repository should have written criteria, prepared by the repository on its own or in conjunction with other parties, that specify exactly what digital object(s) are transferred, what documentation is associated with the object(s), and any restrictions on access, whether technical, regulatory, or donor-imposed.

The level of precision in these specifications will vary with the nature of the repository's collection policy and its relationship with creators. For instance, repositories engaged in Web harvesting, or those that rescue digital materials long after their creators have abandoned them, cannot impose conditions on the creators of material, since they are not "depositors" in the usual sense of the word. But Web harvesters can, for instance, decide which metadata elements from the HTTP transactions that captured a site are to be preserved along with the site's files, and this still constitutes "information associated with the digital material." They may also choose to record the information or decisions (whether taken by humans or by automated algorithms) that led to the site being captured.

The repository can check what it receives from the producer based on the specifications.

B1.3 Repository has mechanisms to validate the source of all materials.

Supporting Text
The repository must ensure that the sources of the materials it intends to preserve are who/what they claim to be.

This is necessary in order to avoid providing erroneous provenance to the information which is preserved.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Submission agreements/deposit agreements/deeds of gift; workflow documents; evidence of appropriate technological measures; logs from procedures and authentications; legally binding submission agreements/deposit agreements/deeds of gift.

-- BarbaraSierman - 17 Mar 2008 How can a submission agreement demonstrate the source of the material? The logs are more convincing. There is a difference between the managerial decision to archive material (submission agreement) and the actual receipt of that material. I would assume that it is the last situation that is meant in B1.3.

Discussion
The repository's written standard operating procedures and actual practices must ensure the digital objects are obtained from the expected source, that the appropriate provenance has been maintained prior to submission, and that the objects are the expected objects. Confirmation can use various means including, but not limited to, digital processing and data verification and validation, and the exchange of an appropriate instrument of ownership (e.g., legally binding submission agreements/deposit agreement/deed of gift). Different repositories will adopt different levels of proof needed; the Designated Community should have the opportunity to review the evidence.

-- KatiaThomaz - 20 Mar 2008 - What does "source of all materials" really mean? A physical or corporate person responsible for issuing the materials?

B1.4 Repository’s ingest process verifies each submitted object (i.e., SIP) for completeness and correctness as specified in B1.2.

Supporting Text
The repository must verify, as far as it can, that each SIP is complete and correct.

This is necessary in order to detect and correct potential transmission errors between the depositor and the repository.

-- MarkConrad - 27 Mar 2008 I would remove the phrase, "such as network drop out or other corruption." This is only one potential type of transmission error. The discussion lists some others. Having this phrase in the "mandatory" section of the metric implies that we are only looking for this particular type of transmission error. -- JohnGarrett - 31 Mar 2008 I agree we should move it.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Evidence that the repository checks the information that needs to be associated with digital material at the time of its deposit against the SIP. Appropriate Preservation Policy and Preservation Implementation Plan documents and system log files from the system performing the ingest procedure; formal or informal "acquisitions register" of files received during the transfer and ingest process; workflow, documentation of standard operating procedures, detailed procedures; definition of completeness and correctness, probably incorporated in Preservation Policy documents.

-- JohnGarrett - 31 Mar 2008 I would move the policy and procedures items up in B1.2, which details what needs to be done. This is just the record keeping that the policies and procedures were done.

Discussion
Information collected during the ingest process must be compared with information from some other source (the producer or the repository's own expectations) to verify the correctness of the data transfer and ingest process. The extent to which a repository can determine correctness will depend on what it knows about the SIP and what tools are available for verifying correctness. It can mean simply checking that file formats are what they claim to be (TIFF files are valid TIFF format, for instance), or can imply checking the content. This might involve human checking in some cases, such as confirming that the description of a picture matches the image.
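As an illustration of the simplest kind of format check mentioned above, the following Python sketch tests whether a byte stream begins with a classic TIFF header. It is a minimal example under stated assumptions, not a validator: it inspects only the first four bytes (the byte-order mark followed by the magic number 42), and the function name `looks_like_tiff` is hypothetical. A repository would normally rely on a dedicated format identification or validation tool for anything beyond this.

```python
def looks_like_tiff(data: bytes) -> bool:
    """Return True if the bytes start with a classic TIFF header signature.

    Only the byte-order mark ("II" little-endian or "MM" big-endian) and the
    magic number 42 are checked; the rest of the file is not examined.
    """
    return data[:4] in (b"II*\x00", b"MM\x00*")
```

A file that fails this test is certainly not a classic TIFF; a file that passes may still be malformed further in, which is why deeper checks remain a matter for the repository's own definition of correctness.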

Repositories should have established procedures for handling incomplete SIPs. These can range from rejecting the transfer, to suspending processing until the missing information is received, to simply reporting the errors. Similarly, the definition of "completeness" should be appropriate to a repository's activities. If an inventory of files was provided by a producer as part of pre-ingest negotiations, one would expect checks to be carried out against that inventory. But for some activities such as Web harvesting, "complete" may simply mean "whatever we could capture in the harvest session." Whatever checks are carried out must be consistent with the repository's own documented definition and understanding of completeness and correctness.

One thing that a repository might want to do is check for network dropout or other corruption during the transmission process.
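The inventory check and the corruption check described above could be sketched as follows. This is an illustrative Python example, assuming the producer supplied an inventory that maps relative file names to SHA-256 digests; that inventory format, the function names `sha256_of` and `check_sip`, and the report keys are all hypothetical and not part of any requirement.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_sip(sip_dir: Path, inventory: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files received in sip_dir with the producer's inventory.

    inventory maps relative file names to expected SHA-256 digests
    (an assumed format, agreed during pre-ingest negotiations).
    """
    received = {
        p.relative_to(sip_dir).as_posix() for p in sip_dir.rglob("*") if p.is_file()
    }
    report = {
        "missing": sorted(set(inventory) - received),     # listed but not received
        "unexpected": sorted(received - set(inventory)),  # received but not listed
        "corrupt": [],                                    # received, digest differs
    }
    for name in sorted(received & set(inventory)):
        if sha256_of(sip_dir / name) != inventory[name]:
            report["corrupt"].append(name)
    return report
```

What the repository does with a non-empty report (reject the transfer, suspend processing, or simply record the errors) is a policy decision, as noted above.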

%NOTE% B1.2 does not specify everything about completeness - only what needs to accompany the deposited info. %ENDNOTE% -- JohnGarrett - 31 Mar 2008 Should this note be here or in B1.2?

| Bruce Ambacher - Does this function check the SIP against the metadata? The Evidence discussion includes other aspects beyond those stated. |

B1.5 Repository obtains sufficient physical control over the digital objects to preserve them.

-- BarbaraSierman - 29 Mar 2008 What is meant by physical control? In the Supporting text it says that this control should guard against 3rd parties (in)-actions, related to bit preservation of the object. I would assume that in this case examples like fixity measures would be given. Instead, the discussion part speaks about "reference digital object". In this case I thought of for example research papers with a link to other sources. This is more about "the intellectual entity" and of what parts this consists, is that what is meant by physical control? If so, we should leave out the reference to the bit preservation of the object, as this might be confusing. But maybe I understand it wrong, regarding the comments of Mark on the Discussion part.

-- JohnGarrett - 31 Mar 2008 I think the intention is more something like an email message with an attachment. A decision needs to be made if just the email message is the archived object or if it is the email message with the attachments. In that case, the archive might need to go to a separate directory and pick up the attachment also.

-- RobertDowns - 31 Mar 2008 Perhaps possession or physical possession would help to reduce the ambiguity about the term, physical control.

Supporting Text
The repository must have adequate physical control over the bits which make up the digital objects.

This is necessary in order to ensure that the most basic type of preservation, namely bit preservation, is assured. This guards against the possibility of loss through the actions or inactions of a third party. **Note: We might want to come back to this. (First discussed 3/29/08 meeting)

-- MarkConrad - 27 Mar 2008 I do not understand what is meant by the previous sentence.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Submission agreements/deposit agreements/deeds of gift; workflow documents; system log files from the system performing ingest procedures; logs of files captured during Web harvesting; agreements with third-party storage; documents showing the level of physical control the repository actually has; a separate database/metadata catalog listing all of the digital objects in the repository and metadata sufficient to validate the integrity of those objects (file size, checksum, hash, location, number of copies, etc.).
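A catalog entry of the kind listed above might look like the following Python sketch. It is only an illustration under stated assumptions: the class name `CatalogEntry`, its field names, and the choice of SHA-256 are hypothetical and are not prescribed by this requirement.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One catalog row for a digital object (all field names are illustrative)."""
    object_id: str
    size_bytes: int
    sha256: str
    locations: list[str] = field(default_factory=list)  # one entry per stored copy


def make_entry(object_id: str, content: bytes, locations: list[str]) -> CatalogEntry:
    """Record size and checksum at the moment the repository takes custody."""
    return CatalogEntry(
        object_id, len(content), hashlib.sha256(content).hexdigest(), list(locations)
    )


def copy_is_intact(entry: CatalogEntry, content: bytes) -> bool:
    """Check one stored copy against the size and checksum in the catalog."""
    return (
        len(content) == entry.size_bytes
        and hashlib.sha256(content).hexdigest() == entry.sha256
    )
```

The number of copies is simply the length of `locations`, and each copy can be checked independently of the others.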

Discussion
The repository must obtain complete control of the bits of the digital objects conveyed with each SIP. For example, some SIPs may only reference digital objects and in such cases the repository must get the referenced digital objects if they constitute part of the object that the repository has committed to conserve. It is not always the case that referenced digital objects are preserved: scholarly papers in a repository may contain references to other papers that are held in a different repository, or not held anywhere at all, and harvested Web sites may contain references to material in the same site or different sites that the repository has chosen not to capture or was unable to capture. For example, a decision needs to be made if just an email message is to be the preserved object or if it is the email message with the attachments. In the latter case, the repository might, for example, need to go to a separate directory and pick up the attachment also.

-- KatiaThomaz - 20 Mar 2008 - What does "physical control" really mean? Controlling the boundaries of a digital object? Is it physical or logical control?

| Bruce Ambacher - Katia's question is very pertinent. Under Examples we should consider adding the crossed out examples to "documents . . .such as ..." In the Discussion the insert should read "It is not always the case" |

-- MarkConrad - 27 Mar 2008 I agree with Katia and Bruce. We have not defined what is meant by physical control. Nor have we specified any of the means used to gain bit-level physical control as I understand that phrase (e.g., a separate database/metadata catalog listing all of the digital objects in the repository and metadata sufficient to validate the integrity of those objects (file size, checksum, hash, location, number of copies, etc.))

-- JohnGarrett - 31 Mar 2008 I agree we need more discussion of the physical control that is needed. As this checklist item is currently written, we mostly discuss determining the extent of the archived item and not what needs to be done once we know what the item to be archived is.

B1.6 Repository provides producer/depositor with appropriate responses at agreed points during the ingest processes.

Supporting Text
The repository must provide responses to the producer/depositor at agreed points.

This is necessary in order to ensure that the producer can verify that there are no inadvertent lapses in communications which might otherwise allow loss of SIPs.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Submission agreements/deposit agreements/deeds of gift; workflow documentation; standard operating procedures; evidence of "reporting back".

Discussion
Based on the initial processing plan and agreement between the repository and the producer/depositor (where it is appropriate to have such a plan), the repository must provide the producer/depositor with progress reports at agreed points throughout the ingest process. Responses can include initial ingest receipts, or receipts that confirm that the AIP has been created and stored. Repository responses can range from nothing at all to predetermined, periodic reports of the ingest completeness and correctness, error reports and any final transfer of custody document. Producers/Depositors can request further information on an ad hoc basis when the previously agreed upon reports are insufficient.

-- MarkConrad - 27 Mar 2008 If "[r]epository responses can range from nothing at all" (i.e., no response), why is this requirement included in the document?

-- BarbaraSierman - 29 Mar 2008 Mark, I thought this was because you are not always able to send a response to the producer, for example in case of websites. It would help if this was added to the phrase.

-- JohnGarrett - 31 Mar 2008 I think we should allow an agreement that no response is needed with the expectation that the producer could just check the archive holdings and determine if the sent items arrived, but in general will be an exception. Also I've thought of the archive itself or some department in the archive as the Producer for a web harvesting project. The Producer is the entity that is "producing" the SIP not necessarily the producer of the original data.

B1.7 Repository has written policies that indicate when it accepts preservation responsibility for the contents of each set of submitted data objects (i.e., SIPs).

Supporting Text
The repository must ensure that the point at which it has accepted responsibility for preservation of a digital object is clear to producers/depositors.

This is necessary in order to avoid misunderstandings between the repository and producer/depositor as to when this hand-over happens.

-- MarkConrad - 27 Mar 2008 I would suggest moving the phrase, "otherwise there is a risk that, for example, the original is erased before the repository has responsibility." to the Discussion section. It does not apply in all cases and should be included in the "mandatory" section. -- JohnGarrett - 31 Mar 2008 I agree this could be moved. Should also discuss the case that producer could change data and the changes wouldn't be caught if the producer didn't know when responsibility was passed.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Submission agreements/deposit agreements/deeds of gift; confirmation receipt sent back to producer/depositor.

Discussion

If this requirement is not met, there is a risk that, for example, the original is erased before the repository has taken responsibility for the submitted data objects.

Without the understanding that the repository has already taken preservation responsibility for the SIP, there is the risk that the producer/depositor may make changes to the data and that these changes would not be properly archived, since the data had already been ingested by the repository. For example, for convenience the repository could receive a copy of raw science data from the instrument at the same time the science team gets it, but the science team would have responsibility for it until they turn over responsibility to the final repository.

A key component of a repository's responsibility to gain sufficient control of digital objects is the point when the repository manages the bitstream. For some repositories this will occur when they first receive the SIP; for others it may not occur until the ingested SIP is transformed into an AIP. At this point, the repository formally accepts preservation responsibility for the digital objects from the producers/depositors.

-- MarkConrad - 27 Mar 2008 This paragraph appears to equate managing the bitstream with formally accepting preservation responsibility from the depositor. In my experience this is not always the case. A repository may take physical custody (i.e., manage the bitstream) before it accepts preservation responsibility for a set of digital objects.

-- BarbaraSierman - 29 Mar 2008 I agree with Mark, the difference between the two situations is important (managing the bitstream, which is the case before any check has occurred, and taking preservation responsibility, after various checks). It is important to be clear to the producer in this respect, as it also influences the procedures of handling rejected digital objects.

-- JohnGarrett - 31 Mar 2008 I think that starting off with physical control issue is confusing. It needs to be part of the decision, but may not be the point where preservation responsibility is passed. For example, for convenience the repository could receive a copy of raw science data from the instrument at same time the science team gets it, but science team would have responsibility for it until they turn over responsibility to the final repository.

Repositories that report back to their depositors generally will mark this acceptance with some form of notification to the depositor. (This may depend on repository responsibilities as designated in the depositor agreement.) A repository may mark the transfer by sending a formal document, often a final signed copy of the transfer agreement, back to the depositor signifying the completion of the SIP-to-AIP transformation process. Other approaches are equally acceptable. Brief daily updates may be generated by a repository that only provides annual formal transfer reports.

| Bruce Ambacher - For uniformity all references to depositors should be "producers/depositors" in Supporting Text and Discussion. Why were "confirmation receipts" eliminated? They seem to be a most pertinent type of notification. |

- RobertDowns - 07 Apr 2008 - We might insert the word "accepted" before "responsibility" at the end of the first sentence under Discussion, as well as correcting the spelling of the word "requirement" in that same sentence.

B1.8 Repository has contemporaneous records of actions and administration processes that are relevant to preservation (Ingest: content acquisition).

Supporting Text
The repository must document relevant events as they happen.

This is necessary to ensure that documentation which may be needed in an audit is captured and is accurate.

-- BarbaraSierman - 29 Mar 2008 This sentence might need some reordering as it is difficult to understand at first sight. But I'm not a native speaker in this language.... -- JohnGarrett - 31 Mar 2008 How about rewriting it as "This is necessary to ensure that documentation which may be needed in an audit is captured and is accurate."

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Written documentation of decisions and/or actions taken; preservation metadata logged, stored, and linked to pertinent digital objects; confirmation receipts sent back to providers.

Discussion
These records must be created on or about the time of the actions they refer to and are related to actions taken during the Ingest: content acquisition process. The records may be automated or may be written by individuals, depending on the nature of the actions described. Where community or international standards are used, such as PREMIS (2005), the repository must demonstrate that all relevant actions are carried through.
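An automated record of this kind could be produced along the following lines. This is a minimal Python sketch, assuming a simple one-JSON-document-per-line log; the function name `record_event` and the field names are hypothetical and do not follow any particular metadata standard. The point it illustrates is that the timestamp is taken when the record is written, not reconstructed afterwards.

```python
import json
from datetime import datetime, timezone


def record_event(log: list[str], object_id: str, action: str,
                 outcome: str, agent: str) -> dict:
    """Append one event record, timestamped at the moment it is written."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC, with offset
        "object_id": object_id,
        "action": action,
        "outcome": outcome,
        "agent": agent,  # the person or system that performed the action
    }
    log.append(json.dumps(event, sort_keys=True))  # one JSON document per entry
    return event
```

In practice the log would be written to storage that is itself protected against later alteration; the in-memory list here only keeps the example self-contained.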

-- MarkConrad - 27 Mar 2008 I would remove the reference to PREMIS. PREMIS is not a standard. PREMIS often does not list what the "relevant action" should be.

B2. Ingest: creation of the AIP

Old section B2

B2.1 Repository has an associated, written definition for each AIP or class of AIPs preserved by the repository that is adequate to fit long-term preservation needs.

Supporting Text

B.2.1.1. The repository must have an associated, written definition for each AIP or class of AIPs preserved by the repository.

This is necessary to ensure that the AIP and its associated definition can always be found and managed within the archive.

B.2.1.2. It must be possible to determine which definition applies to which AIP.

This is necessary to ensure each AIP can be properly parsed/interpreted.

B.2.1.3. The repository must be able to demonstrate that the definition of each AIP is adequate for long term preservation by demonstrating that it has all the required components, each of which can be maintained over time. An AIP contains these key components: the primary data object to be preserved, its supporting Representation Information (format and meaning of the format elements), and the various categories of Preservation Description Information (PDI) that also need to be associated with the primary data object: Fixity, Provenance, Context, and Reference. There should be a definition of how these categories of information are linked.

This is necessary in order to explicitly show that the AIPs are fit for purpose, that each component of an AIP has been adequately thought through and the plans for the maintenance of each AIP are in place. (See B.3 Preservation planning, below)
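The component list in B.2.1.3 lends itself to a simple mechanical check. The Python sketch below assumes, purely for illustration, that an AIP definition is held as a dictionary with the keys `data_object`, `representation_information` and `pdi`; those key names and the function `missing_components` are hypothetical. It shows only that every named component is present, not that each is adequate.

```python
REQUIRED_PDI = ("fixity", "provenance", "context", "reference")


def missing_components(aip_definition: dict) -> list[str]:
    """List the components named in B.2.1.3 that a definition leaves out."""
    missing = []
    for key in ("data_object", "representation_information"):
        if not aip_definition.get(key):
            missing.append(key)
    pdi = aip_definition.get("pdi", {})
    # Each PDI category must be present and non-empty.
    missing.extend(f"pdi.{name}" for name in REQUIRED_PDI if not pdi.get(name))
    return missing
```

An empty result means the definition names all of the components; whether each can be maintained over time still has to be shown in the documentation itself.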

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Documentation identifying each class of AIP and describing how each is implemented within the repository. Implementations may, for example, involve some combination of files, databases, and/or documents. Documentation that relates the AIP components’ contents to the related preservation needs of the repository, with enough detail for the repository's providers and consumers to be confident that the significant properties of AIPs will be preserved. It should be clear how AIP components such as Representation Information and Provenance can be managed and kept up to date. It should be clear when new versions of AIPs need to be created in order to keep them fit for purpose. The external dependencies of the AIP should also be recorded.

Discussion
It is necessary that definitions exist for each AIP, or class of AIP if there are many instances of the same type. Repositories that store a wide variety of object types may need a specific definition for each AIP they hold, but it is expected that most repositories will establish class descriptions that apply to many AIPs. It must be possible to determine which definition applies to which AIP. It may also be necessary for the definitions to say something about the semantics or intended use of the AIPs if this could affect long-term preservation decisions. For example, say two repositories both only preserve digital still images, both using multi-image TIFF files as their preservation format. Repository 1 consists entirely of real-world photographic images intended for viewing by people and has a single definition covering all of its AIPs. (The definition may refer to a local or external definition of the TIFF format.) Repository 2 contains some images, such as medical x-rays, that are intended for computer analysis rather than viewing by the human eye, and other images that are like those in Repository 1. Repository 2 should perhaps define two classes of AIPs, even though it only uses one storage format for both. A future preservation action may depend on the intended use of the image—an action that changes the bit-depth of the image in a way that is not perceivable to the human eye may be satisfactory for real-world photographs but not for medical images, for example.

B2.2 Removed (combined with B.2.1.)

B2.3 Repository has a description of how AIPs are constructed from SIPs.

Supporting Text

B.2.3.1. The repository must be able to show how the AIP(s) is constructed from the SIP(s).

This is necessary in order to ensure that the AIP(s) adequately represents the information in the SIP(s).

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Process description documents; documentation of SIP relationship to AIP; clear documentation of how AIPs are derived from SIPs; documentation of standard/process against which normalization occurs; documentation of normalization outcome and how outcome is different from SIP.

Discussion
In some cases, the AIP and SIP will be almost identical apart from packaging and location, and the repository need only state this. In other cases, complex transformations (e.g., data normalization) may be applied to objects during the ingest process, and a precise description of these actions may be necessary to reflect how the AIP(s) has been adequately transformed from the information in the SIP(s).

The AIP construction description should include documentation that gives a detailed description of the ingest process for each SIP to AIP transformation, typically consisting of an overview of general processing being applied to all such transformations, augmented with description of different classes of such processing and, when applicable, with special transformations that were needed.

Some repositories may need to produce these complex descriptions case by case, in which case diaries or logs of actions taken to produce each AIP will be needed. In these cases, documentation needs to be mapped to individual AIPs, and the mapping needs to be available for examination. Other repositories that can run a more production-line approach may have a description for how each class of incoming object is transformed to produce the AIP. It must be clear which definition applies to which AIP. If, to take a simple example, two separate processes each produce a TIFF file, it must be clear which process was applied to produce a particular TIFF file.

B2.4 Repository can demonstrate that all SIPs are either accepted as whole or part of an eventual AIP or otherwise disposed of in a recorded fashion.

Supporting Text

B.2.4.1. The repository must be able to show that each SIP has either been used in creating one or more AIPs or else has been discarded (and if so why).

B.2.4.2. If an SIP is discarded the repository must create a record of that activity that indicates why the SIP was discarded.

This is necessary in order to ensure that the SIPs received have been dealt with appropriately, and in particular have not been accidentally lost.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
System processing files; disposal records; donor or depositor agreements/deeds of gift; provenance tracking system; system log files. Process description documents; documentation of SIP relationship to AIP; clear documentation of how AIPs are derived from SIPs; documentation of standard/process against which normalization occurs; documentation of normalization outcome and how outcome is different from SIP.

Discussion
The timescale of this process will vary between repositories from seconds to many months, but SIPs must not remain in a limbo-like state forever. The accessioning procedures and the internal processing and audit logs should maintain records of all internal transformations of SIPs to demonstrate that they either become AIPs (or part of AIPs) or are disposed of. Appropriate descriptive information should also document the provenance of all digital objects.
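One way to demonstrate that no SIP remains in limbo is to reconcile the list of received SIPs against the accession and disposal records. The Python sketch below assumes three hypothetical inputs: the set of received SIP identifiers, a mapping from SIP to the AIPs built from it, and a mapping from discarded SIP to the recorded reason. The function name `unresolved_sips` is likewise illustrative.

```python
def unresolved_sips(received: set[str],
                    sip_to_aips: dict[str, list[str]],
                    disposal_reasons: dict[str, str]) -> list[str]:
    """Return SIPs that have neither produced an AIP nor been discarded
    with a recorded reason."""
    # A SIP counts as accepted only if at least one AIP was built from it.
    resolved = {sip for sip, aips in sip_to_aips.items() if aips}
    # A SIP counts as disposed of only if a non-empty reason was recorded.
    resolved |= {sip for sip, reason in disposal_reasons.items() if reason.strip()}
    return sorted(received - resolved)
```

How long a SIP may legitimately appear in this list depends on the repository's own timescales, which range from seconds to many months.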

B2.5 Repository has and uses a naming convention that generates visible, persistent, unique identifiers for all AIPs.

Supporting Text

B.2.5.1. The repository must be able to show how any AIP can be uniquely identified.

B.2.5.1.1. It must be possible to demonstrate that the identifiers are unique.
B.2.5.1.2. Documentation must show how the persistent identifiers of the AIP and its components are assigned and maintained so as to be unique within the context of the repository.
B.2.5.1.3. The documentation must also describe any processes used for changes to such identifiers.
B.2.5.1.4. It must be possible to obtain a complete list of all such identifiers and do spot checks for duplications.
B.2.5.1.5. The ID system must be seen to fit the repository’s current and foreseeable future requirements for things like numbers of objects.
This is necessary in order to ensure that each AIP can be unambiguously found in the future. This is also necessary to ensure that each AIP can be distinguished from all other AIPs in the repository.
B.2.5.2. The repository must have a system of reliable linking/resolution services in order to find the uniquely named object, no matter its physical location.
This is so that actions relating to AIPs can be traced over time, over system changes, and over storage changes.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Documentation describing naming convention and physical evidence of its application (e.g., logs)

Discussion
A repository needs to ensure that an accepted, standard naming convention is in place that identifies its materials uniquely and persistently for use both in and outside the repository. The “visibility” requirement here means “visible” to repository managers and auditors. It does not imply that these unique identifiers need to be visible to end users or that they serve as the primary means of access to digital objects. Ideally, the unique ID lives as long as the AIP; if it does not, there must be traceability. Note that B2.1 requires that the components of an AIP be suitably bound and identified for long-term management, but places no restrictions on how AIPs are identified with files. Thus, in the general case, an AIP may be distributed over many files, or a single file may contain more than one AIP. Therefore identifiers and filenames may not necessarily correspond to each other. Documentation must represent these relationships.
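
As an illustration of the spot checks for duplicates mentioned in B.2.5.1.4, and of the point that identifiers and filenames need not correspond, the following is a minimal sketch only. The identifier scheme (`aip:0001`), the file paths and the function names are hypothetical and are not taken from any particular repository system.

```python
from collections import Counter

def find_duplicate_ids(identifiers):
    """Return the identifiers that occur more than once, sorted."""
    counts = Counter(identifiers)
    return sorted(i for i, n in counts.items() if n > 1)

def unmapped_ids(identifiers, id_to_files):
    """Return identifiers that have no documented mapping to files."""
    return sorted(i for i in set(identifiers) if not id_to_files.get(i))

# Hypothetical inventory: the complete list of AIP identifiers and the
# documented relationship between each identifier and the files holding it.
aip_ids = ["aip:0001", "aip:0002", "aip:0003", "aip:0002"]
id_to_files = {
    "aip:0001": ["store/a.tar", "store/b.tar"],  # one AIP spread over two files
    "aip:0002": ["store/c.tar"],
}

duplicates = find_duplicate_ids(aip_ids)   # identifiers that are not unique
missing = unmapped_ids(aip_ids, id_to_files)  # identifiers with no file mapping
```

An auditor could run a check of this kind against the full identifier list, or against a random sample of it, and compare the result with the repository's documentation.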

B2.6 (Removed)

B2.7 Repository demonstrates that it has access to necessary tools and resources to provide authoritative Representation Information for all of the digital objects it contains.

Supporting Text

B.2.7.1. The repository must have tools or methods to identify the file type of all submitted Data Objects.

B.2.7.2. The repository must have tools or methods to determine what Representation Information is necessary to make each Data Object understandable to the Designated Community(ies).

B.2.7.3. The repository must ensure that it has access to the requisite Representation Information.

B.2.7.4. The repository must have tools or methods to ensure that the requisite Representation Information is persistently associated with the relevant Data Objects.

This is necessary in order to ensure that the repository’s digital objects are understandable to the Designated Community(ies).

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Subscription or access to such registries; association of unique identifiers to registries of Representation Information (including format registries); Viewable records in local registries (with persistent links to digital objects); database records that include Representation Information and a persistent link to relevant digital objects.

Discussion
These tools and resources can be held internally or can be shared via, for example, a trusted set of registries. However, this requirement does not demand that each repository has such tools and resources, merely that it has access to them. The Global Digital Format Registry (GDFR), the UK National Archives’ file format registry PRONOM, and the UK Digital Curation Centre’s Registry Repository of Representation Information (RRORI) are three emerging examples of external registries a repository might adopt. Any such registry is a specialised type of repository, which itself must be certified/trusted. The repository may use these types of standardized, authoritative information sources to identify and/or verify the Representation Information components of Content Information and PDI. This will reduce the long-term maintenance costs to the repository and improve quality control. Sometimes there is both general Representation Information (e.g., format information) and specific Representation Information (e.g., meanings of individual fields within a dataset). Often the general information will be available in an external repository, but the local repository may need to maintain the instance-specific information. It is likely that many repositories would wish to keep local copies of relevant Representation Information; however, this may not be practical in all cases. Even where a repository strives to keep all such information locally there may be, for example, a schedule of updates, which means that until an update is performed the local Representation Information is incomplete. This may be regarded as a kind of local caching of, for example, the Representation Information held in registries. Alternatively, one may say that in these cases the use of international registries is not meant to replace local registries but instead to serve as a resource to verify or obtain independent, authoritative information about any and all Representation Information.
Good practice suggests that any locally held Representation Information should also be made available to other repositories via a trusted registry. In addition, any item of Representation Information should itself have adequate Representation Information to ensure that the Designated Community can understand and use the data object being preserved.
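
As a rough illustration of a tool for B.2.7.1, the sketch below identifies a file type from the leading bytes of a Data Object. The byte signatures shown for PDF, PNG, GIF and ZIP are the well-known ones, but the table is deliberately tiny and the function name is hypothetical. It is not a substitute for a format registry, only an indication of the kind of check such a tool performs.

```python
# Illustrative signature table: (leading bytes, format name).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container"),
]

def identify_format(data: bytes) -> str:
    """Return a format name based on the leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"

# An object reported as "unknown" is one for which the repository has not
# yet identified the Representation Information it needs.
sample = identify_format(b"%PDF-1.4\n...")
```

A result of "unknown" would be the trigger for the repository to consult an external registry or to create the missing Representation Information itself.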

B2.8 (Removed)

B2.9 Repository has documented processes for acquiring Preservation Description Information (PDI) for its associated Content Information and acquires PDI in accordance with the documented processes.

Supporting Text

B.2.9.1. The repository has documented processes for acquiring PDI.

B.2.9.2. The repository must execute its documented processes for acquiring PDI.

B.2.9.3. The repository must ensure that the PDI is persistently associated with the relevant Content Information.

This is necessary in order to ensure that an auditable trail to support claims of authenticity is available, that unauthorized changes to the digital holdings can be detected, and that the digital objects can be identified and placed in their appropriate context.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Standard operating procedures; manuals describing ingest procedures; viewable documentation on how the repository acquires and manages Preservation Description Information (PDI); creation of checksums or digests; consultation with the Designated Community about Context.

Discussion
PDI is needed not only by the repository to help ensure the Content Information is not corrupted (Fixity) and is findable (Reference Information), but to help ensure the Content Information is adequately understandable by providing a historical perspective (Provenance Information) and by providing relationships to other information (Context Information). The extent of such information needs is best addressed by members of the designated community(ies). The PDI must be permanently associated with Content Information.
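
The four kinds of PDI named above can be pictured as a simple record that is created at ingest and kept with the Content Information. The following is a sketch under stated assumptions: the class and field names are hypothetical, and SHA-256 is chosen here only as one example of a digest; the requirement itself does not prescribe an algorithm.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class PDI:
    reference: str                 # Reference Information: how the content is found
    fixity_sha256: str             # Fixity: digest of the Content Information
    provenance: list = field(default_factory=list)  # history of custody and actions
    context: list = field(default_factory=list)     # relationships to other information

def build_pdi(identifier: str, content: bytes, source: str) -> PDI:
    """Create a PDI record at the time the content is received."""
    digest = hashlib.sha256(content).hexdigest()
    return PDI(
        reference=identifier,
        fixity_sha256=digest,
        provenance=[f"received from {source}"],
    )

def fixity_ok(pdi: PDI, content: bytes) -> bool:
    """Check that the content still matches the recorded digest."""
    return hashlib.sha256(content).hexdigest() == pdi.fixity_sha256
```

For example, `build_pdi("aip:0001", data, "producer X")` would give a record against which `fixity_ok` can later detect any change to `data`.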

B2.10 Repository ensures that AIPs are understandable for their Designated Communities at the time of ingest.

Supporting Text

B.2.10.1. Repository has a documented process for testing understandability of the AIP at ingest for their Designated Communities.

B.2.10.2. The repository must execute the testing process for each AIP at ingest.

B.2.10.3. If the AIP fails the understandability testing, the repository must bring the AIP up to the agreed level of understandability at ingest.

This is necessary in order to ensure that one of the primary tests of preservation, namely that the digital holdings are understandable by their Designated Communities, can be met. See B.3.x. for additional requirements for understandability beyond ingest.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Test procedures to be run against the digital holdings to ensure their understandability to the defined Designated Communities and their knowledge bases; records of such tests being performed and evaluated; evidence of gathering or identifying Representation Information to fill any intelligibility gaps which have been found; Retention of individuals with the discipline expertise.

Discussion
This requirement is concerned with the understandability of the AIP. If the ingested material is not understandable, the repository needs to ingest or make available additional information to make sure that the AIPs are understandable to the Designated Community(ies). For example, if documents are written in a dying language and the Designated Communities are no longer able to understand the language the documents are written in, the repository would need to provide additional documentation that would allow the Designated Communities to understand the documents (e.g., translations of the documents in a language the Designated Communities could understand, or dictionaries that would allow the Designated Communities to translate the documents into a language their members understand).

B2.11 Repository verifies each AIP for completeness and correctness at the point it is generated.

Supporting Text
The repository must be sure that the AIPs it generates are as they are expected to be by checking them against the associated written definition for each AIP or class of AIP (see B2.1) and the description of how AIPs are constructed from SIPs (see B.2.3.). This is necessary in order to ensure that what is maintained over the long term is as it should be and can be traced to the information provided by the Producers.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Description of the procedure that verifies completeness and correctness of the AIPs; logs of the procedure.

Discussion
If the repository has a standard process to verify SIPs for both completeness and correctness and a demonstrably correct process for transforming SIPs into AIPs (see B2.3), then it simply needs to demonstrate that the initial checks were carried out successfully and that the transformation process was carried out without indicating errors. On the other hand, repositories that must create unique processes for many of their AIPs will also need to generate unique methods for validating the completeness and correctness of AIPs. This may include performing tests of some sort on the content of the AIP that can be compared with tests on the SIP. Such tests might be simple (counting the number of records in a file, or performing some simple statistical measure), but they might be complex. Documentation should describe how completeness and correctness of AIPs are ensured, starting with receipt from the producer and continuing through AIP creation and supporting long-term preservation. Example approaches include the use of checksums, testing that checksums are still correct at various points during ingest and preservation, logs that such checks have been made, and any special tests that may be required for a particular AIP instance or class.
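
The simple tests mentioned above (counting records, comparing a measure taken on the SIP with the same measure taken on the AIP) might look like the sketch below. It assumes, purely for illustration, a text file whose only transformation at ingest is normalization of line endings; the function names are hypothetical.

```python
import hashlib

def record_count(text: str) -> int:
    """Number of records (lines) in a text file."""
    return len(text.splitlines())

def content_digest(text: str) -> str:
    """Digest over the records with line endings normalized, so that the
    documented normalization does not itself count as a change."""
    normalized = "\n".join(text.splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def verify_aip(sip_text: str, aip_text: str) -> dict:
    """Compare an AIP against the SIP it was derived from."""
    return {
        "complete": record_count(sip_text) == record_count(aip_text),
        "correct": content_digest(sip_text) == content_digest(aip_text),
    }

# Hypothetical data: a SIP with CRLF line endings and two candidate AIPs.
sip = "id,value\r\n1,3.5\r\n2,4.0\r\n"
good_aip = "id,value\n1,3.5\n2,4.0\n"
bad_aip = "id,value\n1,3.5\n"   # a record was lost during transformation
```

The result of each such check, together with the time it was made, would be the kind of log entry an auditor could ask to see.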

B2.12 Repository provides an independent mechanism for inventorying the integrity of the repository collection/content.

Supporting Text
The repository must provide a way to independently demonstrate the completeness and correctness of its collections and their contents. This is necessary to enable the audit of the integrity of the collection as a whole.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Documentation provided for B2.1 through B2.5; documented agreements negotiated between the producer and the repository (see B 1.1-B1.9); logs of material received and associated action (receipt, action, etc.) dates; logs of periodic checks.

Discussion
In general, it is likely that a repository that meets all the previous criteria will satisfy this one without needing to demonstrate anything more. As a separate requirement, it demonstrates the importance of being able to audit the integrity of the collection as a whole. For example, if a repository claims to have all e-mail sent or received by The Yoyodyne Corporation between 1985 and 2005, it has been required to show that:

  • The content it holds came from Yoyodyne’s e-mail servers.
  • It is all correctly transformed into a preservation format.
  • Each monthly SIP of e-mail has been correctly preserved, including original unique identifiers such as Message-IDs.
However it may still have no way of showing whether this really represents all of Yoyodyne's email. For example, if there is a three-day period with no messages in the repository, is this because Yoyodyne was shut down for those three days, or was the e-mail lost before the SIP was constructed? This case could be resolved by the repository amending its description of the collection, but other cases may not be so straightforward. A familiar mechanism from the world of traditional materials in libraries and archives is an accessions or acquisitions register that is independent of other catalog metadata. A repository should be able to show, for each item in its accessions register, which AIP(s) contain content from that item. Alternatively, it may need to show that there is no AIP for an item, either because ingest is still in progress, or because the item was rejected for some reason. Conversely, any AIP should be able to be related to an entry in the acquisitions register.
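
The two-way check between an accessions register and the AIPs can be sketched as follows. This is an illustration only: the register statuses, the identifiers and the function name are assumptions made for the example.

```python
def reconcile(register: dict, aips: dict):
    """Cross-check an accessions register against the AIPs held.

    register: accession id -> status ("ingested", "in_progress", "rejected")
    aips:     AIP id -> list of accession ids whose content the AIP contains
    """
    covered = {acc for sources in aips.values() for acc in sources}
    # Items recorded as ingested for which no AIP can be shown.
    items_without_aip = sorted(
        acc for acc, status in register.items()
        if status == "ingested" and acc not in covered
    )
    # AIPs that cannot be related to an entry in the register.
    orphan_aips = sorted(
        aip for aip, sources in aips.items()
        if not sources or any(acc not in register for acc in sources)
    )
    return items_without_aip, orphan_aips

# Hypothetical register and holdings.
register = {
    "acc-1": "ingested",
    "acc-2": "ingested",
    "acc-3": "rejected",      # no AIP expected
    "acc-4": "in_progress",   # no AIP expected yet
}
aips = {"aip:0001": ["acc-1"], "aip:0002": ["acc-9"]}

items_without_aip, orphan_aips = reconcile(register, aips)
```

In this example "acc-2" is reported because it is marked as ingested but appears in no AIP, and "aip:0002" is reported because it refers to an accession that is not in the register.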

B2.13 Repository has contemporaneous records of actions and administration processes that are relevant to AIP creation.

Supporting Text
The repository must create records of its preservation-related activities essentially as they happen. This is necessary in order to be sure that nothing relevant is omitted from the record that might be needed to provide an independent means to verify that all AIPs have been properly created in accord with the documented procedures (see B.2.1. through B.2.12).

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
Written documentation of decisions and/or action taken; preservation metadata logged, stored, and linked to pertinent digital objects.

Discussion
These records must be created on or about the time of the actions they refer to and are related to actions associated with AIP creation. The records may be automated or may be written by individuals, depending on the nature of the actions described. Where community or international standards are used, the repository must demonstrate that all relevant actions are carried through.
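
Where such records are automated, one possible shape is an append-only log in which each entry is time-stamped at the moment the action is taken. The sketch below is an assumption about how this might be done, not a description of any existing system; the field names and the in-memory list standing in for the log are hypothetical.

```python
import json
from datetime import datetime, timezone

def record_event(log: list, aip_id: str, action: str, agent: str) -> dict:
    """Append a time-stamped record of an action at the time it occurs."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "aip_id": aip_id,
        "action": action,
        "agent": agent,
    }
    # Entries are serialized and appended; existing entries are never edited.
    log.append(json.dumps(entry, sort_keys=True))
    return entry

def events_for(log: list, aip_id: str) -> list:
    """Return all recorded events for one AIP, in the order they were logged."""
    return [e for e in map(json.loads, log) if e["aip_id"] == aip_id]
```

Calling `record_event` from within each ingest step, rather than reconstructing the record afterwards, is what makes the record contemporaneous.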

Is this a complete list of all the mandatory requirements for creation of AIPs?

B3. Preservation planning

-- BruceAmbacher - 11 May 2008 Mark references the introductory text for B3 but it is not here. Do we have a mechanism to be sure it gets inserted in the final draft? Should it be reviewed at the same time we review the requirements

-- JohnGarrett - 19 May 2008 - We will need to review the introductory text, but I would prefer to make the requirements capable of standing on their own. I would also like to make as much progress quickly. So I would like to complete the review of the requirements and then come back and review/add the additional text.

B3.1 Repository has documented preservation strategies relevant to its holdings.

-- BarbaraSierman - 19 May 2008 "relevant to the contents of its archive" or something like that.

-- DavidGiaretta - 19 May 2008 - agree

Supporting Text

The repository must have current, sound, and documented preservation strategies.

This is necessary in order to make the information available and usable for future generations and to provide a means to check and validate the preservation work of the repository.

-- MarkConrad - 04 Feb 2008 I am not sure about this second sentence. Appendix 5 of the TRAC document seems to lay out many sub -requirements that are not in this section. Both the introductory text for B.3. and Appendix 5 hint at why preservation strategies are necessary, but they do not explicitly state why.

-- JohnGarrett - 19 May 2008 - I don't think we should incorporate extra requirements just by referencing them in an annex. If there are requirements there that are not included here, I would like to add extra requirements.

-- JohnGarrett - 19 May 2008 - I think the maintaining is intended to be covered in B4. I think here in B3, we are trying to go beyond just maintaining. I think the intention is that we know that things change over time and we will need to make changes to our holdings over time - either by changing them or adding additional information or emulation layers - so future users can still make use of them. But "made available" may sound too much like access, but I think the intention is that it remains understandable to the future Designated Community.

-- BarbaraSierman - 19 May 2008 As digital preservation is both storage and access, I agree with Johns comment

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Documentation identifying each preservation issue and the strategy for dealing with that issue.

Discussion

These preservation strategies will typically address the degradation of storage media, the obsolescence of media drives, the obsolescence of Representation Information (including formats), and safeguards against accidental or intentional digital corruption. For example, if migration is the chosen approach to some of these issues, there also need to be Preservation Implementation Plans on what triggers a migration and what types of migration are expected for the solution of each preservation issue identified.

-- MarieWaltz - 06 May 2008 If we keep Appendix B, it may be better to just refer them to that. (Media drives? Do we still use those?)

-- BruceAmbacher - 11 May 2008 Yes media drives are still used for both magnetic tape and optical media.

B3.2 Repository has mechanisms in place for monitoring and notification when Representation Information is inadequate for the Designated Community(ies) to understand the data holdings, and mechanisms for creating or identifying or gathering any extra Representation Information required.

Repository has mechanisms in place for monitoring and notification when Representation Information (including formats) approaches obsolescence or is no longer viable.

-- JohnGarrett - 19 May 2008 - I think formats are bit different than Representation Information. Formats seem to me to be part Representation Information, but also partly packaging of the Content Information. Overall this requirement seems to be much more concerned with formats becoming obsolete rather than needing to supplement Representation Information to still be able to understand the Content Information.

-- DavidGiaretta - 19 May 2008 - yes, this is too focussed on obsolescence and formats. Also needs to be a bit more pro-active than "notification". It should perhaps be something like
"Repository has mechanisms in place for monitoring and notification when Representation Information network is inadequate for the Designated Community(ies) to understand the data holdings, and mechanisms for creating or identifying or gathering any extra Representation Information required."

Supporting Text

The repository must show that it has some active mechanism to ensure that the preserved information remains understandable and usable by the designated community(ies).

The repository must show that it has some active mechanism to warn of impending obsolescence. Obsolescence is determined largely in terms of the knowledge base of the designated community(ies).

-- DavidGiaretta - 19 May 2008 -suggest "changes. These changes are defined " instead of "obsolescence. Obsolescence is determined "

This is necessary in order to ensure that the preserved information remains understandable and usable by the designated community(ies).

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Percentage of at least one staff member dedicated to monitoring technological obsolescence issues and practices of the designated community; subscription to a format registry service; subscription to a technology watch service.

-- BarbaraSierman - 19 May 2008 The subscriptions are not really possible at the moment... I would suggest to formulate it less strict to one staffmember, ofcourse an institution should organize this preservation watch, but it is up to the institution itself how many staff members it will appoint.

-- DavidGiaretta - 19 May 2008 - yes should be less strict. Agree that subscription services are not yet available but we should anticipate such services.

Discussion

For most repositories, the concern will be with the Representation Information (including formats) used to preserve information, which may include information on how to deal with a file format or software that can be used to render or process it. Sometimes the format needs to change because the repository can no longer deal with it. Sometimes the format is retained and the information about what software is needed to process it needs to change. If the mechanism depends on an external registry, the repository must demonstrate how it uses the information from that registry.

-- DavidGiaretta - 19 May 2008 - instead of "including formats" it might be better to say "formats are an example of Structural Rep. Info."

B3.3 Repository has mechanisms to change its preservation plans as a result of its monitoring activities.

Supporting Text

The repository must demonstrate or describe how it reacts to information from monitoring, which sometimes requires a repository to change how it deals with the material it holds in ways that could not have been anticipated at an earlier stage.

This is necessary in order for the repository to be prepared for changes in the external environment that may make its current preservation plans a bad choice as the time to implement draws near.

-- BarbaraSierman - 19 May 2008 The last part of the sentence seems somewhat strange to me, may be we could replace it by something like "that makes updating the current preservation plans necessary".

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Preservation Strategic Plans tied to formal or informal technology watch(es); preservation planning or processes that are timed to shorter intervals (e.g., not more than five years); proof of frequent updates to Preservation Policies and Preservation Strategic Plans; sections of Preservation Policies that address how plans may be updated and that address how often the plans are required to be reviewed and reaffirmed or updated.

-- JohnGarrett - 19 May 2008 - I think these are mostly ways of recognizing that updates are needed (seems more appropriate to B3.2) or that updates have happened (maybe B3.4). I would say something like "sections of preservation planning policies that address how policies may be updated and that address how often the policies are required to be reviewed and reaffirmed or updated."

-- BarbaraSierman - 19 May 2008 It is too early now, but a later possibility could be to refer to accepted practices (to be found for example in external registries)

Discussion

Plans as simple as migrating from format X to format Y when the registries show that format X is no longer supported are not sufficiently flexible; other events may have made format Y a bad choice. The repository must periodically review its preservation plans and the technology environment and, if necessary, make changes to those plans to ensure their continued effectiveness. Another possible response to information gathered by monitoring is for the repository to create additional Representation Information and/or PDI.

-- JohnGarrett - 19 May 2008 - I'm not really sure what point the first sentence here is making related to this requirement.

B3.4 Repository can provide evidence of the effectiveness of its preservation activities.

Supporting Text

The repository must be able to demonstrate the continued preservation, including understandability, of its holdings over a number of years, given the age of the repository and its holdings.

This is necessary in order to assure the Designated Community(ies) that the repository will be able to make the information available and usable over the mid-to-long-term.

-- MarkConrad - 04 Feb 2008 Is the above statement even close to the original intention of the authors of the TRAC document? The "why" is not explicitly stated in the text as it is currently written.
-- DavidGiaretta - 30 Jun 2008 "given the age..." perhaps should be "which will increase with the age..." since the length of time one should be able to go back and retrieve and check the usability of the info will increase over time.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Collection of appropriate preservation metadata; proof of usability of randomly selected digital objects held within the system; demonstrable track record for retaining usable digital objects over time; Designated Community polls.

Discussion

This could be evaluated at a number of degrees and depends on the specificity of the Designated Community(ies). If a Designated Community is fairly broad, an auditor could represent the test subject in the evaluation. More specific Designated Communities could require significant efforts. If judgment must be exercised as to whether adequate efforts have been made, it must be justified in detail.

-- JohnGarrett - 19 May 2008 - I think judgments will always need to be made. Is there some specific requirement intended here?

-- BarbaraSierman - 19 May 2008 Is it only the Designated Community that can judge the effectiveness of the preservation plans? The organization itself should also have policies that should be tested (loss of information as a result of migrations etc.)

-- MarkConrad - 25 Jun 2008 We still need to go over David's cross-walk of Appendix 5 of the TRAC document and the requirements in B.3. to ensure that we have captured all of the requirements that should be listed in this section.

-- MarkConrad - 26 Jun 2008 In B.2.10. we inserted a reference to a B.3.x as a needed requirement to continue to test the understandability of AIPs beyond ingest. Does B.3.2. adequately meet this need or do we need another requirement in B.3?

- JohnGarrett - 07 Jul 2008 - Following text was originally from B4.2, but today's meeting determined that it should be included in B3.

These documented preservation strategies must include evidence of planning for strategies not yet employed against the repository’s digital objects.
-- DavidGiaretta - 30 Jun 2008 - AIPs rather than digital objects -- JohnGarrett - 07 Jul 2008 - I don't see a need to require strategies that are not yet employed. But I do think there is a need to be able to include those new strategies. Could the sentence above be changed to something like
"The repository must demonstrate that it has documented preservation strategies and the strategy document must include a means to update and add to the strategies employed."

This is necessary in order for the repository to be prepared for changes in the external environment that may make its current preservation plans a bad choice as the time to implement draws near.

Minimally, documentation of Preservation Implementation Plans must be included in repository policies and practices.

This is necessary in order to assure funders, producers, and users, and to allow them to verify, that the repository is taking sound steps to preserve all of the properties it has asserted it will preserve for each digital object. (See B1.1)


-- MarkConrad - 25 Jun 2008 The supporting text seems more appropriate for B.3.1. B.4.1. is supposed to be about implementing (employing) the preservation strategies that are supposed to be created under B.3.1.
-- DavidGiaretta - 30 Jun 2008 - agree with Mark

-- MarkConrad - 25 Jun 2008 It is not clear what the distinction is between preservation policies, preservation plans and preservation strategies. If there is one we need to make it explicit.

-- BruceAmbacher - 11 May 2008 This sentence implies others (not the staff) can determine the actions taken by the repository; I am uncomfortable with that implication.

-- MarkConrad - 26 Jun 2008 Bruce, I disagree. I think this sentence says that others can determine whether or not they are satisfied with the actions taken by the repository. It does not say that they can dictate what those actions should be.

[End of text moved from B4.1]

B4. Archival storage & preservation/maintenance of AIPs

-- DavidGiaretta - 30 Jun 2008 In order to draw a distinction between this section and B3, we should make it clear that B3 is about the Information content of the AIP while B4 is about the whole AIP itself. This would bring in re-packaging, indexing, bit storage, provenance (as the AIP is moved, copied, transformed)
-- DavidGiaretta -Instead of "preservation/maintenance" perhaps just "maintenance" since we are talking here about things akin to (but not limited to) bit-level, which is contrasted with "preservation" which involves understandability .
-- MarkConrad - 30 Jun 2008 David, I did not understand that to be the difference between the two sections. I thought the difference between the two sections was that B.3. was about having preservation strategies/plans and B.4. was about implementing those plans.

B4.1 (Removed)

-- JohnGarrett - 07 Jul 2008 - In today's meeting, it was decided that the planning part of the content belongs in B3 and remaining content was a more general statement of B4.2 and should be folded into B4.2.

Requirement

Repository employs documented preservation strategies.
-- DavidGiaretta - 30 Jun 2008 - append "for its AIPs"

Supporting Text

Good repository practice also requires that preservation strategies employed against digital objects are recorded in the object’s preservation metadata. (See also B3.3., and B.2.8.)
A repository must record actions taken against digital objects in support of preservation plans (see B3.3) in the object’s preservation metadata.

This is necessary in order to show that the repository has implemented its documented preservation strategies and can produce authentic copies of the original or objects traceable to the originals. (See B.6.10.)


-- MarkConrad - 07 Feb 2008 Was this the original intent for this sub-requirement?
-- DavidGiaretta - 30 Jun 2008 - No it was not

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Documentation of strategies and their appropriateness to repository objects; evidence of application (e.g., in preservation metadata or logs); see B3.3.


-- JohnGarrett - 07 Jul 2008 - We should probably require evidence that strategies were applied. I think that needs to be added above or removed from evidence section here. Or is that B4.2?

Discussion

A repository is likely to employ multiple strategies. Different strategies may be employed by class (type) of digital object, and/or multiple strategies may be employed on a single object class. This will depend upon local repository policies and practices, though any such strategy decisions should be documented and should be based on sound community practice.

B4.2 Repository implements/responds to strategies for archival object (i.e., AIP) bit-level storage and migration, and employs the appropriate strategies for maintaining Information content in a format acceptable/usable by the designated community.


-- MarkConrad - 26 Jun 2008 I would suggest replacing "archival object (i.e., AIP)" with "AIP".

Supporting Text

-- MarkConrad - 07 Feb 2008 I have no idea what to put here. I do not understand what the supporting text in the current TRAC document is asking the repository to do.
-- DavidGiaretta - 30 Jun 2008 - I think the TRAC means (1) bit preserve and if necessary (2) migrate

The repository must store the AIPs at the bit level or, when necessary, migrate them to an equivalent form by re-packaging, and must maintain them in a form that ensures that the Content Information of the AIPs is usable by the designated community.

This is necessary in order to ensure that the Content Information of the AIPs can be extracted.

-- JohnGarrett - 07 Jul 2008 - I think bit preservation is necessary. I'm not sure migration is. I know others would argue for emulation and although I don't support emulation for long-term use, I think we should allow that strategy and not require migration.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Institutional technology and standards watch; demonstration of objects on which a preservation strategy has been performed; demonstration of appropriate preservation metadata for digital objects.

Discussion

At least two aspects of the strategy must be acted upon: that which pertains to how AIPs are currently stored (including physical requirements, media requirements, location of copies, formats and metadata) and that which may require AIP migration of any form. For example, AIP migrations that result in transformations of content need to be tracked to allow subsequent users to understand the repository’s processing implications.

If a repository has not yet needed to carry out any of its Preservation Implementation Plans on AIP(s), it must demonstrate that its Preservation Policy has not required it yet.

-- KatiaThomaz - 21 Mar 2008 - I vote to remove this item

-- BruceAmbacher - 11 May 2008 I believe the original intent was to have the repository show its preservation technology watch function and its awareness of designated community capabilities would be implemented when required to continue to produce usable data objects. The repository also must document the actions taken so users can understand the potential impact on the data objects.

-- MarkConrad - 26 Jun 2008 Bruce, that sounds like multiple requirements some of which may be covered elsewhere:

-- MarieWaltz - 26 Jun 2008 If Bruce's requirements are really addressed elsewhere then there is no need for this criterion.

"have the repository show its preservation technology watch function and its awareness of designated community capabilities would be implemented when required to continue to produce usable data objects." Isn't this covered by B.3.2 through B.3.4?

"The repository also must document the actions taken so users can understand the potential impact on the data objects." Isn't this covered by B.4.5?

B4.3 Repository preserves the Content Information of AIPs.

-- MarkConrad - 26 Jun 2008 I suggest replacing "archival objects (i.e., AIPs)" with "AIPs".

Supporting Text

The repository must be able to demonstrate that the AIPs faithfully reflect what was captured during ingest and that any subsequent or future planned transformations will continue to preserve that aspect of the repository’s holdings.

This is necessary in order to assure funders, producers, and users - and allow them to verify - that the repository is taking sound steps to preserve all of the properties it has asserted it will preserve for each digital object. (See B1.1)

-- BruceAmbacher - 11 May 2008 See my statement in B4.1. I have the same concerns here.

-- MarkConrad - 26 Jun 2008 Bruce, I disagree. I think this sentence says that others can determine whether or not they are satisfied with the actions taken by the repository. It does not say that they can dictate what those actions should be.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Preservation Policy documents specifying treatment of AIPs and whether they may ever be deleted; ability to demonstrate the chain of AIPs for any particular digital object or group of objects ingested; workflow procedure documentation. Preservation Policy documents specifying treatment of AIPs and under what circumstances they may ever be deleted; ability to demonstrate the sequence of conversions for an AIP for any particular digital object or group of objects ingested; workflow procedure documentation; documentation of the persistent links between the ingested object and the current AIP.

-- JohnGarrett - 07 Jul 2008 - Is there a requirement that the chain of AIPs be maintained, or only that we can document that any ingested object is stored in the current AIP (or has been appropriately deleted)?

Discussion

One approach to this requirement assumes that the repository has a policy specifying that AIPs cannot be deleted at any time. This particularly simple and robust implementation preserves links between what was originally ingested and any new versions that have been transformed or changed in any way. Depending upon implementation, these newer objects may be completely new AIPs or merely updated AIPs. Either way, persistent links between the ingested object and the AIP should be maintained.

-- JohnGarrett - 07 Jul 2008 - I don't like this discussion. I think it may be wrong to think that Information can be preserved by maintaining the original AIP. May even be in contradiction to other requirements that require strategies for migration.

B4.4 Repository actively monitors integrity of AIPs.

Supporting Text

In OAIS terminology, this means that the repository must have Fixity Information for AIPs and must make some use of it.

This is necessary in order to protect the integrity of the archival objects over time.

A repository should have logs that show actions taken to check the integrity of archival objects.

This is necessary in order to assure funders, producers, and users - and to allow them to audit/validate - that the repository is taking the necessary steps to ensure the long-term integrity of the digital objects.

This is necessary to document that integrity checks are carried out on a regular basis and to allow interested parties to verify that this is the case.

-- BruceAmbacher - 11 May 2008 See my comment for B4.1. I have the same concerns here.

-- MarkConrad - 26 Jun 2008 Bruce, I disagree. I think this sentence says that others can determine whether or not they are satisfied with the actions taken by the repository. It does not say that they can dictate what those actions should be.

AIP integrity also needs to be monitored at a higher level, ensuring that all AIPs that should exist actually do exist, and that the repository does not possess AIPs it is not meant to.

A repository should have logs that show actions taken to check the integrity of the repository as a whole.

This is necessary to document that integrity checks are carried out on a regular basis and to allow interested parties to verify that this is the case.

This is necessary in order to validate the integrity of the repository as a whole.
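The higher-level check described above (all AIPs that should exist do exist, and no AIP is held that should not be) can be sketched as a comparison of two sets of identifiers. This is only an illustrative sketch: the function name and the idea that identifiers come from an accession register on one side and a storage inventory on the other are assumptions, not something this document prescribes.

```python
from typing import Dict, List, Set


def audit_holdings(registered_ids: Set[str], stored_ids: Set[str]) -> Dict[str, List[str]]:
    """Compare the identifiers the repository expects to hold with those found in storage.

    registered_ids: identifiers taken from a (hypothetical) accession register.
    stored_ids: identifiers found by listing archival storage.
    """
    return {
        # AIPs that should exist but were not found in storage.
        "missing": sorted(registered_ids - stored_ids),
        # AIPs found in storage that the register does not account for.
        "unexpected": sorted(stored_ids - registered_ids),
    }
```

The result of each run could be written to the log of repository-level integrity checks, so that an empty "missing" and "unexpected" list is itself recorded as evidence.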

-- MarkConrad - 26 Jun 2008 What is the relationship between this sentence pair and B.2.12.? Should this sentence pair be deleted? Turned into a separate requirement? -- JohnGarrett - 07 Jul 2008 - I think they should be turned into a separate requirement. Not

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Logs of fixity checks (e.g., checksums); documentation of how AIPs and Fixity Information are kept separate; documentation of how accession registers and AIPs are kept separate; fixity information for each ingested digital object/AIP.

-- JohnGarrett - 07 Jul 2008 - fixity information for each ingested digital object/AIP

Discussion

At present, most repositories deal with this at the level of individual information objects by using a checksum of some form, such as MD5. In this case, the repository must be able to demonstrate that the Fixity Information (checksums, and the information that ties them to AIPs) is stored separately or protected separately from the AIPs themselves, so that someone who can maliciously alter an AIP would not likely be able to alter the Fixity Information as well.
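A checksum-based fixity check of the kind described above might look like the following minimal sketch. It assumes, purely for illustration, that each AIP is a single file named by its identifier and that the manifest of expected checksums is held separately from the AIPs and passed in by the caller; SHA-256 is used as the default, though any algorithm supported by hashlib (including MD5) could be named instead.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def compute_checksum(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(aip_dir: Path, manifest: Dict[str, str]) -> List[Tuple[str, str]]:
    """Compare stored AIP files against a separately held manifest.

    manifest maps a hypothetical AIP identifier (here, a file name) to the
    checksum recorded at ingest. Returns (identifier, outcome) pairs that
    could be appended to a fixity log.
    """
    results = []
    for aip_id, expected in sorted(manifest.items()):
        path = aip_dir / aip_id
        if not path.is_file():
            results.append((aip_id, "missing"))
        elif compute_checksum(path) == expected:
            results.append((aip_id, "ok"))
        else:
            results.append((aip_id, "mismatch"))
    return results
```

Keeping the manifest outside the directory being checked reflects the separation discussed above: altering an AIP file alone would then show up as a mismatch.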

-- DavidGiaretta - 11 Aug 2008 - The issue of integrity is more complex in that the Provenance (part of the AIP) could include info about copying and refreshing which will not change anything other than the Provenance. So in principle the AIP will change each time. We must somehow make it clear that we are concerned with the integrity of the AIP excluding changes in Provenance.

-- KatiaThomaz - 21 Mar 2008 - What about interpretability and readability (see Glossary)? Wouldn't they be mentioned anywhere?

B4.5 Repository has contemporaneous records of actions and administration processes that are relevant to preservation (Archival Storage).

Supporting Text

The repository must create records on or about the time of the actions they refer to and the records must be related to actions associated with archival storage.

This is necessary in order to ensure documentation, which might be used as evidence in an audit, is not omitted or erroneous or of questionable authenticity.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Written documentation of decisions and/or action taken; preservation metadata logged, stored, and linked to pertinent digital objects.

Discussion

The records may be automated or may be written by individuals, depending on the nature of the actions described. Where community or international standards are used, the repository must demonstrate that all relevant actions are carried through.
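For automated records, one way to keep them contemporaneous is to write the record, with its timestamp, at the moment the action is performed. The sketch below assumes a simple append-only file of JSON lines; the field names (aip_id, action, agent, outcome) are hypothetical and are not drawn from any particular metadata standard.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_action(log_path: Path, aip_id: str, action: str, agent: str, outcome: str) -> dict:
    """Append one record of an archival storage action to a JSON-lines log.

    The timestamp is taken when the record is written, so the log entry is
    created at (or about) the time of the action it describes.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "aip_id": aip_id,
        "action": action,
        "agent": agent,
        "outcome": outcome,
    }
    # Open in append mode so earlier records are never rewritten.
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```

A call such as record_action(Path("storage-actions.log"), "aip-1", "fixity check", "nightly job", "ok") would add one line to the log; protecting the log file itself against later alteration is a separate matter not covered by this sketch.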

B5. Information management

B5.1 Repository specifies minimum Descriptive Information requirements to enable the designated community(ies) to discover and identify material of interest.

-- DavidGiaretta - 11 Aug 2008 - should specify that the requirement is on the Producer or else e.g. for Web archiving, may have to be created by the Archive. Should we use the OAIS terminology - Descriptive Information - it is used in B5.3 and B5.4?

-- DavidGiaretta - 14 Oct 2008 - It is not clear that this metric is actually needed because it is not concerned with preservation - or at least not clear if we allow that "minimum" can be zero. Perhaps the preservation aspect would be that all the holdings can be located?

Supporting Text

The repository must be able to deal with the types of requests that will come from a typical user from the designated community(ies).

This is necessary in order to enable discovery of the repository's holdings.

-- BarbaraSierman - 13 Nov 2008 - Advise to use standards like Dublin Core etc?

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Retrieval and Descriptive Information; Dublin Core and other descriptive information, such as documentation of the object.

-- MarkConrad - 14 Nov 2008 Are we going to use "descriptive metadata" or "Descriptive Information"? See the previously made change at the item text above.

Discussion
Retrieval metadata is distinct from Descriptive Information that describes what has been found. For example, in a library we might say that a book's title is mandatory, but publisher is not, because people generally search on the title.

-- MarkConrad - 14 Nov 2008 Are we going to use "descriptive metadata" or "Descriptive Information"? See the previously made change at the item text above.

A repository does not necessarily have to satisfy every possible request.

B5.2 Repository captures or creates minimum Descriptive Information and ensures that it is associated with the AIP.

-- MarkConrad - 14 Nov 2008 Are we going to use "descriptive metadata" or "Descriptive Information"? See the previously made change at the item text at B.5.1. For consistency's sake I would change, "archived object (i.e., AIP)" to "AIP".

-- DavidGiaretta - 14 Oct 2008 - need to check how this ties in with metrics in B1 - ingestion from Producer

Supporting Text

The repository must show how it gets required metadata either from the producer or by producing it itself at ingest. This is required in order to make it clear to producers and users how metadata is acquired.

-- BarbaraSierman - 13 Nov 2008 "required metadata"? we need to be more specific, like descriptive metadata. The archival agreement with the Producer (related to B1.2) will show the descriptive metadata received from the Producer, the repository must also be able to show which metadata it added to this set.
-- MarkConrad - 14 Nov 2008 This supporting text actually seems to be a separate requirement. The B.5.2. Item says the repository must create or capture the Descriptive Information. The supporting text says that the repository must document how it creates or captures the Descriptive information.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Descriptive metadata; persistent identifier/locator associated with AIP (see also B2.5 about persistent identifiers); system documentation and technical architecture; depositor agreements; metadata policy documentation, incorporating details of metadata requirements and a statement describing where responsibility for its procurement falls; process workflow documentation.

-- BarbaraSierman - 13 Nov 2008 In B2.5 we call it a persistent, unique identifier; maybe we should add that this could also be an internal persistent unique identifier [only used within the repository]

Discussion

Associating the metadata with the object is important, though it does not require one-to-one correspondence, and metadata may not be stored with the AIP. Hierarchical schemes of description allow some descriptive elements to be associated with many items. The association should be unbreakable – it must not be lost even if other associations are created.

-- MarkConrad - 14 Nov 2008 In the discussion for the previous item it says, "Retrieval metadata is distinct from descriptive metadata that describes what has been found." The discussion here seems to be talking about descriptive metadata.

-- KatiaThomaz - 21 Mar 2008 - Does "descriptive metadata" include "packaging metadata"? *** This comment was tabled. The group felt there was a need for requirement(s) related to packaging metadata, but felt that this is not the right place to insert that requirement.

B5.3 Repository can demonstrate that referential integrity is created between all archived objects (i.e., AIPs) and associated descriptive information. Repository can demonstrate that all AIPs are associated with their descriptive information and there is descriptive information for each AIP.

-- MarkConrad - 14 Nov 2008 For consistency's sake I would change, "archived objects (i.e., AIPs)" to "AIPs".

-- DavidGiaretta - 11 Aug 2008 - Not sure that referential integrity can be "created" - the links between the two can be created and tested.

-- MarkConrad - 14 Nov 2008 How about, "Repository can demonstrate that referential integrity is enforced between all AIPs and associated descriptive information."

Supporting Text

The repository must ensure that every AIP has some descriptive information and all descriptive information must point to at least one AIP.

This is necessary in order to validate the integrity of an AIP. This is necessary to ensure all AIPs can be located and retrieved.
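The two-way condition stated in the supporting text (every AIP has some descriptive information, and every descriptive record points to at least one AIP) can be tested mechanically. The following is a sketch under assumed data shapes: a set of AIP identifiers and a mapping from each descriptive record identifier to the AIP identifiers it refers to. Both shapes and all names are hypothetical.

```python
from typing import Dict, List, Set


def check_associations(aip_ids: Set[str], descriptions: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Report AIPs with no description and descriptions that point to no held AIP.

    descriptions maps a descriptive record identifier to the set of AIP
    identifiers that record refers to.
    """
    # Every AIP identifier mentioned by at least one descriptive record.
    described = set().union(*descriptions.values())
    return {
        "aips_without_description": sorted(aip_ids - described),
        "descriptions_without_aip": sorted(
            record for record, targets in descriptions.items()
            if not (targets & aip_ids)
        ),
    }
```

Two empty lists indicate that the condition holds at the time of the check; running the same check periodically, and after operations that change AIPs or their identifiers, would also give evidence that the associations are maintained over time.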

-- MarkConrad - 14 Nov 2008 "This is necessary to ensure all AIPs can be located and retrieved."

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Descriptive metadata; persistent identifier/locator associated with AIP; documented relationship between AIP and metadata; system documentation and technical architecture; process workflow documentation.

Discussion

This should be an easy requirement to satisfy and is a prerequisite for the next one.
-- BarbaraSierman - 13 Nov 2008 It is not up to this document to decide whether it is an easy requirement or not.

B5.4 Repository can demonstrate that it maintains the associations between its AIPs and their descriptive information.

Repository can demonstrate that referential integrity is maintained between all archived objects (i.e., AIPs) and associated descriptive information.

-- MarkConrad - 02 Dec 2008 Given our discussions about referential integrity, how about, "Repository can demonstrate that it maintains the associations between its AIPs and their descriptive information."?

Supporting Text

The repository must ensure that the linkages between AIPs and their descriptive information are maintained over time.

This is necessary to ensure all AIPs can continue to be located and retrieved.

The repository must pay particular attention to operations that affect AIPs and their identifiers, and how integrity is maintained during these operations.

-- MarkConrad - 02 Dec 2008 How about, "The repository must ensure that the linkages between AIPs and their descriptive information are maintained over time."?

This is necessary as some operations may destroy the integrity of an AIP, and therefore devalue it.

-- MarkConrad - 14 Nov 2008 "This is necessary to ensure all AIPs can continue to be located and retrieved."

-- BarbaraSierman - 13 Nov 2008 "operations"? Preservation actions? Administration=IT related actions?

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Log detailing ongoing maintenance/checking of the integrity of the data and its relationships to the associated descriptive information, especially following repair/modification of AIP; legacy descriptive information; persistence of identifier/locator; documented relationship between AIP and descriptive information; system documentation and technical architecture; process workflow documentation.

-- MarkConrad - 14 Nov 2008 Is this phrase, "especially following repair/modification of AIP", referring to referential integrity or data integrity? relational integrity

Discussion

Repositories must implement procedures that let them know when the relationship between the data and the associated descriptive information is temporarily broken and ensure that it can be restored.

There may be times, depending on system design, when the repository cannot demonstrate referential integrity because some system component is out of action. However, repositories must implement procedures that let them know when referential integrity is temporarily broken and ensure that it can be restored.

-- MarkConrad - 14 Nov 2008 I do not understand the discussion above. Referential integrity is based on relationships between data elements - not the availability of a system component.

B6. Access management

-- MarkConrad - 22 Dec 2008 This section uses the term, "access" in a number of different ways - as does the OAIS. For example, access has the following meanings in the OAIS (and this is not an exhaustive list):

1. Access to the system (i.e., user authentication, physical security, etc.)
2. Accessing the records (i.e., being able to find information about the existence of the records)
3. Accessing the records part 2 (i.e., preparing and sending the requester a DIP)
4. Accessing the records part 3 (i.e., does the requester have the right to access the records they are interested in)

For each requirement we need to make it clear which meaning of "access" that particular requirement is referring to.

-- JohnGarrett - 8 Dec 2008 - At meeting we agreed that this lead-in text for B6 needs to be examined after we have looked at and updated the B6 checklist items.

It must be understood that the capabilities and sophistication of the access system will vary depending on the repository’s designated community(ies) and the access mandates of the repository. Because of the variety of repositories, archives, and access mandates, these criteria may be subject to questions about applicability and interpretation at a local level. For in-depth discussion of access management issues, see Appendix 6: Understanding Digital Repositories & Access Functionality.

Repositories with a mandate to provide current access must be able to produce Dissemination Information Packages (DIPs) that meet the needs of their users or are appropriate to the levels of access they offer. “Dark” archives or national archives that may have mandates restricting access for a certain number of years will produce most DIPs for internal requirements, such as performing migrations, rather than for access. In any case, any repository must be able to produce a DIP, however primitive and whatever its purpose.

These requirements ensure that access is implemented according to the repository’s stated Preservation Policies:

B6.1 to B6.4 are primarily concerned with access conditions and actions related to the designated community(ies);

B6.5 and B6.6 are primarily concerned with access security, with a focus on internal (staff) access;

B6.7 to B6.9 ensure that the access function is implemented correctly. Access should always deliver what is required, or else make clear that it is not possible for whatever reason. Timeliness may be measured in seconds or weeks, since access may be an online function or a postal function or may be mediated through some other mechanism or a combination of them.

B6.10 adds a specific requirement over and above the need to simply provide access to a repository’s holdings. For the repository to be trusted, it must be able to provide a copy of material that can be traced back to originals.

Discussion
See 27A11.4 and 27A11.5

-- KatiaThomaz - 10 May 2007 - New item for “repository ensures access to digital objects by its designated community” (see NESTOR CCTDR 2.1) -- MarkConrad - 28 Sep 2007 This is covered by B. 6.1. -- DavidGiaretta - 01 Oct 2007 B 6.1 allows the possibility that the Designated Community does not actually have access e.g. data which is currently secret, therefore there is no access, but it will become available in a few years and should be understandable then.

-- KatiaThomaz - 10 May 2007 - New item for “repository defines its DIPs” (see NESTOR CCTDR 11.1) -- MarkConrad - 28 Sep 2007 This is at least partially covered by B. 6.8.

-- KatiaThomaz - 10 May 2007 - New item for “repository ensures transformation of AIPs into DIPs” (see NESTOR CCTDR 11.2). -- KatiaThomaz - 21 Mar 2008 - Does it include definition for each DIP or class of information disseminated by the repository and how DIPs are constructed from AIPs?

An issue which several colleagues have spoken about is that of being able to demonstrate authenticity of the information. Perhaps when we speak about transformations we need to be clear that one should be able, if requested, to follow e.g. an audit trail in order to demonstrate authenticity. -- MarkConrad - 28 Sep 2007 This is already covered in B. 6.10!

B6.1 Repository documents and communicates to its designated community what access and delivery options are available.

-- DavidGiaretta - 15 Oct 2008 - not sure if repository has to do more than just document this

Supporting Text

The repository must have publicly available Access Policies that document the various aspects of access to and delivery of the preserved information. This is necessary in order for users to know how and when they can obtain information from the repository, and how much it will cost.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Public versions of access policies, delivery policies, fee policies.

-- DavidGiaretta at RAC - 15 Oct 2008 - not clear these policies have to be public in all cases

Discussion

Repositories may have a single policy to address a single, homogeneous community or multiple policies that address several, disparate communities. Repositories may need to develop a separate policy to address the handling of a specific collection. This may require different policies for different communities. When there are multiple access policies for the same repository, all policies should be made available, as appropriate.

B6.2 Repository has implemented a documented policy for recording all access actions (includes requests, orders etc.) that meet the requirements of the repository and information producers/depositors.

-- DavidGiaretta at RAC - 15 Oct 2008 - not clear that all access actions need to be recorded from the point of view of preservation. Also the supporting text does not imply all access need to be recorded. Do we need a requirement that says the policy is actually implemented?

Supporting Text

The repository must document what usage information it records about use of the repository content. This is necessary in order to meet the requirements of the repository and information producers and users.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Access Policies, use statements, usage statistics.

Discussion

If necessary, a policy for recording actions (includes requests, orders etc) should be established and implemented. A repository need only record actions that meet the requirements of the repository and its information producers/depositors. This may mean that little or no information is recorded about access. That is acceptable if the repository can demonstrate that it does not need to do more. The policy should include what figures are being monitored. If statistics are produced they should be made available.

B6.3 Repository ensures that agreements applicable to access conditions are adhered to.

-- DavidGiaretta at RAC - 15 Oct 2008 - not clear this is a preservation issue

Supporting Text

The repository must demonstrate that producer/depositor agreements are satisfied for each AIP. This is necessary in order to ensure the producer/depositor's access conditions are being met.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Access policies, logs of user access and user denials, access system mechanisms that prevent unauthorized actions (such as save, print, etc.); user compliance agreements.

Discussion

Access conditions are most often about who is allowed to see what content, but they can be more complex. They may involve limitations on quantities, for example, all members of a certain community are permitted to access 10 items a year without charge. Or they may involve limits on usage for commercial gain but not private usage. In other scenarios, if a repository’s material is open access, the repository can simply demonstrate that access is truly available to everyone. If all material in the repository is available to a single, closed community, the repository must demonstrate that it validates that users are members of that community, perhaps by requesting some proof of identity before registering them, or just by restricting access by network addresses if the community can be identified in that manner. It should also demonstrate that all members of the community can indeed gain access if they wish. If different access conditions apply to different AIPs, the repository must demonstrate how these are realized. If access conditions require users to make some declaration before receiving DIPs, the repository must show that the declarations have been made. These may be signed forms, or evidence that a statement has been viewed online and a button clicked to signify agreement. The declaration might involve nondisclosure or agreement to no commercial use, for instance. ...
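Several of the conditions mentioned in this discussion (community membership, a free annual allowance, a required declaration) can be expressed as explicit rules that an access system evaluates for each request. The sketch below is a deliberately simplified, hypothetical model: the class, the field names and the three possible decisions are assumptions made for illustration, and real deposit agreements may need conditions this model cannot express.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class AccessCondition:
    """Hypothetical, simplified model of the access terms agreed for an AIP."""
    permitted_communities: FrozenSet[str]
    free_items_per_year: Optional[int] = None  # None: no quantity limit
    declaration_required: bool = False


def evaluate_request(
    condition: AccessCondition,
    user_community: str,
    items_already_accessed_this_year: int,
    declaration_made: bool,
) -> Tuple[str, str]:
    """Return (decision, reason); the decision is 'deny', 'grant' or 'grant_with_fee'."""
    if user_community not in condition.permitted_communities:
        return "deny", "user is not a member of a permitted community"
    if condition.declaration_required and not declaration_made:
        return "deny", "required declaration has not been made"
    limit = condition.free_items_per_year
    if limit is not None and items_already_accessed_this_year >= limit:
        return "grant_with_fee", "free annual allowance already used"
    return "grant", "access conditions satisfied"
```

Logging the decision and reason returned for each request would give the kind of evidence of user access and user denials listed among the examples for this requirement.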

B6.4 Repository has documented and implemented access policies (authorization rules, authentication requirements) consistent with deposit agreements for stored objects.

-- DavidGiaretta at RAC - 15 Oct 2008 - this looks like a more detailed restatement of B6.3. Maybe B6.3 should be "you have a policy to stick to deposit agreements" and B6.4 "you implement those policies"

Supporting Text

The repository must have Preservation Policies and mechanisms in place for the accessing of stored objects by both staff and authorized users. This is necessary in order to ensure the producer/depositor's access conditions are being met and to protect stored objects against accidental or deliberate damage by staff.

-- DavidGiaretta at RAC - 15 Oct 2008 - avoiding accidental damage by staff is more than just against deposit agreements

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Access validation mechanisms within the system, documentation of authentication and validation procedures.

...

Discussion

User credentials are only likely to be relevant for repositories that serve specific communities or that have access restrictions on some of their holdings. A user credential may be as simple as the IP address from which a request originates, or may be a username and password, or may be some more complex and secure mechanism. Thus, while this requirement may not apply to some repositories, it may require very formal validation for others. The key thing is that the access and delivery policies are reflected in practice and that the level of validation is appropriate to the risks of getting validation wrong. Some of the requirements may emerge from agreements with producers/depositors and some from legal requirements. Repository staff will also need to access stored objects occasionally, whether to complete ingest functions, perform maintenance functions such as verification and migration, or produce DIPs. The repository must have Preservation Policies and Preservation Implementation Plans to protect stored objects against deliberate or accidental damage by staff (see C3.3). ...

B6.5 Repository access management system fully implements access policy.

-- DavidGiaretta at RAC - 15 Oct 2008 - overlap with B6.4 since sticking to deposit agreements must be part of the access policy

Supporting Text

The repository must demonstrate that a complete access management system, with all access policies, is implemented. This is necessary in order to ensure the repository has fully addressed all aspects of usage which might affect the trustworthiness of the repository.

-- DavidGiaretta at RAC - 15 Oct 2008 - DS: use of trustworthiness is correct term here since it could be taken to mean doing a good job both at preservation and in performance/support of the user community

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Logs and audit trails of access requests; information about user capabilities (authentication matrices); explicit tests of some types of access. ...

Discussion

Access may be managed partly by computers and partly by humans; checking passports, for instance, before issuing a user ID and password may be an appropriate part of access management for some institutions. ...

B6.6 Repository logs all access management failures, and staff review inappropriate “access denial” incidents.

Supporting Text

The repository must demonstrate that it regularly reviews anomalous or unusual usage and access management failures. This is necessary in order to identify security threats and access management system failures.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

Access logs, capability of the system to use automated analysis/monitoring tools and generate problem/error messages; notes of reviews undertaken or action taken as a result of reviews. ...

Discussion
See 27A13.1.1

A repository should have some automated mechanism to note anomalous or unusual denials and use them to identify either security threats or failures in the access management system, such as valid users being denied access. This does not mean looking at every denied access. This requirement does not apply to repositories with unrestricted access. ...
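One very simple automated mechanism of this kind is to count denials per user over a review period and bring only the unusual cases to staff attention. The sketch below assumes the access log can be read as (user identifier, outcome) pairs and uses a fixed threshold; both the log shape and the threshold value are assumptions, and a real repository would choose criteria suited to its own usage patterns.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def flag_anomalous_denials(events: Iterable[Tuple[str, str]], threshold: int = 5) -> List[str]:
    """Return users whose number of denied requests reaches the threshold.

    events: (user_id, outcome) pairs, where outcome is "granted" or "denied".
    Only the flagged users need staff review, not every denied access.
    """
    denials = Counter(user for user, outcome in events if outcome == "denied")
    return sorted(user for user, count in denials.items() if count >= threshold)
```

Staff could then decide for each flagged user whether the pattern suggests a security threat or a valid user being wrongly denied, and record the review and any action taken.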

B6.7 Repository can demonstrate that the process that generates the requested digital object(s) (i.e., DIP) is completed in relation to the request.

Supporting Text

The repository must show a user that the DIP they receive has the same content as what was originally deposited, or provide an explanation as to why it is not available. This is necessary in order to ensure the user is getting what they requested.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

System design documents; work instructions (if DIPs involve manual processing); process ...

Discussion
See 27A12.2.4? (see also B6.8?)

_It is acceptable that the user’s request cannot be satisfied if the explanation given to the user is reasonable. For example, resource shortages may mean a valid request cannot be satisfied. If part of the request cannot be satisfied, it is acceptable that the user receives a DIP containing the elements that can be provided, and the system makes clear that the request is only partially satisfied. It is unacceptable if the request is only partially satisfied and a partial DIP is generated, but the repository does not communicate to the user that it is a partial DIP. It is also unacceptable if the request is delayed indefinitely because something that is required, such as access to a particular AIP, is not available, but the user is not notified nor is there any indication as to when the conflict will be resolved. It is unacceptable if the user is told the request cannot be satisfied, implying nothing can be delivered, but actually receives a DIP, and is left unsure of its validity or completeness. ...
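The acceptable and unacceptable cases above can be read as a small rule: a complete DIP needs no explanation, while a partial or unsatisfied request must carry one. The following is a minimal sketch of that rule, assuming a hypothetical DipResponse record; the class, its field names and the status labels are illustrative and not part of the requirement.

```python
from dataclasses import dataclass


@dataclass
class DipResponse:
    """Hypothetical record of what was requested and what was delivered."""
    requested: list
    delivered: list
    explanation: str = ""

    @property
    def missing(self):
        # Requested items that are absent from the delivered DIP.
        return [item for item in self.requested if item not in self.delivered]

    @property
    def status(self):
        if not self.missing:
            return "complete"
        if self.delivered:
            return "partial"
        return "unsatisfied"

    def is_acceptable(self):
        # A complete DIP needs no explanation; anything less must tell the
        # user why the request was not fully satisfied.
        return self.status == "complete" or bool(self.explanation.strip())


explained = DipResponse(["a", "b"], ["a"], explanation="Item b is offline.")
silent = DipResponse(["a", "b"], ["a"])
```

Here `explained` is a partial DIP that tells the user what is missing, while `silent` is the unacceptable case of a partial DIP delivered without notice.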

B6.8 Repository can demonstrate that the process that generates the requested digital object(s) (i.e., DIP) is correct in relation to the request.

-- DavidGiaretta at RAC - 15 Oct 2008 - DS notes that a better requirement may be that problem reports about errors in data responses from users are recorded (and acted on) since it would be very difficult to prove B6.7

Supporting Text
The repository must be able to demonstrate that the digital object(s) (i.e., DIP) provided to the user has had the appropriate transformations applied. This is necessary in order to ensure the user receives a usable and correct version of the digital object(s) (i.e., DIP) that they requested.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

_System design documents; work instructions (if DIPs involve manual processing); process walkthroughs; logs of orders and DIP production ...

Discussion

_A simple example is that if the repository stores TIFF images but delivers JPEGs, the conversion should be shown to be correct to whatever standards seem appropriate. If the repository offers delivery as JPEG or PNG, the user should receive the format requested. Many repositories may apply more complex transformations to generate DIPs from AIPs.

...

B6.9 Repository demonstrates that all access requests result in a response of acceptance or rejection.

-- DavidGiaretta at RAC - 15 Oct 2008 - not clear that this is related to preservation.

Supporting Text

_The repository must demonstrate that all access requests result in a response of acceptance or rejection within a given amount of time. This is necessary ?Is this covered in B6.6-7?

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

_System design documents; work instructions (if DIPs involve manual processing); process walkthroughs; logs of orders and DIP production ...

Discussion

_Eventually a request must succeed or fail, and there must be limits on how long it takes for the user to know this. Access logs are the simplest way to demonstrate response time, even if the repository does not retain this information for long. However, a repository can demonstrate compliance if it can show that all failed requests result in an error log of some sort, and that requests are bounded in duration in some way. ...
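One way to picture this check, purely as a sketch, is a pass over the access logs that reports any request with no recorded outcome, or with an outcome that arrived later than the repository's stated limit. The log shapes, the outcome labels and the function name below are assumptions made for the example.

```python
def find_unbounded_requests(requests, responses, max_seconds):
    """Report requests that lack a timely acceptance or rejection.

    `requests` maps a request id to its submission time in seconds.
    `responses` maps a request id to an (outcome, time) pair.
    Both shapes are illustrative assumptions about what a log might hold.
    """
    problems = {}
    for request_id, submitted in requests.items():
        if request_id not in responses:
            problems[request_id] = "no response"
            continue
        outcome, answered = responses[request_id]
        if outcome not in ("accepted", "rejected"):
            problems[request_id] = "unknown outcome"
        elif answered - submitted > max_seconds:
            problems[request_id] = "too slow"
    return problems


# Hypothetical log extract: r2 was never answered, r3 took too long.
submitted = {"r1": 0, "r2": 10, "r3": 20}
answered = {"r1": ("accepted", 5), "r3": ("rejected", 500)}
report = find_unbounded_requests(submitted, answered, max_seconds=60)
```

An empty report over a sample period would be one piece of evidence that requests are bounded in duration.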

B6.10 Repository enables the dissemination of authentic copies of the original or objects traceable to originals.

-- DavidGiaretta at RAC - 15 Oct 2008 - OAIS update now uses "enable the information to be disseminated as copies of, or as traceable to, the original submitted Data Object., with evidence supporting its Authenticity." so we should change the requirement to be consistent with this. Note that "Repository follows policies which enable..." would make more sense.

Supporting Text

The repository must demonstrate that it can trace in some auditable way the DIP it disseminates back to an authentic copy of the original object(s) (DIPs). This is necessary to ensure the DIP that is disseminated is an authentic copy of the original object(s) (AIPs).

-- DavidGiaretta at RAC - 15 Oct 2008 - should change "authentic" to something like "with evidence supporting its authenticity" in the remainder of this sub-section. Use of "authenticated" and "authentication" is wrong below.

Examples of Ways the Repository can Demonstrate it is Meeting this Requirement

_System design documents; work instructions (if DIPs involve manual processing); process walkthroughs; production of a sample authenticated copy; documentation of community requirements for authentication ...

Discussion

_This distinction is made because objects are not always disseminated in the same way, or in the same groupings, as they are deposited. For example, a database, which originally had subsets of its rows, columns, and tables, may be disseminated without the original formatting, so that the phrase “authentic copy” has little meaning. Ingest and preservation actions may change the formats of files, or may group and split the original objects deposited. The distinction between authentic copies and traceable objects may also be important when transformation processes are applied. For instance, a repository that stores digital audio from radio broadcasts may disseminate derived text that is constructed by automated voice recognition from the digital audio stream. Derived text may be imperfect but useful to many users, though these texts are not authentic copies of the original audio. Producing an authentic copy means either handing out the original audio stream or getting a human to verify and correct the transcript against the stored audio. This requirement ensures that ingest, preservation, and transformation actions do not lose information that would support an auditable trail between the original deposited object and the eventual disseminated object. For compliance, the chain of authenticity need only reach as far back as ingest, though some communities, such as those dealing with legal records, may require chains of authenticity that reach back further. A repository should be able to demonstrate the processes to construct the DIP from the relevant AIP(s). This is a key part of establishing that DIPs reflect the content of AIPs, and hence of original material, in a trustworthy and consistent fashion. DIPs may simply be a copy of AIPs, or may result from a simple format transformation of an AIP. But in other cases, they may be derived in complex ways from a large set of AIPs. 
A user may request a DIP consisting of the title pages from all e-books published in a given period, for instance, which will require these to be extracted from many different AIPs. A repository that allows requests for such complex DIPs will need to put more effort into demonstrating how it meets this requirement than a repository that only allows requests for DIPs that correspond to an entire AIP. A repository is not required to show that every DIP it provides can be verified as authentic at a later date; it must show that it can do this when it is required at the time of production of the DIP. The level of authentication is to be determined by the designated community(ies). This requirement is meant to enable high levels of authentication, not to impose it on all copies, since it may be an expensive process.
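As a rough sketch of what an auditable trail could look like, suppose each object records the objects it was derived from, with ingested objects having no sources. Tracing a DIP then means walking those records back to ingest. The record structure, the identifiers and the function name are hypothetical; real repositories will hold richer provenance metadata.

```python
def trace_to_originals(object_id, provenance):
    """Walk derivation records from an object back to the ingested originals.

    `provenance` maps an object id to the list of ids it was derived from;
    an empty list marks an object as received at ingest. A missing record
    means the trail is broken, so a KeyError is raised rather than guessing.
    """
    originals, seen, stack = set(), set(), [object_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current not in provenance:
            raise KeyError(f"no provenance record for {current!r}")
        sources = provenance[current]
        if not sources:
            originals.add(current)
        stack.extend(sources)
    return sorted(originals)


# Hypothetical records: a DIP of title pages drawn from two AIPs,
# each AIP built from one deposited object.
records = {
    "dip-title-pages": ["aip-1", "aip-2"],
    "aip-1": ["deposit-1"],
    "aip-2": ["deposit-2"],
    "deposit-1": [],
    "deposit-2": [],
}
originals = trace_to_originals("dip-title-pages", records)
```

A complex DIP such as the title-page example simply has more branches to follow; the trail is auditable as long as every step has a record.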

...

Topic revision: r92 - 2009-02-10 - DavidGiaretta
 