Notes from Megameeting 15th October 2007

Attendees:

BarbaraSierman Koninklijke Bibliotheek, Netherlands
BruceAmbacher UM
CandidaFenton HATII, Univ Glasgow
SimonLambert STFC
DonaldSawyer NASA GSFC
HelenTibbo UNC
KatiaThomaz INPE, Brazil
MarkConrad NARA
RiccardoFerrante  
RobertDowns Center for International Earth Science Information Network (CIESIN), U Columbia

All the discussion at this meeting was conducted by chat, so the following transcript of the meeting (with a few typos corrected) is complete.

Conclusion of the discussion: DonaldSawyer to expand on what TRAC intended when using "authenticity" or related terms.

Also SimonLambert to extend the table of availability for meetings.

Candida Fenton >> (All): Hello everyone
BruceAmbacher >> (All): welcome
Helen Tibbo >> (All): Candida, nice summary
Candida Fenton >> (All): I hope it is useful
cclrc >> (All): Candida, do you want to say a few words to introduce yourself?
cclrc >> (All): (or type a few words)
Candida Fenton >> (All): I'm working in Glasgow at HATII with Seamus Ross and Perla Innocenti
cclrc >> (All): OK, we have a good number now.  Mark - you circulated proposed definitions of authenticity and other terms.
Mark Conrad >> (All): Yes. I just received an e-mail message from Don Sawyer objecting to them.
BruceAmbacher >> (All): Has everyone had a chance to see Don Sawyer's comments on email?
cclrc >> (All): I don't think so
cclrc >> (All): Was the email sent to the list?
RobertDowns >> (All): I received it
BarbaraSierman >> (All): i just received it a minute ago
BruceAmbacher >> (All): To me Don wants a less absolute definition of authenticity and to avoid many other definitions
Mark Conrad >> (All): It was sent to the list.
Riccardo Ferrante >> (All): Just received it.
BruceAmbacher >> (All): Don has joined us.  Do we want an oral explanation?
Mark Conrad >> (All): Yes.
Don Sawyer >> (All): I'm not sure what more I can say.
BruceAmbacher >> (All): Yes
BarbaraSierman >> (All): yes
Katia Thomaz >> (All): i suggested new terms directly in the glossary. they were accessibility, authentication, authorization, availability, integrity, interpretability, readability, understandability and usability.
BruceAmbacher >> (All): Here is where I will agree with Don - to make glossary definitions of all of these terms is risky and may put us at odds with others.
Don Sawyer >> (All): Rather than just defining a whole list of terms, I think we need to see where in the document we need to use a term and whether it really needs a special definition.
Mark Conrad >> (All): We need to define how we are using these terms because they are used by so many others.
cclrc >> (All): Do we need to select a small subset of these terms, define them clearly, and make sure we use them consistently ourselves?
cclrc >> (All): ... rather than try to define all terms that might be used.
Don Sawyer >> (All): Again,  we can spend weeks agreeing on definitions.  Where are special definitions really needed?  We're not going to define all the terms we use.
Mark Conrad >> (All): I do not believe this group has a mutual understanding of how these terms are being used in this document.
BruceAmbacher >> (All): I think we are having a basic conflict between repository types - traditional archives, data centers, digital libraries, etc.  We are trying to blend communities and that will possibly require a crosswalk where necessary and use of generally accepted terms elsewhere.
Candida Fenton >> (All): Authenticity and reliability are key concepts which would be useful to define
Katia Thomaz >> (All): i received an assignment to define authorization, integrity, readability. do you remember? but i found related terms in the source documents.
Mark Conrad >> (All): Bruce, We are using terms that are used by a large number of communities with different understanding of the meaning of the terms. If we do not define how we are using them who will be able to use the document effectively?
Helen Tibbo >> (All): Don, can you talk a little more about defining in relation to how/where the term is used in the document? 
Mark Conrad >> (All): We need to define these terms and use them consistently throughout the document or it will be gibberish.
Don Sawyer >> (All): Yes, I agree with both Candida and Katia, but that still doesn't negate my point, I think.  When a term is widely used in different communities in different ways, it is better to avoid the term or define a new term.  That is the experience of using standards. This means you have to look at where you use the term and not just try to define it in the abstract.
BarbaraSierman >> (All): The way a trusted repository can prove that it meets the required standards related to the definition of something might clarify what we mean by the definition, so I would think it is an idea to keep in mind how to prove that you implemented the definition
Don Sawyer >> (All): Let's take one term and see where it is used, and what we think about it.
Mark Conrad >> (All): I sent out an analysis of where authenticity is used 2 weeks ago.
Katia Thomaz >> (All): i will send you my document. i compared trac, nestor, drambora, oecd guidelines, iso 27001, iso 15489, interpares, general dictionary, computer dictionary, law dictionary, and wikipedia
Mark Conrad >> (All): Don,
Mark Conrad >> (All): Pick a term.
Don Sawyer >> (All): Mark, can you refresh us on this?  I assume you are talking about its use in TRAC.
Mark Conrad >> (All): I am.
Mark Conrad >> (All): Just a minute I will retrieve it from my e-mail.
Mark Conrad >> (All): Use of the term "authentic" and "authenticity" in the TRAC document
B. Digital Object Management
• B6: The repository’s ability to produce and disseminate accurate, authentic versions of the digital objects. (pg 21)
B6.10 Repository enables the dissemination of authentic copies of the original or objects traceable to originals.
Part of trusted archival management deals with the authenticity of the objects that are disseminated. A repository’s users must be confident that they have an authentic copy of the original object, or that it is traceable in some auditable way to the original object. This distinction is made because objects are not always disseminated in the same way, or in the same groupings, as they are deposited. A database may have subsets of its rows, columns, and tables disseminated so that the phrase “authentic copy” has little meaning. Ingest and preservation actions may change the formats of files, or may group and split the original objects deposited.
The distinction between authentic copies and traceable objects can also be important when transformation processes are applied. For instance, a repository that stores digital audio from radio broadcasts may disseminate derived text that is constructed by automated voice recognition from the digital audio stream. Derived text may be imperfect but useful to many users, though these texts are not authentic copies of the original audio. Producing an authentic copy means either handing out the original audio stream or getting a human to verify and correct the transcript against the stored audio.
This requirement ensures that ingest, preservation, and transformation actions do not lose information that would support an auditable trail between the original deposited object and the eventual disseminated object. For compliance, the chain of authenticity need only reach as far back as ingest, though some communities, such as those dealing with legal records, may require chains of authenticity that reach back further.
A repository should be able to demonstrate the processes to construct the DIP from the relevant AIP(s). This is a key part of establishing that DIPs reflect the content of AIPs, and hence of original material, in a trustworthy and consistent fashion. DIPs may simply be a copy of AIPs, or may result from a simple format transformation of an AIP. But in other cases, they may be derived in complex ways from a large set of AIPs. A user may request a DIP consisting of the title pages from all e-books published in a given period, for instance, which will require these to be extracted from many different AIPs. A repository that allows requests for such complex DIPs will need to put more effort into demonstrating how it meets this requirement than a repository that only allows requests for DIPs that correspond to an entire AIP.
A repository is not required to show that every DIP it provides can be verified as authentic at a later date; it must show that it can do this when it is required at the time of production of the DIP. The level of authentication is to be determined by the designated community(ies). This requirement is meant to enable high levels of authentication, not to impose it on all copies, since it may be an expensive process.
Evidence: System design documents; work instructions (if DIPs involve manual processing); process walkthroughs; production of a sample authenticated copy; documentation of community requirements for authentication. (pp 41-42)
C1.6 Repository reports to its administration all incidents of data corruption or loss, and steps taken to repair/replace corrupt or lost data.
Having effective mechanisms to detect bit corruption and loss within a repository system is critical, but is only one important part of a larger process. As a whole, the repository must record, report, and repair as possible all violations of data integrity. This means the system should be able to notify system administrators of any logged problems. These incidents, recovery actions, and their results must be reported to administrators and should be available.
For example, the repository should document procedures to take when loss or corruption is detected, including standards for measuring the success of recoveries. Any actions taken to repair objects as part of these procedures must be recorded. The nature of this recording must be documented by the repository, and the information must be retrievable when required. This documentation plays a critical role in the measurement of the authenticity and integrity of the data held by the repository.
Evidence: Preservation metadata (e.g., PDI) records; comparison of error logs to reports to administration; escalation procedures related to data loss. (pg 45)
Use of the term "authenticate" in the TRAC document
B1.3 Repository has mechanisms to authenticate the source of all materials.
The repository’s written standard operating procedures and actual practices must ensure the digital objects are obtained from the expected source, that the appropriate provenance has been maintained, and that the objects are the expected objects. Confirmation can use various means including, but not limited to, digital processing and data verification and validation, and through exchange of appropriate instrument of ownership (e.g., submission agreements/deposit agreement/deed of gift).
Evidence: Submission agreements/deposit agreements/deeds of gift; workflow documents; evidence of appropriate technological measures; logs from procedures and authentications.
Use of the term "authentic" and "authenticity" in the OAIS Reference Model (Blue Book, January 2002)
The term "authentic" is not used in this document.
The term "authenticity" is used twice in this document.
2 OAIS CONCEPTS
The term ‘archive’ has come to be used to refer to a wide variety of storage and preservation functions and systems. Traditional archives are understood as facilities or organizations which preserve records, originally generated by or for a government organization, institution, or corporation, for access by public or private communities. The archive accomplishes this task by taking ownership of the records, ensuring that they are understandable to the accessing community, and managing them so as to preserve their information content and authenticity. Historically, these records have been in such forms as books, papers, maps, photographs, and film, which can be read directly by humans, or read with the aid of simple optical magnification and scanning aids. The major focus for preserving this information has been to ensure that they are on media with long-term stability and that access to this media is carefully controlled. (pg 2-1)
Table 4-1: Examples of PDI Types (pg 4-29)
The TRAC document uses the term "authenticate" in two different ways:
1. To authenticate a user of the archive (i.e., to verify the credentials and authorizations of a person attempting to use the repository).
2. To verify the authenticity of an object in the repository.
Neither document offers a definition of authentic or authenticity. Based on its usage in the TRAC document one could deduce:
Authenticity and integrity are separate concepts.
An object may be traceable to an original without being an authentic copy of the original.
An authentic copy of a digital object is one that has not lost any of the information (the message) conveyed by the original.
One must maintain the chain of custody (referred to in the TRAC document as the "chain of authenticity") of the original object to be able to produce authentic copies of that original or objects traceable to that original.
BruceAmbacher >> (All): Mark originally sent his email on 9/24
Mark Conrad >> (All): If you go to the list archives you can see the document with formatting.
BruceAmbacher >> (All): I am printing it from the email.
Don Sawyer >> (All): OK, the first comes from B6 and the second from B10.  I believe both stem from the use of 'authentic' in the OAIS reference model.  There was no special meaning intended by this use of 'authentic'.  It is the subject of an update in the OAIS review of 5-year comments.  In any event, I believe the intent was just to say that whatever the archive received, there should be a reliable way to trace back to this copy as received.
Mark Conrad >> (All): OAIS does not use the term authentic or authenticity.
Katia Thomaz >> (All): i vote to adopt iso 15489 definition for authenticity, replacing "record" with "data object".
Mark Conrad >> (All): That is what I proposed last week.
cclrc >> (All): And replace "prove" by "demonstrate"?
BruceAmbacher >> (All): If we change it we cannot cite ISO as the source.
Katia Thomaz >> (All): i agree with you.
Katia Thomaz >> (All): ok.
Don Sawyer >> (All): I'm  sorry, but that definition will not work.  You've not addressed my point on 'prove', for starters.
Mark Conrad >> (All): If you use the ISO definition as is, it applies to records only.
Katia Thomaz >> (All): why do you think so?
cclrc >> (All): The original definition says "an authentic record is one that can be proven to ..."
BruceAmbacher >> (All): first, because that is how it is written/defined and second this ISO is the records management standard which is why Mark was a little reluctant to use it from the get go
Katia Thomaz >> (All): general dictionary: The quality or condition of being authentic, trustworthy, or genuine.
Don Sawyer >> (All): Mark, OAIS responsibility in section 3 (5th, I think) says: "Follow documented policies and procedures which ensure that the information is preserved against all reasonable contingencies, and which enable the information to be disseminated as authenticated copies of the original, or as traceable to the original."
Katia Thomaz >> (All): wikipedia: Within the arts, archaeology, the study of antiques, and similar fields involving unique or scarce artifacts from the past, and, with regard to documents in law, authenticity refers to the truthfulness of origins, attributions, commitments, sincerity, devotion and intentions; not a copy or forgery. See also provenance.
Mark Conrad >> (All): Don, Exactly! That is the only use of the term and authenticate is used in the TRAC document with two different meanings.
Mark Conrad >> (All): OAIS does not address authenticity.
BruceAmbacher >> (All): to authenticate is not the same as having authenticity
Katia Thomaz >> (All): computer dictionary: The correct attribution of origin such as the authorship of an e-mail message or the correct description of information such as a data field that is properly named. Authenticity is one of the six fundamental components of information security.
BarbaraSierman >> (All): Sorry, need to go
Mark Conrad >> (All): Bruce, What I meant to say is that OAIS does not use the term authenticity. The only reference is the one that Don cited to an authenticated copy. I agree this reference is referring to the authenticity of the copy produced, but the OAIS does not address what authenticity is or how you establish it.
Katia Thomaz >> (All): nestor: p. 17 Authenticity here means that the object actually contains what it claims to contain. p. 17 This is achieved to some extent by documenting all changes to the objects (including metadata) (see 12.4). p. 17 Authenticity means that the producer or sender and the given production or transmission time correspond to the facts. For example, that an email supposedly generated and transmitted by a particular person at a particular time is actually from this person and was sent at the given time. p. 33 The object actually contains what it claims to contain.
BruceAmbacher >> (All): Where do we find the wisdom of Solomon?  We have been going at this issue of acceptable definitions for weeks.
Don Sawyer >> (All): OK, as we struggle to communicate, how about this.  OAIS does not try to address whether information coming to the archive is 'what it purports to be'. This is important, but is left to others to address.  It does, however, try to ensure that what is preserved and what is disseminated is traceable to what was originally received.  Is anyone arguing that TRAC or this certification should try to address whether the information received by the archive needs to be examined to ensure that it is what it purports to be?
Mark Conrad >> (All): The central points that I would raise are: 1.If we are going to use loaded terms like authenticity we need to define how we are using them. 2. We need to use the terms consistently within the document.
Don Sawyer >> (All): And, I'm trying to see if we need to use the term!
RobertDowns >> (All): Perhaps we should refer to chain of custody
BruceAmbacher >> (All): I do not think any repository can do more than describe its sources, how they produce data, the past reliability of that data, and their commitment to carry that data forward - as received or as modified and documented
Katia Thomaz >> (All): nestor definition is quite the same as ISO 15489.
Mark Conrad >> (All): Chain of custody and authenticity are not synonymous and chain of custody in the archival world extends back beyond the point of ingest.
BruceAmbacher >> (All): You have a chain of custody for an "unauthentic" object.
Don Sawyer >> (All): I agree with Bruce, and that means there is NO chance of PROVING that the information is what it purports to be.
Katia Thomaz >> (All): but you can assess the provenance.
Mark Conrad >> (All): We can't prove the Constitution is authentic?
Don Sawyer >> (All): From most archives, one is lucky if one can get much provenance prior to receiving the information.
Mark Conrad >> (All): How are you defining archives? 
Mark Conrad >> (All): Provenance is one of the founding principles of archives.
Don Sawyer >> (All): Regarding the constitution, you can only give evidence - then it is a judgement.
Candida Fenton >> (All): I agree that PROVE is too strong
Don Sawyer >> (All): Mark, we are not dealing only with national archive, but also libraries, repositories, data centers.
Katia Thomaz >> (All): i agreed with demonstrate...
BruceAmbacher >> (All): Here is the divergence between "traditional archives" and other types of digital repositories in terms of their heritage, their practices, what they demand or tolerate from producers.
Mark Conrad >> (All): Don,
Don Sawyer >> (All): yes, and the Certification document needs to be applicable to all
Mark Conrad >> (All): Those other institutions are not archives in my understanding of that word.
BruceAmbacher >> (All): Agreed to both Don and Mark (that does not seem inconsistent to me)
Don Sawyer >> (All): Well, look at the TRAC document. It uses the term 'repository' for the reason you give, Mark. 
Mark Conrad >> (All): If we do not have a common understanding of terms can we eliminate the terms from the document?
Katia Thomaz >> (All): i think it's possible only using authenticate
Don Sawyer >> (All): Assuming that is not a tongue in cheek comment, that is really my point.  If a term is too  controversial, then we should avoid it and/or define a new term.
Katia Thomaz >> (All): sorry, authentication.
BruceAmbacher >> (All): I think we need to start from the strongest tradition, use its definitions, and create/explain only where we can't accept pre-existing terms.  Authenticate is very different from authenticity
Mark Conrad >> (All): Authenticate has at least two different meanings in the TRAC document.
BruceAmbacher >> (All): That's authenticity
Katia Thomaz >> (All): but the repository will authenticate the material
BruceAmbacher >> (All): NO NO NO
Mark Conrad >> (All): What does authenticate the material mean to you Katia?
Katia Thomaz >> (All): Authentication is the act of establishing or confirming something or someone as authentic, that is, that claims made by or about the thing are true. Authenticating an object may mean confirming its provenance, whereas authenticating a person often consists of verifying their identity.
BruceAmbacher >> (All): It can only describe what it received and the process whereby it received it.  Proving provenance or identity still does not make information authentic or accurate.
Mark Conrad >> (All): Katia, I believe that Don and Bruce are saying that a repository may not necessarily authenticate what it receives.
Katia Thomaz >> (All): sorry, i didn't understand the point.
BruceAmbacher >> (All): A repository can match the data to the metadata and documentation but it can not prove the accuracy of any of the data beyond that
Don Sawyer >> (All): Katia, yes, that is what I'm saying.  Beyond verifying the source of the information, and a statement as to what the information is, often that is as far as many archives/repositories can go.
Katia Thomaz >> (All): but it must analyse the evidence...
Mark Conrad >> (All): Not all repositories do carry out such an analysis.
HRT >> (All): Don't we get trust from the fact that good and extensive documentation is kept?
Don Sawyer >> (All): Katia, sure, if there seems to be some discrepancy in what is received, that would be addressed.
Don Sawyer >> (All): And yes, some archives/repositories do not even do that.
Don Sawyer >> (All): It is a different matter as to how the archive/repository handles what it received.  That, I believe, is the whole point of the certification process.
BruceAmbacher >> (All): HRT you are assuming the creator is honest.  That comes with the extent of your relationship with that creator
BruceAmbacher >> (All): Don, and the lesser repositories may not be able to be certified because they do not validate their data - or they may if that is what their SOP calls for and their user community accepts.
Mark Conrad >> (All): Don, If we are going to eliminate the loaded terms from this document, how would you suggest that we proceed with providing alternative text?
Don Sawyer >> (All): I think this gets into what is mandatory and how archive/repositories will get scored.
Don Sawyer >> (All): Mark, I think we need to look at where we find loaded terms and see if we can reach agreement on what is intended.  Then we can either eliminate the term or define a new term to have the meaning we need. 
HRT >> (All): I want to feel confident that the repository is keeping materials the way it received them to the extent possible and is documenting changes it makes
Don Sawyer >> (All): HRT - agreed!
Katia Thomaz >> (All): of course.
RobertDowns >> (All): I agree as well
HRT >> (All): There are no sureties here - that's where the notion of level of risk comes in
JohnGarett >> (All): and I
BruceAmbacher >> (All): or use an acceptable term supported by a community of practice
Mark Conrad >> (All): So does that mean we need to go back through the document and remove the loaded terms where we find them? I was looking for a suggestion for the process of removing these terms.
Don Sawyer >> (All): Bruce - yes, when it fits our understanding of what we want to say.
Mark Conrad >> (All): If you use one community's definition of a term you will confuse other communities.
HRT >> (All): Are we talking about the original document or the new one we have created?
BruceAmbacher >> (All): But inventing a new definition confuses all
Mark Conrad >> (All): The new one.
Mark Conrad >> (All): Then don't create a new definition. Be explicit with what you mean at that point in the text.
HRT >> (All): Is there a version of the new one in which the changes are incorporated?
JohnGarett >> (All): Yes I think you go through and eliminate use of loaded terms at least ones that many of us think are loaded and text does not convey what we want when loaded terms are used
Katia Thomaz >> (All): can we  get definitions from other areas?
Don Sawyer >> (All): Perhaps this started with 'authenticity'.  Why not start here and look at its use and what we think was meant in TRAC.  Bruce and David and I participated in TRAC, so we should be able to get some consensus on that. Then, is that consensus what we still think is needed?  Do we need to do a rewrite?  This is my proposed approach.
cclrc >> (All): Don, that looks sensible to me.  Mark has already isolated the occurrences of the word.
JohnGarett >> (All): I think we use definitions from other areas if the definitions work for us.  We should always try to adopt first, then adapt an existing one, and finally create a new definition or term if nothing useful exists.
BruceAmbacher >> (All): Perhaps Bruce, Don and Mark could meet in the very near future to search for consensus?
Don Sawyer >> (All): I can take a stab at expanding on what I think was meant and Bruce and David can critique my understanding.
cclrc >> (All): I wonder if the edited doc should have an introductory discussion of authenticity - even if the word is not actually used in the end.
RobertDowns >> (All): Sounds like a plan
cclrc >> (All): Yes, sounds good to me.
H >> (All): What is the role of examples in all this?
Katia Thomaz >> (All): ok for me too.
JohnGarett >> (All): I think the most useful tack would be to decide what we want to say now and decide if the current text does that.  If not, suggest an update.
Mark Conrad >> (All): There are two different proposals on the table which one are you all agreeing with?
Don Sawyer >> (All): We might want to have a discussion on authenticity.  I also want to review the KB material that Barbara pointed out.
H >> (All): What is the role of examples in all this?
BruceAmbacher >> (All): Examples can become dated, should be in an annex but they can play a role
BruceAmbacher >> (All): examples have a role
cclrc >> (All): Examples can reassure different communities that they have been considered.
BruceAmbacher >> (All): Don, don't forget the crosswalk of the documents
Don Sawyer >> (All): Is it agreed that i should try to expand on what TRAC meant by 'authenticity' or related terms?
Mark Conrad >> (All): Why do you want to have a discussion of authenticity in the document? That is one of the most loaded/overused/undefined terms out there.
BruceAmbacher >> (All): I must sign off
cclrc >> : Sure ... but the trend of our discussion seems to be to eliminate the use of the word in our doc, whereas other somewhat related docs do use it, so perhaps we ought to state our position on it, and explain why we are not using it.  Maybe I'm overcomplicating things ...
Mark Conrad >> (All): Do we have action items for next time?
Katia Thomaz >> (All): i must go now. i ask for a new "availability for meetings" table. bye to everyone and have a nice week.
cclrc >> (All): I noted an action on Don.
HRT >> (All): I will be at ASIST next week but hope to join you better than this week!
Mark Conrad >> (All): Simon will you capture the chat?
cclrc >> (All): Yes, certainly.
Mark Conrad >> (All): Ok. Thank you. See you all next week.
cclrc >> (All): I think that's it for today ... bye for now.

-- SimonLambert - 15 Oct 2007

Topic revision: r2 - 2008-02-13 - KatiaThomaz