Date: Thu, 13 Feb 1997 13:40:15 -0400 (AST) Subject: Category Theory and Databases Date: Thu, 13 Feb 1997 15:12:56 UTC From: Jose I. Bravo I am very interested in the application of Category Theory in the area of Databases. Could anybody help me to obtain references on this subject? Thanks in advance. -------------------------------------------------------------------- Jose I. Bravo phone: +34 +22 603306 Dpt. Estadistica, I.O. y Computacion fax: +34 +22 635358 Universidad de La Laguna email: jibravo@ull.es Tenerife (Canary Islands) SPAIN -------------------------------------------------------------------- Date: Sun, 16 Feb 1997 15:42:32 -0400 (AST) Subject: RE: Category Theory and Databases Date: Sat, 15 Feb 1997 17:51:16 GMT From: MHEBERT@acs.auc.eun.eg The following (and the references in it) may be useful: Volger, H., The semantics of disjunctive deductive databases, LNCS 440, 409-421 Spinger, Berlin 1990. Volger, H., Initial and quasi-initial models of theories, Fund.Inform. 18(1993) 339-362. Michel Hebert Date: Mon, 17 Feb 1997 10:44:21 -0400 (AST) Subject: Re: Category Theory and Databases Date: Mon, 17 Feb 1997 12:07:31 +0000 (GMT) From: David Pym There is some work by John Cartmell. He applied some ideas from his thesis --- about contextual categories, generalized algebraic theories and dependent type theory. The thesis was Oxford, '78, and some of it appeared in the Ann. P. Appl. Logic 32 (1986) 209-243. Unfortunately, the only reference to the databases applications that I can recall just now is: Formalizing the network and hierarchical data models --- an application of categorical logic, LNCS 240, 1986. Thomas Streicher may perhaps know more ? David > Date: Thu, 13 Feb 1997 15:12:56 UTC > From: Jose I. Bravo > > I am very interested in the application of Category Theory in the > area of Databases. Could anybody help me to obtain references on > this subject? Date: Tue, 18 Feb 1997 16:01:43 -0400 (AST) Subject: Re: Category Theory and Databases Date: Tue, 18 Feb 1997 20:35:22 +0000 From: Zinovy Diskin > I am very interested in the application of Category Theory in the > area of Databases. Could anybody help me to obtain references on > this subject? > There are a few papers written by mathematicians, they are rather far from the hot DB problems. There are several papers written by DB theorists, CT is used there as a kind of decoration rather than essentially. There are a few works where CT is really employed for solving real DB problems. In summer of 1994 Boris Kadish and I wrote a declarative text called "Algberaic Graph-Oriented = Category-Theory-Based: Manifesto of categorizing database theory " (its last updated version can be found on ftp: //ftp.cs.chalmers.se/pub/users/diskin/mnfst4.* ), in which there is a list of about 30 titles on CT & DB. The manifesto and other our papers on CT & DB are oriented mainly on DB people but can be of some interest for CT people as well. The machinery we use is a generalization of Makkai's sketches. My last effort in the direction "CT for DB" is the paper "Variable set semantics for generalized sketches: Why ER is more object-oriented than OO" (on ftp: //ftp.cs.chalmers.se/pub/users/diskin/ERvsOO.* ) submitted to 'Data & Knowledge Engineering'. Zinovy Diskin Date: Tue, 18 Feb 1997 16:00:19 -0400 (AST) Subject: RE: Category Theory and Databases Date: Tue, 18 Feb 1997 11:49:29 GMT From: Khazem > From cat-dist@mailserv.mta.ca Sun Feb 16 20:42 WET 1997 > Date: Sun, 16 Feb 1997 15:42:08 -0400 (AST) > From: categories > To: categories > Subject: RE: Category Theory and Databases > Mime-Version: 1.0 > > Date: Sat, 15 Feb 1997 17:51:16 GMT > From: MHEBERT@acs.auc.eun.eg > > The following (and the references in it) may be useful: > Volger, H., The semantics of disjunctive deductive databases, LNCS 440, 409-421 > Spinger, Berlin 1990. > Volger, H., Initial and quasi-initial models of theories, Fund.Inform. 18(1993) > 339-362. > Michel Hebert > > A list of papers about category theory and database: I- Papers by Peter Buneman, Val Tannen and others in Univ. of Pennsylvania. Using theory of monads for query languages. see references in their paper: Principle of Programming with Complex objects and Collection Types. TCS 149:3-48, 1995. II- Paper by S.K. LELLAHI, Nicolas Spyratos: Using sketch for modelling data 1- Towards a categorical Data Model supporting structured Object and Inheritance. LCNS NO 504, pp 86-105, 1991 2- Categorical Modelling of database concepts. Technical report Series, FIDE/92/ 38, University of Glasgow 1992 3- Deduction over graphs under constraint: A soundness and completness theorem. Diagramme vol 39, 1993 (edited by: Univ Paris 7, France). 4- Research reports, NOs 619, 410, 908 LRI, Univ. Paris 11 (Orsay), France. 5- Research reports, NO, 93-05 An Algebraic Semantics for Data Modelling under constraints LIPN, Univ. Paris 13 Villetaneuse), France. III- Paper by S.K. LELLAHI, Monads and query languages 1- Towards a Characterization of Bulk types, Research reports, NOs , LIPN, Univ. Paris 13 Villetaneuse), France. 2- Type de collection et Monades. Acte des journees Categories, Algebres, Esquisses et Neo_esquisses, 1994, Univ. de CAEN, France IV- Others 1- E. Lippe & al a) A Category theory approach to coceptual data Modelling, Theorical information and Application, vol 30, nO 1, 1966, pp 31_97. b) Conceptual data modelling from a Categorical Perspective. The Computer Journal, Vol 39, No 3 1966. ------------------------------------------------------- Kazem LELLAHI Date: Thu, 20 Feb 1997 14:13:19 -0400 (AST) Subject: Re: Category Theory and Databases Date: Wed, 19 Feb 1997 19:01:48 +0000 From: Nick Rossiter At Newcastle University, we have been studying the application of category theory to a number of information system problems, including the formalization of object-relational databases. Further information, including citations to some publications, can be found at: http://www.cs.ncl.ac.uk/people/b.n.rossiter/home.informal/ Date: Fri, 21 Feb 1997 12:27:19 -0400 (AST) Subject: Re: Category Theory and Databases Date: Fri, 21 Feb 1997 11:44:35 +0100 From: Frank Piessens In Leuven, we have studied semantic data-specifications from a categorical perspective. We have studied sketches as a formalism for making semantic data-specifications (somewhat similar to the approach taken by Diskin et al.), and have developed and proved correct a number of interesting algorithms to decide semantical equivalence of such specifications. The following publications describe our work: Frank Piessens, Eric Steegmans. ``Categorical data-specifications'', {\em Theory and Applications of Categories}, Vol. 1, No. 8, 1995, pp. 156--173. Frank Piessens, Eric Steegmans. ``Proving semantical equivalence of data specifications'', accepted for publication in the special issue of {\em The Journal of Pure and Applied Algebra} dedicated to the celebration of Peter Freyd's 60th birthday. Frank Piessens. ``Semantic data specifications: an analysis based on a categorical formalization'', PhD thesis, Dept. of Computer Science, Katholieke Universiteit Leuven, 1996. Frank Piessens. Date: Thu, 27 Feb 1997 15:54:11 -0400 (AST) Subject: Re: Category Theory and Databases Date: Wed, 26 Feb 1997 16:12:56 +0000 From: Zinovy Diskin I'd like to add some general words to the recent exchange of references on CT \& DB, and apologize in advance for the long reply. Texts on CT \& DB can be roughly divided into two groups. In the first one there are papers (papers-I) where a certain categorical apparatus is applied to a particular DB theory problem, for example, monads to modeling bulk data types, or the pull-back treatment of relational join operation, or the functorial treatment of relational algebra via indexed Boolean algebras and so on. Papers-II are those which pursue a more fundamental goal of stating an integrated framework in which DB theory (or its major components) can be developed. It may sound strange but it seems that papers-II potentially can be much more useful in practice. Indeed, it is commonly recognized that the entire field of software engineering suffers from lack of suitable specification languages. A particular but remarkable example can be found in the concluding report to a recent high-level workshop (organized by an industry company!) where the abstract Wittgenstein's philosophy was invoked: \begin{quote} ... It is clear that the problem of identifying and resolving semantic heterogeneity in a heterogeneous database environment is far from solved, and that the problem itself is still at the stage of being understood. ... Wittgenstein said long ago that if an idea was worth discussing, then it was worth having a language to discuss it. ...Perhaps, one of the most difficult problems in developing integrated systems is in figuring out what schemas, data, and application code mean. \end{quote} In this context, what is required first of all is a general language which allows to specify basic DB theory concepts like database schema, database state, query etc in a way independent of any specific data model (rather than a single universal data model capturing all other data models). For a DB theorist it seems impossible to speak substantially about database schemas and database states without a specific description of what they are: it appears to be a kind of speaking about nothing. In contrast, for a category theorist such an abstract nonsense is a comfortable environment, the focus is to find a proper compromise between generality and applicability. In this respect papers-II existing today are not very successful: the majority of them are either too abstract to be useful in real DB problems or too fragmentary to state a proper foundation. A categorical idea which seems promising in this respect is that of Makkai's sketches, more exactly, their suitable algebraic generalization. In Makkai's presentation, generalized sketches is a framework for managing logic of diagram predicates, the focus is on their inference. In the DB language, this is nothing but derivation of constraints on data. However, the core of the DB technology is derivation of the very data, that is, in the CT language, the focus is in operations like pull-back, push-out etc. Of course, it is possible to consider, say, the pull-back property as a diagram predicate (as Makkai does) but a more natural and DB-useful view is to treat it as a diagram operation. Then complex generalized sketches which involve several marked diagrams must be considered as complex algebraic terms in the corresponding signature of diagram operations, and some parsing procedure is applicable. A notion of generalized sketch over a signature of diagram predicates and operations can be defined along these lines in parallel to the ordinary notion of logical theory over a signature of predicate and operation symbols. (It appears to be a truly graph-based logic in the spirit of Bagchi and Wells. A preliminary framework of basic definitions is described in the report "Databases as diagram algebras: specifying queries and views via the graph-based logic of sketches" (on ftp: //ftp.cs.chalmers.se/pub/users/diskin/TR9602/*.ps ). \bigskip I study DB for more than five years. It is a very lively, rich and mobile world which significantly influences the modern information technology and culture. However, in the mathematical perspective DB theory is seen as a small campus of the relational data model surrounded by a vast savage forest of various notational and terminological systems without any denotational semantics. In my wanderings in the forest, CT served as a reliable guide and, at last, I have composed the following CT-map of DB: 1) Database schemas (including ER-diagrams and the like) are generalized sketches in one or another signature; 2) Database states are sketch morphisms from schema sketches into topos-like model sketches built over domains of values and data objects; 3) Queries against databases are diagram operations, and query languages are monads over categories of generalized sketches; 4) Databases are Eilenberg-Moore algebras of these monads while views and refinements of database schemas are Kleisly morphisms. 5) Architecture schemas of distributed database systems are metasketches whose nodes are generalized sketches and arrows are their morphisms. I do not want to say that DB theory is an applied category theory: as any other advanced engineering discipline, DB is much richer than one or another formal theory used for modeling engineering concepts. Thus, it is more correct to read statements "A is B" above as "It is useful to view A as B" . Nevertheless, CT-formulation of DB-concepts turned out to be extremely useful in the theory and even practice of DB design. In a sense, relations between CT and software engineering remind those between calculus and mechanics engineering. Thanks for your attention, Zinovy Diskin