DASFAA 2003 Tutorial

Managing Semi/Unstructured Data

Mukesh Mohania
IBM India Research Lab
mkmukesh@in.ibm.com

Outline
- Unstructured, XML and Semi-structured Data
- Techniques for storing XML/Semi-structured data
- XML Query Over Relational Data
- Streaming Data (semi-structured) Management
- Active Integration of Information
- Semantic Web Applications
- Content Manager Architecture

Unstructured Information
- On-line business information is unstructured, mainly text.
- 80% of content is unstructured.
- Static content: word processor documents, HTML files, emails, text files, and many more
- Dynamic content: extracted from underlying databases
- Anything on the web (static or dynamic)

Properties of Data on Web
- Web data cannot be constrained by a type or schema.
- It has an irregular structure and is deeply nested.
- Its structure keeps evolving.
- Web data is highly distributed and linked.
- Data with these properties is called semi-structured data.

XML: eXtensible Markup Language
- World Wide Web Consortium (W3C) standard to complement HTML
  - HTML: Text + Presentation (no data)
  - XML: Data + Structure (describes contents)
- Two modes
  - Well formed XML: schema-less, semi-structured data, user-defined tags, self-describing data
  - Valid XML: contains a DTD specifying the tags and the grammar of the document; not completely schema-less
- Used for data exchange, transformation, and integration; a bridge for data exchange on the web
- XML Standards: Schema (XML Schema), XSL, RDF, XPath, XQuery and others

XML Example

Tree for XML Data

Semi-structured Data
- Schema-less and self-describing; the schema is attached to the data itself
- The schema is defined before or after the data and may not be enforced; it may be extracted from the data or from queries (like type inference in programming languages)
- Origins
  - Integration of heterogeneous sources (Web + DB + ... = ?)
  - Data sources with non-rigid structure (biological data)
  - Web data

Schema

Semi-structured Data Model

Techniques for Storing XML
- Why new storage techniques?
- To support the characteristics of XML data and queries
  - Optional elements, repetition of tags, ordering, mixed content (structured data embedded in large text fragments), etc.
  - Document order and structure, full text search, transformation

Techniques for storing XML
- Store the entire document as a file in a file system or as a BLOB in an RDBMS (flat streams)
  - Fast store/retrieve of whole documents or big contiguous parts of documents
  - Access to the document's structure through parsing
- Using existing models
  - Mapping from XML graph/tree into relational, OO, LDAP directories
  - Takes advantage of indexing, recovery, transactions, updates, query optimization, security, etc.
  - No support for mixed content
  - XML document recovery is expensive!
  - Introduces additional layers in the DBMS and is therefore slower
- Mixed (both files and relational tables)... but redundant
- Native XML data model
  - Logical data model is XML
  - Physical storage features designed for XML

Mapping into Relational Model
- Edge Relation: store all edges in one table and scalar values in another table
- Schema-driven
  - Mapping from schema constructs to relational
  - Fixed mapping from DTD to relational schema
  - Flexible mapping from XML Schema to relational
- Universal Relation: full outer join, but redundancy
- Captures node identity & document order
- Element reconstruction requires multiple joins
- Does not use DTD or XML Schema

Edge Relation Example

Schema Driven Mapping
- Repetition: separate tables
- Non-repeated sub-elements may be "inlined"
- Optionality: nullable fields
- Choice: multiple tables or universal table
- Order: explicit ordinal value
- Mixed content ignored
- Element reconstruction may require multi-table joins because of normalization

LDAP Example

Native XML Storage
- Verbatim files
  - Appropriate for small documents, grep-style querying
- Natix (University of Mannheim, Germany)
  - Hybrid: verbatim files + page-level storage
  - Semantically partitions large documents into subtrees based on tree structure
  - Stores each subtree in one record (unit of storage) that is atomic
  - Proxy nodes are used to connect subtrees in different records
  - Primitives for read/write/insert/delete of elements
  - Record size need not be statically configured; it can be a dynamic value that adapts to the size and structure of the document at runtime
  - Reconstruction of the original tree by replacing proxies with subtrees
  - Core of XML storage system
  - No explicit use of DTDs or XML Schema
  - Xyleme uses Natix as underlying storage manager
  - No query language support

Commercial Databases
- IBM DB2 XML Extender
  - Pure relational mapping: decomposition of XML and mapping into relational tables
  - Mixed content: CLOBs (Character Large Objects) + side tables for indexing structured data embedded in text
- Oracle 9i
  - Canonical mapping into user-defined object-relational tables
  - Stores XML documents in CLOBs
- MS SQL Server
  - Generic Edge technique with inlined scalar values
  - Text content modeled in CLOBs

XML Query Language: Requirements
- Expressive power
  - Should support all relational algebraic operators
  - Restructuring operations: reduction, merge, ...
- Formal semantics
  - Important for dealing with query transformation and optimization
- Output delivery mode
  - The output of a query should be (at least) in the same language as the input
- Query languages: XQuery, XML-QL, YATL, Lorel, WebSQL

XML Query Over Relational Data
- Most web data will continue to be stored in relational databases (more than 90%)
- Need some way to execute an XML query over relational data and then convert the results into XML data
- XPERANTO (IBM) allows existing relational data to be viewed and queried as XML.

Web Services Example

XPERANTO: High Level Architecture

XQGM
- Intermediate representation:
  - General enough to capture the semantics of a powerful language such as XQuery
  - Easy translation to SQL
- XQGM is based on DB2's QGM and XML Algebra
- XQGM consists of:
  - Operators
  - Functions (invoked inside operators)
- Functions capture manipulation of XML entities (elements, attributes, etc.)
  - XML construction functions
  - XML navigation functions

Data Stream
- A data stream is a sequence of data items X1, X2, ..., Xn arriving continuously from single or multiple sources, where random access to the data is not allowed.
- Data Stream Characteristics
  - Strongly regular: strongly periodic (including a zero time interval between two data items), only one type of data; a schema can be derived or the data conforms to a schema.
  - Weakly regular: weakly periodic (follows some time interval), mixed types of data that still follow an order; a schema can be derived.
  - Irregular: aperiodic, types of data unknown, no order; a schema cannot be derived.

DBMS vs. DSMS
- Traditional DBMS
  - Data stored in finite, persistent data sets
  - Assumes one-time queries against data
  - Focus on precise answers computed by stable query plans
- Data Stream Management System (DSMS)
  - Allows some or all of the data being managed to arrive as continuous, possibly very rapid, time-varying, ordered data streams
  - Queries may be continuous (not just one-time)
    - Evaluated continuously as stream data arrives
    - Answer updated over time
  - Key ingredient in executing queries is approximation
  - Main-memory computations
- DSMS = merely DBMS with enhanced support for triggers, temporal constructs, data rate management?

Weakly Regular or Irregular Data Streams: Issues
- Schema discovery and evolution
- Filtering data of interest to applications
- Unbounded memory requirements
- Materialization of views
- Approximate query answering
  - Techniques for data reduction and synopsis construction: random sampling, histograms, sliding windows, etc.
- Online processing
  - Many data stream applications need online processing
  - E.g., detecting denial-of-service attacks, detecting Service-Level Agreement violations, admission control and traffic policing, etc.
  - Offline processing is indeed appropriate for some applications
  - E.g., capacity planning, determining pricing plans

Active functionalities over streaming data
- Provides real-time functionality that is needed in several advanced applications.
  - Alert a doctor when the blood pressure of a patient goes below X, the heart beats less than Y and the ECG touches Z.
  - Sell all my INTC stock at the higher trading price if the price difference at any time between two exchanges is more than 2%.
  - Cancel my flight tomorrow if there is a terrorist attack in the region of flying.
- Events can be defined on compositions of data streams and can trigger pre-defined actions (notification and alert, database change, etc.)
- Context can be associated with the events
  - INTC was trading higher at NASDAQ at 9:32 AM since the CEO of INTC rang the opening bell.

Event Based (Active) Information Integration
- On-demand integration
- Dissemination of selective information
- Tuned to change in business processes
- Autonomic computing
- Major shift in industry

Architecture

Active Rules

Monitoring Events
- Many underlying operational systems do not have the capability to define triggers or publish events.
- Sometimes the owner does not want the operational systems to be touched, since they execute thousands of transactions and no change whatsoever is allowed in the application or anywhere else in these systems.
- The question is: how to monitor or sense changes (change detection) in the operational systems, which may trigger the flow of information across the underlying systems in order to integrate them?

Polling
- Design a set of queries that are executed periodically.
- Compare the results of each query with the previously materialized results of the same query.
- Find any change that occurred in the underlying operational system.
- If there is a change, determine whether it is related to a registered event.
- Issues
  - Materialization of previous results (up to what degree?)
  - Not all changes can be monitored by querying
  - Design of optimized queries for change detection
  - Frequency of querying

Semantic Web

Semantic Web
- Semantic Web: Data + Metadata + URI ...
  - Metadata: labeling and structuring information in a document
  - URI (Universal Resource Identifier): a universal and unique name for any resource
  - Provides intelligent content
- Issues
  - How to annotate documents? Building annotators for each vertical application?
  - Design and evolution of rich ontology
  - Categorize unstructured text
  - Automatically create tags based on the tags themselves
  - Personalization/Notifications/Alerts

Ontology
- An ontology is a specification of a conceptualization.
- Standardizes meaning, description, representation of involved concepts/terms/attributes
- Captures the semantics involved via domain characteristics, resulting in semantic metadata
- "Ontological commitment" forms the basis for knowledge sharing and reuse
- Examples: WordNet, Cyc, MeSH (Medical Subject Headings), UN/CEFACT (product classification)
- Ontology Languages
  - Ontology languages are semantic markup languages.
  - DAML: DARPA Agent Markup Language
  - OWL: Web Ontology Language is the successor of DAML + OIL (Ontology Inference Layer), currently developed by the W3C Web Ontology group and based on RDF ideas.
  - Open Directory Project (ODP): Classification/Taxonomy & Directory (www.dmoz.org)

Ontology Definition
- The body of the ontology consists of
  - Classes
  - Properties
  - Instances (for use in class definition)
- The main component of an ontology is a taxonomy (a class hierarchy)

Applications
- Designing a scrap book on the web
  - Topic based "copy and paste" of information in a logical order
  - Finding relationships between documents
- Making your own web world
  - Creation of a Web space abstraction
  - Classification of documents
  - Annotating these documents
- Report/History Generation
  - Monitoring the changes
  - Maintenance of web space abstraction

Managing Unstructured Data: IBM Content Manager (CM)
- Provides a formal mechanism for creation, maintenance and distribution of information (including unstructured content) within an enterprise
- Supports version control, lifecycle management, searching and taxonomy (hierarchical classification of content) of documents
- Efficient management of content and document routing capabilities (workflow)
- Supports a variety of new data types for text documents, static images, video clips, audio files, and many more.

Content: Issues
- Paper overwhelms the workspace
- No concurrent access; one user at a time
- Easy to lose or misfile
- Security is poor
- Hard to find folder / document when needed
- Hard to find digital assets to reuse them
- Video and audio don't fit in a folder
- Workstation footprint not enough to hold large video or voice files
- No table of contents for folders
- Can't use automated search
- Costs to manage and distribute files
- PC files are stored in disparate servers, copies made and filed
- Documents not immediately available, which leads to poor customer service
- Workflow means "pick up and move the folder"
- No cross-enterprise folder of your entire customer relationship
- If it's not electronic, can't access over web; can't do e-business
- Need ability to repurpose content (Web Publishing)
- Need common infrastructure for ECM (develop specific clients)

High Level Architecture of CM

References
- Phil Bohannon, Juliana Freire, Prasan Roy, Jérôme Siméon, From XML Schema to Relations: A Cost-based Approach to XML Storage, ICDE 2002
- Michael J. Carey, Jerry Kiernan, Jayavel Shanmugasundaram, Eugene J. Shekita, Subbu N. Subramanian, XPERANTO: Middleware for Publishing Object-Relational Data as XML Documents, VLDB 2000
- Daniela Florescu, Donald Kossmann, A Performance Evaluation of Alternative Mapping Schemes for Storing XML Data in a Relational Database, IEEE Data Eng. Bulletin 1999
- P.J. Marron, G. Lausen, On Processing XML in LDAP, VLDB 2001
- Carl-Christian Kanne, Guido Moerkotte, Efficient Storage of XML Data, Technical Report 8/99, University of Mannheim, 1999
- Feng Tian, David J. DeWitt, Jianjun Chen, and Chun Zhang, The Design and Performance Evaluation of Various XML Storage Strategies, Technical report, University of Wisconsin
- W3C, XML representation of a relational database, http://www.w3.org/XML/RDB.html
- W3C Recommendation, Extensible Markup Language (XML) 1.0 (Second Edition), http://www.w3.org/TR/REC-xml
- Sihem Amer-Yahia and Mary Fernandez, Techniques for Storing XML, ICDE tutorial, 2002

References (contd...)
- Carl-Christian Kanne, Natix: A Native XML Base Management System, Ph.D. Thesis, University of Mannheim, Germany, 2002
- A. Bonifati and S. Ceri, Comparative analysis of five XML query languages, SIGMOD Record, March 2000
- Grégory Cobéna, Serge Abiteboul and Amélie Marian, Detecting Changes in XML Documents, ICDE 2002
- Sourav Bhowmick, Sanjay Kumar Madria, Wee Keong Ng, Ee-Peng Lim, Detecting and Representing Relevant Web Deltas using Web Join, ICDCS 2000
- B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom, Models and Issues in Data Stream Systems, PODS 2002

XML Example
Well formed XML
  <?xml version="1.0" standalone="yes"?>
  <Here-is-my-tag>
    <another-my-tag> ... </another-my-tag>
  </Here-is-my-tag>
Valid XML
  <?xml version="1.0"?>
  <!DOCTYPE BIBLIO [
    <!ELEMENT BIBLIO (BOOK*, PAPER*)>
    <!ELEMENT BOOK (Author+, Year, Title)>
    <!ELEMENT PAPER (Author+, Year, Title, Source)>
    <!ELEMENT Author (#PCDATA)>
    ...
  ]>

Tree for XML Data
[Figure: tree for a bibliography document, rooted at biblio with book, book and paper children; elements are ordered (except attributes).]
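
The Edge Relation mapping listed under Mapping into Relational Model (all edges in one table, scalar values in another) can be sketched for a small bibliography document. This is a minimal sketch under stated assumptions, not the method of any particular system: the sample document, the `&n` node identifiers assigned in document order, the table names, and the `shred` and `load` functions are all hypothetical. The column names follow the deck's edge table (Source, Ordinal, Tag, Flag, Target) and value table (Node, Value).

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document (an assumption, not taken from the slides).
DOC = """<biblio>
  <book><author>Widom</author><author>Ullman</author><year>1994</year><title>Database Systems</title></book>
  <paper><author>Ullman</author><year>1980</year><title>Database Systems</title></paper>
</biblio>"""


def shred(xml_text):
    """Return (edges, values) for an XML document.

    edges:  (source, ordinal, tag, flag, target) rows, one per element
    values: (node, value) rows, one per leaf element
    """
    root = ET.fromstring(xml_text)
    edges, values = [], []
    counter = 0  # last node id handed out; &0 is a virtual document root

    def visit(elem, source, ordinal):
        nonlocal counter
        counter += 1
        target = f"&{counter}"
        children = list(elem)
        # Inner elements point to other nodes; leaves carry a string value.
        flag = "ref" if children else "string"
        edges.append((source, ordinal, elem.tag, flag, target))
        if not children:
            values.append((target, (elem.text or "").strip()))
        # The ordinal records document order among siblings.
        for position, child in enumerate(children, start=1):
            visit(child, target, position)

    visit(root, "&0", 1)
    return edges, values


def load(edges, values):
    """Store both relations in an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE edge (source TEXT, ordinal INTEGER, tag TEXT, flag TEXT, target TEXT)")
    db.execute("CREATE TABLE leaf_value (node TEXT, value TEXT)")
    db.executemany("INSERT INTO edge VALUES (?, ?, ?, ?, ?)", edges)
    db.executemany("INSERT INTO leaf_value VALUES (?, ?)", values)
    return db


edges, values = shred(DOC)
db = load(edges, values)

# Reconstructing even one field needs joins: authors of the first book, in document order.
authors = [row[0] for row in db.execute(
    """SELECT v.value
       FROM edge book
       JOIN edge a ON a.source = book.target AND a.tag = 'author'
       JOIN leaf_value v ON v.node = a.target
       WHERE book.tag = 'book' AND book.ordinal = 1
       ORDER BY a.ordinal""")]
print(authors)
```

The final query illustrates why the deck notes that element reconstruction requires multiple joins: every step down the tree is another self-join on the edge table.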

Schema
The need for schema
- Optimize query processing
- Facilitate integration of multiple data sources
- Improve storage
- Construct indexes
- Describe contents of database to improve browsing and query formulation
- Forbid certain types of updates
A Bad Example:
As of April 1, 3 of 12 major banks of Japan (Dai-ichi Kangyo, Fuji and Industrial banks) were merged into the world's biggest bank, called Mizuho Bank Ltd, ... database integration conflicts caused six days of chaos involving more than 30,000 transaction errors and more than 2.5 million delayed debits ... (ATM) transaction errors.
Source: Computerworld Inc., by Kuriko Miyake, IDG News Service, April 08, 2002.

Semi-structured Data Model
[Figure: object graph with object identifiers &o1 to &o4; a biblio root with book, book and paper children and author, year and title edges.]
[Figure caption: Example: Object Exchange Model; unordered elements.]

Edge Relation Example

Edge table
Source  Ordinal  Tag     Flag    Target
&0      1        biblio  ref     &1
&1      1        book    ref     &2
&1      2        book    ref     &3
&1      3        paper   ref     &4
&2      1        author  string  &5
...

Value table
Node  Value
&5    Widom
&6    Ullman

LDAP Example

XMLElement OC {
  SUBCLASS OF {XMLNode}
  MUST CONTAIN {order}
  MAY CONTAIN {value}
  TYPE order INTEGER
  TYPE value STRING
}

XMLAttribute OC {
  SUBCLASS OF {XMLNode}
  MUST CONTAIN {value}
  TYPE value STRING
}

Example entries:

oc: XMLElement
oid: 1
name: Book
order: 1

oc: XMLAttribute
oid: 1.1
name: Author
value: Widom

oc: XMLAttribute
oid: 1.2
name: Author
value: Ullman

- Tailored to evolving schema
- Captures node identity & document order

Web Services Example
[Figure labels: Buyer; Supplier; Internet; XQuery over Catalog; XQuery; Application Code: Convert XQuery to SQL Query; SQL Query; SQL Result; Application Code: Convert Relational Data to XML; XQuery Result; Supplier provides an XML View of its Data.]

XPERANTO: High Level Architecture
[Figure labels: XQuery; XPERANTO Query Engine (XQuery Parser, Query Rewrite & View Composition, Computation Pushdown); XQGM; SQL Query; RDBMS; Tuples; Tagger Graph; Tagger Runtime; Query Result.]
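
The round trip described for XML queries over relational data (push the selection down to SQL, then tag the relational result as XML) can be illustrated with a small sketch. It is not XPERANTO's implementation and it does not parse XQuery: the `catalog` table, its columns, and the `make_db` and `publish_catalog` functions are hypothetical, and the query predicate is hard-coded as a SQL WHERE clause purely to show the pushdown idea.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_db():
    """Build a hypothetical supplier catalog stored relationally."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE catalog (item_id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    db.executemany(
        "INSERT INTO catalog VALUES (?, ?, ?)",
        [(1, "bolt", 0.25), (2, "gear", 12.5), (3, "motor", 99.0)],
    )
    return db


def publish_catalog(db, max_price):
    """Answer 'all items with price <= max_price' as an XML string.

    The selection runs inside the database; only the tagging of the
    resulting tuples happens in application code.
    """
    rows = db.execute(
        "SELECT item_id, name, price FROM catalog WHERE price <= ? ORDER BY item_id",
        (max_price,),
    )
    root = ET.Element("catalog")
    for item_id, name, price in rows:
        item = ET.SubElement(root, "item", id=str(item_id))
        ET.SubElement(item, "name").text = name
        ET.SubElement(item, "price").text = f"{price:.2f}"
    return ET.tostring(root, encoding="unicode")


xml_result = publish_catalog(make_db(), max_price=20)
print(xml_result)
```

Keeping the filter in SQL means the database's optimizer and indexes do the work, which is the motivation the deck gives for leaving web data in relational stores.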

Architecture
[Figure labels: Data Sources (DB, Data Stream, Web, MDB); Adaptor; Monitor; Business Logic/Process; Feedback; Active Functionalities.]
- Products: Crossworlds, WMQI, MQWF, BEA WebLogic Integrator, MS BizTalk, Web Methods Enterprise
- These products solve some aspects of event based integration of applications/data.

Active Rules
An active rule is composed of three components:
- Event (E): Monitor - Detect - Evaluate
- Condition (C): Derive - Analyze - Evaluate
- Action (A): Collaborate - Integrate - Effect

Semantic Web
- Importance:
  - Effective use of web information
  - To make information context sensitive
  - Derive new information or topic based history
  - Support new services for e-business, e-gov etc.
- "Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
  Source: Tim Berners-Lee, James Hendler and Ora Lassila, "Semantic Web", Scientific American, May 2001
- Semantics
  - Meaning or relationship of meanings, or relating to meaning (Webster)
  - Is concerned with the relationship between the linguistic symbols and their meaning or real-world objects
  - Meaning and use of data (Information System)
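
The event-condition-action structure of an active rule, driven by the polling approach to change detection (re-run a query and compare it with the previously materialized result), can be sketched as follows. All names here (`ActiveRule`, `Poller`, the `price_changed` event, the in-memory price snapshot) are hypothetical and chosen only for illustration; a real deployment would poll an operational database on a timer rather than call `poll_once` by hand.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ActiveRule:
    event: str                          # E: name of the monitored event
    condition: Callable[[dict], bool]   # C: evaluated on the detected change
    action: Callable[[dict], None]      # A: executed when the condition holds


class Poller:
    """Detects changes by comparing each query result with the previous one."""

    def __init__(self, query: Callable[[], Dict[str, float]], event_name: str):
        self.query = query
        self.event_name = event_name
        self.previous = query()         # materialized previous result
        self.rules: List[ActiveRule] = []

    def register(self, rule: ActiveRule) -> None:
        self.rules.append(rule)

    def poll_once(self) -> None:
        current = self.query()
        # Only new and updated keys are detected; deletions are ignored here.
        for key, new in current.items():
            old = self.previous.get(key)
            if old != new:
                change = {"key": key, "old": old, "new": new}
                for rule in self.rules:
                    if rule.event == self.event_name and rule.condition(change):
                        rule.action(change)
        self.previous = current


prices = {"INTC": 20.0}
alerts: List[str] = []

poller = Poller(lambda: dict(prices), event_name="price_changed")
poller.register(ActiveRule(
    event="price_changed",
    condition=lambda c: c["old"] is not None and c["new"] < 0.98 * c["old"],
    action=lambda c: alerts.append(f"{c['key']} fell from {c['old']} to {c['new']}"),
))

prices["INTC"] = 19.9   # small move: change detected, condition not met
poller.poll_once()
prices["INTC"] = 19.0   # drop of more than 2%: the rule fires
poller.poll_once()
print(alerts)
```

The sketch also makes the deck's polling issues concrete: the whole previous result is kept in memory, and any change that occurs and reverts between two polls goes unnoticed.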

LB,|Y KIdիz[|&@e=Pf8B)*tYcaqb9ڲ32x.\ 'h$5@[wE5j{k Q("Ӡ-DF~ǃ.mޓhǶi'Tv;}J.R%n-Q[}d9:ˏDo@Y6nu$)}yAe +~~pC E@9xV+aaF#fcz\=m,H-T Md>i&q KaRëwd6H杩vp\D}\t$aข=ߍW\Cڝ=1nO1^1ђwƲёp\ż HCagH$ 5z!⍎L-,oKntd#VI cX\1& RDZ|I$5?8nDF<;RL;SC1&[secLZb̤XlݺF>mciF0rUF]ddpLA"g퍆 u֛QCyۭw55 TPoQ W{*@y/y^aac.L8 w,Y(9 xZoI7KiHð+aVAv^c50鸵v,U;ҖrX XGI1628H v`RvQX+.K&%V\y.D>dPj,FөanƈKjmމB~8g<8$Qö捎&BjT #>v 8ĘMǗJjQ݊9F@M8?oNw W^'M6w),Vs(<ޭFQlOm~eLqעCCxx F)^H[^ ov?'epJ{voj^wtꭏk͎RwFwEET{W γ!\oA٦^aDh4TU's>:ԎΣ61KL4:N6V+annS:XG~>rACڔ`V܄~˰uX-߹(_Wݴ@^}P};_;?ZಭӺؕ V/RC(14`/xZb5%~?GG7qt& 5w,xiK5D6En3oM3JjSZHQy#"?oN?oU n:A9np\&;qT C>U4={Hq hrF"v:Y5ӑP49٪OPd(7ud]U%(.)wJ+ݴ);'A/msx{IA{/`[\u8exf7sPsxKYY^3Yas|ҳmw{ o>UKۧLModE1p֝jlߺuF <^y8zcc4Rjط兀 -5=6,boɒitXs.J͹0Χqyז21rh[nDPqy)+΅ ƸR&ϖ-78%X[nBu q2}1`X/3Y"?(۪ePYbn!/ew)nsX[:a_>Zo#}?c{,.q旔WJg]I /5ik5t:m -5d Ł- JmԴ=ad@{BalOpth=fZ[8UXnOX$:=z;: :9c 0@ MS[fOk=!M `G( 蕞݀@`nWt4vwW"lϝnͳ8ON{T 9ÈӜiJ.dK Ή-դ-kgh,qbӐm;A2s{ӶL̫Tq+W6lRz |VG:<ǮYUR*qVS ˧%n6Άjˣ-*: -{O3t4;%2X6 鮯}Y#F>x}k}Kl}{ZGi tNy2c?c9CtPv Ǧeo~kzcm];wm X\{q Z8,|u%*lG}'n30XcK.G2b?>NPj*x+X4Tr(06csÜJ%tc87"4DCh@|'#>fEl'(.bQ a)UPNӊjOLVS@<0l'Q"=!:x,S>S^ȵILQ6x LB<w-¦4]YƟVGO֠UܥQ]B{æθߤc馽GȘӠ*|ODyL0{S朴vm.@Egmֽ }xcj/02>^m; q#.!?Ȝ `:5]Bm!?""noN:p+Uǟ#0WM $}l}.vhp3Kgq6`"a&< T50e~1${ 0t1r5a nJFΐ " BG(Õ[|# IL4{`~}r1bƸCs x75@c-[‚G#E5ĬUkaE&X1EF3{w{a/a?B{CO±O_z~zIW7wwG}^> d w@z:@ W!O\P3>`1*b VC2N%BЕ0 =i֫Pp .fBch@ aD(+k o'[H?Blu rkyy9C=52]$FZ3".fTIX IڄsM>-t::wpSXz5\sw7Kxiq. Ng]p4^Ly{KׯyE|P''=x=?'W)u HI4Z'H;v쀺: n"A›;|pUڞ 4CClF kT=+A/& 'ObR2L{1J2 WI[o}"3Hpܱ>V0cC a:Az4HOI%һ$x0毿zNJ0 ԀQo^$x=z(N?`|/_ٷχ4Ƹo$xG~ ޕ1}S:MHǏx c?UIp$BSEL*u/\D)ISK_A 8hp[gg]_:=J+ NR=_be> 209-)g~76=5E:P+¿zzR 'S -?~ 7ȟɩU,PS>GVD/Vyh-+Y2V^C[X韔ܡ1I:30sU߱3f%'v n^wU6<ʬެ;YiXt!ެO9k.|CIjQtZ9̻VNM=okՑvS!Y%)^<Ɲ2 fN)/vEiL0pVIqk^Ybʌ"Q/ ;)⪊JdL4dT*fuΜy5.g?c,Oǧ@kO1 v `CN+-K qXDR3ZM6 Lb36Ts\V˴? ]QL.)Kzzh=0 =%鰅8!ME>/E" Cg Xp JiF)_U4f˝h뀖d-hi}}ii[wK=XƁhW/`>Wo'xĎ/ ٸ cӒD9_\*/@ƒY?Y[G[5&ʔGp5= y\ƶL;1lY5]ϖiLa0zȨðɷ/3h->*Af]6ۢ!,Cq4:u] -L29H AcytNZg'Hj@tgu? 8R=wvY_"yK"W H΋ tj? 
_YpuT7l:~ǚzKq1f7Xf)oeV"x55lT-ZSj9Lp).gʒ%/OQWE_NPxV`#~Ɨ+-۳Bvyكw8珣 ~9@ md$A+ ~ !`e*.WEpC,/ֽCp% No=rxЂCc[t,?FHrR7 8fZ0"G."ۻO y #k1`Y2!α'btOqij,$ ^ 71ZJMU>$a8 YKn>Y *ѣ#=98_vot;8[aq;|AH{CG7Qe߃`EH/s: !|M #|0s|z_s8_zgvz_uYӃVKԳF/uܺEr0l}ҋP2ęj~=$u{r?Ζ0ACWhjlnp/@بЪ46`;. ;ut}( ;   /WingdingsMonotype SortsSymbolArial Unicode MS LotusWP TypeDefault DesignLotus Freelance 9 Drawing Managing Semi/Unstructured DataOutlineUnstructured Information XML: eXtensible Markup Language XML ExampleTree for XML DataSemi-structured DataSchemaSemi-structured Data ModelTechniques for Storing XMLTechniques for storing XMLMapping into Relational ModelEdge Relation ExampleSchema Driven Mapping LDAP ExampleNative XML StorageCommercial Databases!XML Query Language: RequirementsXML Query Over Relational DataWeb Services Example"XPERANTO; High Level ArchitectureXQGM Data StreamDBMS vs. DSMS1Weakly Regular or Irregular Data Streams: Issues+Active functionalities over streaming data-Event Based (Active) Information Integration Architecture Active RulesMonitoring EventsPolling Semantic Web Semantic Web Ontology Ontology DefinitRoot EntrydO)/ҌPicturesNCurrent User=8SummaryInformation(LRoot EntrydO) )PicturesNCurrent User=SSummaryInformation(L     ;%:<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy{|}~  !#$%&'()*+,-./0123456789:;<>ion Applications5Managing Unstructured Data: IBM Content Manager (CM)Content: IssuesHigh Level Architecture of CM ReferencesReferences (contd)  Fonts Used Design TemplateEmbedded OLE Servers Slide Titles))_Toshiyuki AMAGASAToshiyuki AMAGASA.(2 N=IBM India Research Lab     .     ;(:<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy{|}~  !#$%&'()*+,-./0123456789:;<ion Applications5Managing Unstructured Data: IBM Content Manager (CM)Content: IssuesHigh Level Architecture of CM ReferencesReferences (contd)  Fonts Used Design TemplateEmbedded OLE Servers Slide Titles) _9bluechrybluechryiyuki AMAGASA.(2 N=IBM India Research Lab     . 
DASFAA 2003 Tutorial

Managing Semi/Unstructured Data
Mukesh Mohania
IBM India Research Lab
mkmukesh@in.ibm.com

Outline
- Unstructured, XML and Semi-structured Data
- Techniques for storing XML/Semi-structured data
- XML Query Over Relational Data
- Streaming Data (semi-structured) Management
- Active Integration of Information
- Semantic Web Applications
- Content Manager Architecture

Unstructured Information
- On-line business information is unstructured (mainly text). 80% of content is unstructured.
  - Static content: word processor documents, HTML files, emails, text files, and many more
  - Dynamic content: extracted from underlying databases
- Anything on the web (static or dynamic)
- Properties of Data on Web
  - Web data cannot be constrained by a type or schema.
  - It has irregular structure and is deeply nested.
  - Its structure keeps evolving.
  - Web data is highly distributed and linked.
- Data with these properties is called semi-structured data.
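
To make "irregular structure" and "deeply nested" concrete, here is a minimal sketch in Python. The three records and the helper name `label_paths` are assumptions made for illustration (the bibliographic values echo the deck's biblio example; the third record is invented). The sketch lists the label paths each record actually contains, which shows that no single fixed schema fits all of them.

```python
def label_paths(node, prefix=()):
    """Yield every root-to-node label path found in a nested dict/list value."""
    if isinstance(node, dict):
        for key, child in node.items():
            path = prefix + (key,)
            yield path
            yield from label_paths(child, path)
    elif isinstance(node, list):
        # A repeated element: each item contributes paths under the same label.
        for child in node:
            yield from label_paths(child, prefix)


# Hypothetical records with irregular structure (illustrative data only).
records = [
    {"title": "Database Systems", "author": ["Ullman", "Widom"], "year": 1994},
    {"title": "Database Systems", "author": "Ullman"},           # no year
    {"title": "Untitled report", "contact": {"email": "someone@example.org"}},
]

# Paths present in each record, and the paths that every record shares.
paths = [set(label_paths(r)) for r in records]
shared = set.intersection(*paths)

for p in paths:
    print(sorted(p))
print("shared by all records:", sorted(shared))
```

Only the `title` path is common to all three records; everything else is optional, repeated, or nested differently, which is the situation the slide describes.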

XML: eXtensible Markup Language
- World Wide Web Consortium (W3C) standard to complement HTML
  - HTML: Text + Presentation (no data)
  - XML: Data + Structure (describes contents)
- Two modes
  - Well-formed XML: schema-less, semi-structured data, user-defined tags, self-describing data
  - Valid XML: contains a DTD for tag specification and the grammar of the document; not completely schema-less
- Used for data exchange, transformation, and integration; a bridge for data exchange on the web
- XML Standards: Schema (XML Schema), XSL, RDF, XPath, XQuery and others

XML Example

Tree for XML Data

Semi-structured Data
- Schema-less and self-describing, but the schema is attached to the data itself
- Schema is defined before/after the data and may not be enforced; schema may be extracted from data or from queries (like type inference in PL)
- Origins
  - Integration of heterogeneous sources (Web + DB + ... = ?)
  - Data sources with non-rigid structure (biological data)
  - Web data

Schema

Semi-structured Data Model

Techniques for Storing XML
- Why new storage techniques?
- To support the characteristics of XML data and queries
  - Optional elements, repetition of tags, ordering, mixed content (structured data embedded in large text fragments), etc.
  - Document order and structure, full text search, transformation

Techniques for storing XML
- Store the entire document as a file in a file system or as a BLOB in an RDBMS (flat streams)
  - Fast storage/retrieval of whole documents or large contiguous parts of documents
  - Access to the document's structure through parsing
- Using existing models
  - Mapping from XML graph/tree into relational, OO, or LDAP directories
  - Takes advantage of indexing, recovery, transactions, updates, query optimization, security, etc.
  - No support for mixed content
  - XML document recovery is expensive!
  - Introduces additional layers in DBMS, therefore slower
- Mixed (both files and relational tables) ... but redundant
- Native XML data model
  - Logical data model is XML
  - Physical storage features designed for XML

Mapping into Relational Model
- Edge Relation: Store all edges in one table and scalar values in another table
- Schema-driven
  - Mapping from schema constructs to relational
  - Fixed mapping from DTD to relational schema
  - Flexible mapping from XML Schema to relational
- Universal Relation: Full outer join, but redundancy
- Captures node identity & document order
- Element reconstruction requires multiple joins
- Does not use DTD or XML schema

Edge Relation Example

Schema Driven Mapping
- Repetition: separate tables
- Non-repeated sub-elements may be inlined
- Optionality: nullable fields
- Choice: multiple tables or universal table
- Order: explicit ordinal value
- Mixed content ignored
- Element reconstruction may require multi-table joins because of normalization

LDAP Example

Native XML Storage
- Verbatim files
  - Appropriate for small documents, grep-style querying
- Natix (University of Mannheim, Germany)
  - Hybrid: verbatim files + page-level storage
  - Semantically partition large document into subtrees based on tree structure
  - Store each subtree in one record (unit of storage) that is atomic
  - Proxy nodes are used to connect subtrees in different records
  - Primitives for read/write/insert/delete of element
  - Record size need not be statically configured; it can be a dynamic value, adapting to the size and structure of the document at runtime
  - Reconstruction of original tree by replacing proxies by subtrees
  - Core of XML storage system
  - No explicit use of DTDs or XML schema
- Xyleme uses Natix as underlying storage manager
- No query language support

Commercial Databases
- IBM DB2 XML Extender
  - Pure relational mapping
  - Decomposition of XML and mapping into relational tables
  - Mixed content: CLOBs (Character Large Objects) + side tables for indexing structured data
    embedded in text
- Oracle 9i
  - Canonical mapping into user-defined object-relational tables
  - Stores XML documents in CLOBs
- MS SQL Server
  - Generic Edge technique with inlined scalar values
  - Text content modeled in CLOBs

XML Query Language: Requirements
- Expressive power
  - Should support all relational algebraic operators
  - Restructuring operations: reduction, merge, ...
- Formal semantics
  - Important for dealing with query transformation and optimization
- Output delivery mode
  - The output of a query should be (at least) in the same language as the input
- Query languages: XQuery, XML-QL, YATL, Lorel, WebSQL

XML Query Over Relational Data
- Most web data will continue to be stored in relational databases (more than 90%)
- Need some way to execute XML queries over relational data and then convert the results into XML data
- XPERANTO (IBM) allows existing relational data to be viewed and queried as XML.

Web Services Example

XPERANTO: High Level Architecture

XQGM
- Intermediate representation: general enough to capture the semantics of a powerful language such as XQuery
- Easy translation to SQL
- XQGM is based on DB2's QGM and XML Algebra
- XQGM consists of:
  - Operators
  - Functions (invoked inside operators)
- Functions capture manipulation of XML entities (elements, attributes, etc.)
  - XML construction functions
  - XML navigation functions

Data Stream
- A data stream is a sequence of data items X1, X2, ..., Xn arriving continuously from one or more sources, where random access to the data is not allowed.
- Data Stream Characteristics
  - Strongly regular: strongly periodic (including a zero time interval between two data items), only one type of data, schema can be derived or the data conforms to a schema.
  - Weakly regular: weakly periodic (follows some time interval), mixed types of data that still follow an order, schema can be derived.
  - Irregular: aperiodic, types of data unknown, no order, schema cannot be derived.

DBMS vs. DSMS
- Traditional DBMS
  - Data stored in finite, persistent data sets
  - Assumes one-time queries against data
  - Focus on precise answers computed by stable query plans
- Data Stream Management System (DSMS)
  - Allows some or all of the data being managed to arrive as continuous, possibly very rapid, time-varying, ordered data streams
  - Queries may be continuous (not just one-time)
    - Evaluated continuously as stream data arrives
    - Answer updated over time
  - Key ingredient in executing queries is approximation
  - Main memory computations
- DSMS = merely DBMS with enhanced support for triggers, temporal constructs, data rate management?

Weakly Regular or Irregular Data Streams: Issues
- Schema discovery and evolution
- Filtering data of interest to applications
- Unbounded memory requirements
- Materialization of views
- Approximate query answering
  - Techniques for data reduction and synopsis construction: random sampling, histograms, sliding windows, etc.
- Online processing
  - Many data stream applications need online processing
  - E.g., detecting denial-of-service attacks, detecting Service-Level Agreement violations, admission control and traffic policing, etc.
- Offline processing is indeed appropriate for some applications
  - E.g., capacity planning, determining pricing plans

Active functionalities over streaming data
- Provides real-time functionality that is needed in several advanced applications.
  - Alert a doctor when the blood pressure of a patient goes below X, the heart rate is less than Y, and the ECG touches Z.
  - Sell all my INTC stock on the exchange with the higher trading price if the price difference between two exchanges exceeds 2% at any time.
  - Cancel my flight tomorrow if there is a terrorist attack in the region of flying.
- Events can be defined on compositions of data streams that can trigger some pre-defined actions (notification and alert, database change, etc.)
- Context can be associated with the events
  - INTC was trading higher at NASDAQ at 9:32 AM since the CEO of INTC rang the opening bell.

Event Based (Active) Information Integration
- On-demand integration
- Dissemination of selective information
- Tuned to change in business processes
- Autonomic computing
- Major shift in industry

Architecture

Active Rules

Monitoring Events
- Many underlying operational systems do not have the capability of defining triggers or publishing events.
- Sometimes the owner does not want the operational systems to be touched, since they execute thousands of transactions and no change whatsoever is allowed in the application or anywhere else in these systems.
- The question is: how to monitor or sense changes (change detection) in the operational systems that should trigger the flow of information across the underlying systems to integrate them?

Polling
- Design a set of queries that are executed periodically.
- Compare the results of each query with the previously materialized results of the same query.
- Find any change that occurred in the underlying operational system.
- If there is any change, determine whether the change is related to a registered event.
- Issues
  - Materialization of previous results (up to what degree?)
  - Not all changes can be monitored by querying
  - Design of optimized queries for change detection
  - Frequency of querying

Semantic Web
- Semantic Web: Data + Metadata + URI ...
  - Metadata: labeling and structuring information in a document
  - URI (Universal Resource Identifier): a universal and unique name for any resource
- Provides intelligent content
- Issues
  - How to annotate documents? Building annotators for each vertical application?
  - Design and evolution of rich ontology
  - Categorize unstructured text
  - Automatically create tags based on the tags themselves
  - Personalization/Notifications/Alerts

Ontology
- An ontology is a specification of a conceptualization.
- Standardizes meaning, description, and representation of the involved concepts/terms/attributes
- Captures the semantics involved via domain characteristics, resulting in semantic metadata
- Ontological commitment forms the basis for knowledge sharing and reuse
- Examples: WordNet, Cyc, MeSH (Medical Subject Headings), UN/CEFACT (product classification)
- Ontology Languages
  - Ontology languages are semantic markup languages
  - DAML: DARPA Agent Markup Language
  - OWL: Web Ontology Language, the successor of DAML + OIL (Ontology Inference Layer), currently developed by the W3C Web Ontology group and based on RDF ideas
- Open Directory Project (ODP): Classification/Taxonomy & Directory (www.dmoz.org)

Ontology Definition
- The body of the ontology consists of
  - Classes
  - Properties
  - Instances (for use in class definition)
- The main component of an ontology is a taxonomy (a class hierarchy)

Applications
- Designing a scrap book on the web
  - Topic-based copy and paste of information in a logical order
  - Finding relationships between documents
- Making your own web world
  - Creation of a Web space abstraction
  - Classification of documents
  - Annotating these documents
  - Report/History generation
  - Monitoring the changes
  - Maintenance of the web space abstraction

Managing Unstructured Data: IBM Content Manager (CM)
- Provides a formal mechanism for creation, maintenance, and distribution of information (including unstructured content) within an enterprise
- Supports version control, lifecycle management, searching, and taxonomy (hierarchical classification of content) of documents
- Efficient management of content and document routing capabilities (workflow)
- Supports a variety of new data types for text documents, static images, video clips, audio files, and many more.
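
The "Ontology Definition" slide names three parts (classes, properties, instances) and calls the taxonomy the main component. A minimal sketch of that structure in Python follows. It is not DAML or OWL syntax; the class names (`Document`, `Publication`, `Book`, `Paper`), the property names, and the helper `instances_of` are all hypothetical choices for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Set


@dataclass
class OntologyClass:
    """A class in the taxonomy; `parent` links form the class hierarchy."""
    name: str
    parent: Optional["OntologyClass"] = None
    properties: Set[str] = field(default_factory=set)

    def ancestors(self) -> Iterator["OntologyClass"]:
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def is_a(self, other: "OntologyClass") -> bool:
        """True if this class is `other` or a descendant of it."""
        return self is other or any(a is other for a in self.ancestors())

    def all_properties(self) -> Set[str]:
        """Own properties plus those inherited along the hierarchy."""
        props = set(self.properties)
        for ancestor in self.ancestors():
            props |= ancestor.properties
        return props


@dataclass
class Instance:
    cls: OntologyClass
    values: Dict[str, object]


def instances_of(cls: OntologyClass, instances: List[Instance]) -> List[Instance]:
    """Instances whose class is `cls` or any subclass of it."""
    return [i for i in instances if i.cls.is_a(cls)]


# Hypothetical taxonomy: Document > Publication > {Book, Paper}
document = OntologyClass("Document", properties={"title"})
publication = OntologyClass("Publication", document, {"author", "year"})
book = OntologyClass("Book", publication)
paper = OntologyClass("Paper", publication)

# Illustrative instances only.
instances = [
    Instance(book, {"title": "Database Systems", "author": "Ullman"}),
    Instance(paper, {"title": "An example paper", "year": 1994}),
]

print(sorted(book.all_properties()))
print([i.values["title"] for i in instances_of(publication, instances)])
```

Keeping the hierarchy as parent links means a query for a general class also returns instances of its subclasses, which is the kind of classification the taxonomy is meant to support.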

Content: Issues
- Paper overwhelms the workspace
- No concurrent access; one user at a time
- Easy to lose or misfile
- Security is poor
- Hard to find a folder or document when needed
- Hard to find digital assets to reuse them
- Video and audio don't fit in a folder
- Workstation footprint not enough to hold large video or voice files
- No table of contents for folders
- Can't use automated search
- Costs to manage and distribute files
- PC files are stored on disparate servers, with copies made and filed
- Documents not immediately available, which leads to poor customer service
- Workflow means "pick up and move the folder"
- No cross-enterprise folder of your entire customer relationship
- If it's not electronic, it can't be accessed over the web; can't do e-business
- Need ability to repurpose content (Web publishing)
- Need common infrastructure for ECM (develop specific clients)

High Level Architecture of CM

References
- Phil Bohannon, Juliana Freire, Prasan Roy, Jérôme Siméon, From XML Schema to Relations: A Cost-Based Approach to XML Storage, ICDE 2002
- Michael J. Carey, Jerry Kiernan, Jayavel Shanmugasundaram, Eugene J. Shekita, Subbu N. Subramanian, XPERANTO: Middleware for Publishing Object-Relational Data as XML Documents, VLDB 2000
- Daniela Florescu, Donald Kossmann, A Performance Evaluation of Alternative Mapping Schemes for Storing XML Data in a Relational Database, IEEE Data Eng. Bulletin 1999
- P.J. Marron, G. Lausen, On Processing XML in LDAP, VLDB 2001
- Carl-Christian Kanne, Guido Moerkotte, Efficient Storage of XML Data, Technical Report 8/99, University of Mannheim, 1999
- Feng Tian, David J. DeWitt, Jianjun Chen, and Chun Zhang, The Design and Performance Evaluation of Various XML Storage Strategies, Technical report, University of Wisconsin
- W3C, XML representation of a relational database. http://www.w3.org/XML/RDB.html
- W3C Recommendation.
  Extensible Markup Language (XML) 1.0 (Second Edition). http://www.w3.org/TR/REC-xml
- Sihem Amer-Yahia and Mary Fernandez, Techniques for Storing XML, ICDE tutorial, 2002.

References (contd)
- Carl-Christian Kanne, Natix: A Native XML Base Management System, Ph.D. Thesis, University of Mannheim, Germany, 2002
- A. Bonifati and S. Ceri, Comparative analysis of five XML query languages, SIGMOD Record, March 2000.
- Grégory Cobéna, Serge Abiteboul, and Amélie Marian, Detecting Changes in XML Documents, ICDE 2002
- Sourav Bhowmick, Sanjay Kumar Madria, Wee Keong Ng, Ee-Peng Lim, Detecting and Representing Relevant Web Deltas using Web Join, ICDCS 2000
- B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom, Models and Issues in Data Stream Systems, PODS 2002

[Figure: Example: Object Exchange Model. A tree with unordered elements: a biblio root with book and paper children; leaf labels author, year, title; leaf values include Widom, Ullman, 1994, 1980, Database Systems.]

[Figure: Edge Relation Example. The same tree stored as an Edge table and a Value table; recoverable rows:]

Edge table
Source  Ordinal  Tag     Flag    Target
&0      1        biblio  ref     &1
&1      1        book    ref     &2
&1      2        book    ref     &3
&1      3        paper   ref     &4
&2      1        author  string  &5
...

Value table
Node  Value
&5    Widom
&6    Ullman