Ciao-users December 2001

ciao-users@cliplab.org

11 participants
17 discussions

Re: Problems with prolog_sys module in 1.7#162
by Paulo Moura 21 Dec '01

21 Dec '01

Hi Francisco! > >>> Can someone please explain me why the initial call to statistics/0 >>> throws an exception if the module is already loaded? > ... Thanks for your explanation. One more question if I may ;-) In version 17#115 I needed to include in my programs the following directive: :- use_module(library(hiord_rt)). but this seams to be broken or changed in the new version. I get errors like: error(existence_error(procedure,/(user:call,7)),/(user:call,7)) when running programs that make use of meta-calls. Reading the documentation, it seams that the hiord_rt predicates have moved to the engine? Thanks, Paulo ----------------------------------------------------------- Paulo Jorge Lopes de Moura Dep. of Informatics Office 4.3 Ext. 3257 University of Beira Interior Phone: +351 275319700 6201-001 Covilhã Fax: +351 275319732 Portugal mailto:pmoura(a)noe.ubi.pt http://www.ci.uc.pt/logtalk/pmoura.html -----------------------------------------------------------

2 1

Re: Problems with prolog_sys module in 1.7#162
by Francisco Bueno Carrillo 20 Dec '01

20 Dec '01

>> Can someone please explain me why the initial call to statistics/0 >> throws an exception if the module is already loaded? In Ciao the shell acts as another module: you cannot call predicates of a module unless you import them. So, when you do: ?- ensure_loaded('ciao_aux17.config'). you have a memory image in the shell which contains the prolog_sys module, but you have not imported this module into the shell. This is done with: ?- use_module(library(prolog_sys)). Think of the shell as another module that you link together with your program in order to build an executable: the fact of linking them together does not make any of them to import the other one. E.g.: shell -imports-> config -imports-> prolog_sys You cannot call prolog_sys from shell. To effectively import one module into another you have to put the use_module. E.g.: shell -imports-> config -imports-> prolog_sys | ^ |-----------imports----------| Now you can call prolog_sys from shell. Paco Bueno

1 0

Re: Database and memory limitations
by John Fletcher 20 Dec '01

20 Dec '01

Richard I accept that there are some users for whom the ability to have a complete DTD as part of the file is important. It's on the "to do" list. On the "done" list is the removal of the "editorial" remarks from the xml.pl page, which might have given the impression that DTD handling was left out on principle. I'm persuaded that for version 2.0 (RSN): 1) The default should be to merge CDATA and PCDATA for input. We still need to be able to write CDATA, for backwards compatibility, so we should be able to read it independently, for symmetry. 2) The default should be to ignore comments. We still need to be able to write comments, so we should be able to read them for symmetry. (Note: Even the latest "standards compliant" browsers - IE6.0 and NS6.1 need their in-line Javascript in a comment, even in XHTML.) 3) To get XHTML formatting right, both for input and output, we need to validate. To make validation worthwhile, I'll want more than the DTD offers: I want to check more of the validity constraints and more of the attribute datatypes. It would be good to interleave the validation with the parsing too. I don't know which schema language would be best - I'll code what I need directly in Prolog to start with, and see where that leads. Regards John Fletcher ----- Original Message ----- From: "Richard A. O'Keefe" <ok(a)atlas.otago.ac.nz> To: <ciao-users(a)clip.dia.fi.upm.es>; <john(a)binding-time.co.uk>; <ok(a)atlas.otago.ac.nz> Sent: Tuesday, December 18, 2001 10:46 PM Subject: Re: Database and memory limitations > > (b) Binding Time may think it "unlikely that an application could make any > > > use of a document that defines its own" DTD, but I have numerous > > > examples. Since an application doesn't get its semantic information > > > from a DTD in the first place, the reason given is unsound. > > > > I wasn't saying that the semantic information was in the DTD, > > rather that applications must recognize the nodes, and probably the > > structure of the document, to "make sense of it". > > > Yes, the application must recognize the nodes. > No, this recognition does NOT have to be by means of checking > the element type. If we have > > <!ELEMENT title (#PCDATA)> > <!ATTLIST title ARCFORM #FIXED "h2"> > > then an application can look at the ARCFORM attribute of an element > and say "oh yes, this is a level 2 heading, I know what to do with that". > Note that the content model of this <title> is a strict sublanguage of > the content model of an <h2>; in general, the thing that matters is that > the content AFTER MAPPING should be a sublanguage. > > If we make that a default, instead of a fixed, attribute, > then particular instances can over-ride it. > > <!ATTLIST title ARCFORM (h1|h2|h3|h4|h5|h6) "h2"> > > so <title> maps to h2, and <title ACRFORM="h1"> maps to h1. > > > This means that some fixed terms must have been defined > > beforehand, assuming that we're talking about communication > > amongst two or more systems. Why not map these terms onto a > > fixed DTD, or Schema, rather than adding another level of indirection? > > > Because a "union" DTD or Schema will be very permissive. > (Look at HTML, for example. The separation between block level and inline > level is _not_ very clear; inline content is allowed in most places where > you'd expect block content.) > > Because the terms of such a DTD or Schema may not be the best ones for > the task at hand. > > Let's take just one more example. Imagine that I'm writing a specific > document, and in this document I need to have a list of people. Each > person has a name and a list of projects. Then I want to use tags > like <name> and <projects> rather than <dt> and <dd>. And I want to > control the content model of these things; <projects> must contain > project description, not arbitrary HTML content. > > This happens to be an example I'm actually editing right now. > I have *ONE* such document. When the document is revised, the chances > are the grammar will be revised as well. There is no point in making > the DTD a separate file. > > > > More > > > interestingly, IBM's DARWIN approach shows that an application _can_ > > > get a lot of semantic information from a DTD, via the systematic use > > > of #FIXED attributes. > > > > Defining the meaning of an element through a large set of > > "#FIXED" attributes seems back-to-front to me. I would choose > > to have a fixed set of tags, and as many attributes as are > > needed, using the attribute values to parameterize the semantics. I > > think that is more in the spirit of XML. > > > I strongly recommend that nayone who is interested in SGML/XML processing > should look at the DARWIN approach. They show how you can implement an > object-oriented model in SGML, where you can say "this element type is like > that element type with these extensions". > > I would say that using a fixed set of tags is about as opposed to the > spirit of XML as you can possibly get. XML is about using *semantic* > markup, and that specifically includes applicatin-speficic and even > document-specific element types. Attributes should be inferred by > the processor from content and context whenever possible. > > > > The thing which makes the Binding Time parser unusable to me in its > > > present form is that it's based on the usual mish-mash of markup-sensitive > > > and structure-controlled approach. > > > > I think it might be XML that's the "mish-mash", > > XML provides two parsing models: validated, which fits structure- > controlled applications very well, and well-formed, which is the mish-mash > I am complaining of and is suitable neither for markup-sensitive applications > (because information they might care about is lost) nor structure-controlled > applications (because the information they need isn't there). > > > xml.pl is trying to simplify it as far as possible - arguably > > farther than is possible. Nevertheless, I think it gives a good mix of > > generality and ease of use. > > > Except that it gets the rules for white-space handling in attribute values > wrong, and it doesn't let you use general entities to build documents out > of pieces, and it doesn't handle white space in text correctly, and ... > > I actually tried the " x y " example, and xml.pl > did produce the wrong answer. And this is ultimately related to the fact > that it doesn't look at the DTD. > > > > In any XML parser, it ought to be > > > possible for an application to say "I am a structure-controlled > > > application. Do NOT split CDATA out separately. Act as if comments were > > > not there at all. DO distinguish element content white space from other > > > white space, in fact, don't give me any element content white space." > > > > The problem with both CDATA sections and Comments is that some > > applications, like XHTML and SVG, expect JavaScript to be > > delivered in them, so the default behaviour has to be to > > preserve comments on input and to distinguish between PCDATA and > > CDATA for output. > > > There are some non-sequiturs there. > > Yes, XHTML expects Javascript, > yes, Javascript *****may***** be embedded in CDATA, > no, Javascript does not *have* to be embedded in CDATA, > so an XHTML processor that handles Javascript (which many do not) > has to recognise Javascript WHETHER IT IS IN A CDATA SECTION OR NOT. > In fact, having to distinguish between CDATA sections and other character > data makes an XHTML processor's job *harder*, not easier. > > XHTML has CDATA sections precisely so that Javascript should NOT > be embedded in comments. That's an old hack for HTML, not for XHTML. > Anyone who puts Javascript in an XHTML comment deserves to have it ignored. > > But above all, the fact that XHTML or SVG require this (if they do), > does ***not*** mean that distinguishing CDATA from other character > data and reporting comments have to be the defaults. It only means > that they have to be possible, so that XHTML and SVG applications > can ask for them. There are very very many uses of XML that are not > XHTML, not SVG, and do not include Javascript. > > > It's a pragmatic requirement, until all applications recognize > > CDATA sections and PCDATA as interchangeable and stop using comments > > corruptly. > > > It is a pragmatic requirement that these things be *POSSIBLE*, > not that they be *defaults*. The distinction having been enshrined in > SAX, DOM, and Infoset, what are the odds that generators ever get it right? > > > Beyond that, I've elected to have the calling application ignore > > what it doesn't need, rather than provide switches in the > > parser. If a consensus in favour of "switches" emerges, that will > > change. > > > But there ARE options in the parser. The first argument of xml_parse/3 > is a list of them. Currently there are two: > format(bool) true ->strip white space (incorrectly) > extended_characters(bool) true -> use XHTML character entities. > Why not > cdata(bool) true -> return cdata(_) terms > comment(bool) true -> return comment(_) terms > > Electing to have the calling application ignore what it needs not to have > puts the burden on the application. It doesn't make sense to me to force > applications to deal with constructs that have no value to them. > > It's worse than that. If I have > Example: <![CDATA[<foo bar="ugh">]]>. > what I _want_ is > element(p,[],["Example: <foo bar=""ugh"">.]) > but what I _get_ is > element(p,[],["Example: ",cdata("<foo bar=""ugh"">"),"."]) > which isn't even the right number of children. > > I don't see why every application should have to include its own > code for the common task of stripping out comment nodes, and > pasting sequences of plain text and cdata into single plain text items. > > It is more efficient to have a means of never generating these things > in the first place. > > Second best would be for the package to include an > xml_normalize(Kludgy, /*->*/ Cleaned) > predicate. > > > > You really can't get XHTML white space handling right without knowing > > > what is element content and what is mixed content, which means processing > > > the DTD. > > > > I've fixed my "defaulty" explanation, if not the code. (One > > could fix that specific XHTML problem, simply by distinguishing > > between 'block' and 'inline' elements, rather than processing > > the whole DTD. It's a hack, but it would be an effective one.) > > > It would be, if I were parsing XHTML, which I'm usually not. > > > I think DTDs are very high cost/low value in most cases. For > > example, I'm sure that there must be a way of capturing more of > > XHTML's validity constraints, more economically, than the DTD, or XML > > Schema, manages. For "architectural forms", #FIXED attributes seem > > rather limited when compared with XLink (URLs) and Namespaces. > > > DTDs are very low cost compared with Namespaces, and trivial compared > with the cost of XLink. Of course there's a better way to capture that > stuff, it's called Prolog. Namespaces are very easy to implement in an > XML parser (I've done it), but they impose a high cost in every application > that uses them, because they are so clumsy. DTDs have some cost in the > parser, but many applications can benefit from knowing they have a structurally > well-formed document. (Not HTML or XHTML ones, of course, because HTML and > XHTML have so little structure.) > > DTDs can do more for you than most people realise. > They were by design limited to what could be efficiently implemented in > limited memory, but you can do quite a lot. > Modular XHTML makes good use of them. > > There are other schema languages for XML, such as Relax and Trex and ... >

1 0

Problems with prolog_sys module in 1.7#162
by Paulo Moura 20 Dec '01

20 Dec '01

Hi! When I consult a file that contains the call: :- use_module(library(prolog_sys)). I get the following results: Ciao-Prolog 1.7 #162: Tue Dec 18 11:57:55 WET 2001 ?- ensure_loaded('ciao_aux17.config'). yes ?- predicate_property(write(_), P). {ERROR: user:predicate_property/2 - undefined predicate} no ?- statistics. {ERROR: user:statistics/0 - undefined predicate} no ?- use_module(library(prolog_sys)). Note: module already in executable yes ?- statistics. memory used (total) 2957462 bytes program space (including reserved for atoms): 1680214 bytes number of atoms and functor/predicate names: 4253 number of predicate definitions: 2425 global stack 527988 bytes: 1352 in use, 526636 free local stack 16380 bytes: 196 in use, 16184 free trail stack 16244 bytes: 196 in use, 16048 free control stack 16516 bytes: 468 in use, 16048 free 0.000 sec. for 1 global, 0 local, and 0 control space overflows 0.000 sec. for 0 garbage collections which collected 0 bytes 0.002 sec. runtime, 110.526 sec. walltime yes ?- Can someone please explain me why the initial call to statistics/0 throws an exception if the module is already loaded? Thanks, Paulo ----------------------------------------------------------- Paulo Jorge Lopes de Moura Dep. of Informatics Office 4.3 Ext. 3257 University of Beira Interior Phone: +351 275319700 6201-001 Covilhã Fax: +351 275319732 Portugal mailto:pmoura(a)noe.ubi.pt http://www.ci.uc.pt/logtalk/pmoura.html -----------------------------------------------------------

1 0

Ciao error output
by Kris Gybels 19 Dec '01

19 Dec '01

Hi all, Just a newbie question: why don't compiled ciao programs output errors but just hang?? This is driving me nuts! I have a program that uses an undefined predicate. When run from the shell, the output '{ERROR undefined predicate ...}' is output as I expected. However, when I compile this program and then run it just hangs. It is not interruptable by doing CTRL-C either. Is this the normal behaviour for ciao or is there something wrong? If it is the normal behaviour is there at least some way of forcing compiled programs to output errors instead of just hanging?? S!

1 0

Re: Database and memory limitations
by Richard A. O'Keefe 17 Dec '01

17 Dec '01

"John Fletcher" <john(a)binding-time.co.uk> wrote: TBH, I'm not convinced by SWI's approach of encoding text nodes (CDATA and PCDATA) as atoms. It is a compact representation, but often the content of the text nodes (including attribute values) has some internal structure that needs to be "micro-parsed". I've been using Prolog to read and write XML quite a lot, and I found the "text as atoms" approach meant having to convert between atoms and chars far too often. I was describing what SWI's library does, not what I actually prefer. My uses of SGML are predominantly textual, and I find that I prefer using lists of character codes to represent text. (Amongst other things, that makes it easy to match, generate, and transform text using DCGs.) There are plenty of intermediate possibilities, such as representing text as lists of tokens. It might be that the XML applications I've been dealing with: XHTML, SVG and SMIL, are especially prone to this, but leaving the text nodes as chars has worked to advantage (see http://www.binding-time.co.uk/xml.pl.shtml ). The files themselves tend to be quite small, (average 10k maximum 250k, they are intended for transmission after all), so memory usage hasn't been a problem. ... Perhaps RDF is different in this respect. There are some good things about the SWI kit: (1) The source is available; if you don't like atoms for text, don't have them. (2) The parser is fast. It's not the fastest XML parser around, but it's the fastest SGML parser I've managed to get my hands on. (3) It's an SGML parser, not just an XML parser. (4) Because the actual parser is in C, it's comparatively straightforward to plug it into languages other than Prolog and dialects other than SWI. Thanks for the link to http://www.binding-time.co.uk/xml.pl.shtml. There are a number of points I would take issue with there: (a) ' may not be a standard _HTML_ character (heck, HTML3.2 accidentally dropped ") but it _is_ a standard XML character, and is explicitly present (the fifth) in "-//W3C//ENTITIES Special for XHTML//EN". (b) Binding Time may think it "unlikely that an application could make any use of a document that defines its own" DTD, but I have numerous examples. Since an application doesn't get its semantic information from a DTD in the first place, the reason given is unsound. More interestingly, IBM's DARWIN approach shows that an application _can_ get a lot of semantic information from a DTD, via the systematic use of #FIXED attributes. The thing which makes the Binding Time parser unusable to me in its present form is that it's based on the usual mish-mash of markup-sensitive and structure-controlled approach. In any XML parser, it ought to be possible for an application to say "I am a structure-controlled application. Do NOT split CDATA out separately. Act as if comments were not there at all. DO distinguish element content white space from other white space, in fact, don't give me any element content white space." The Control "strip layout when no character data [sic.; but layout IS character data] appears between two elements" is blessed by XSLT, so is a handy thing to have, but gets the "element content white space" notion quite wrong; the result is that if you write "<ul> <li>x</li> <li>y</li> </ul>" it is as if you had written "<ul><li>x</li><li>y</li></ul>", which is GOOD, but if you write " x y " it is as if you had written "xy", which is BAD. You really can't get XHTML white space handling right without knowing what is element content and what is mixed content, which means processing the DTD. Another reason for processing the DTD, of course, is to get #FIXED attribute values, defaulted attribute values, and special attribute types such as NMTOKENS, correct. #FIXED attributes, as I mentioned above, are an extremely useful tool for "poor man's architectural forms". Finally, while trying to download, I got several 404 reponses, with text "The requested URL: /cgi-bin/USER-linklog.pl was not found on this server"

1 0

RE: Database and memory limitations
by Dupont, Michael 14 Dec '01

14 Dec '01

Richard, Thanks for your tips, and thanks to all people on this list. I am very excited about the resonance I get from the prolog community, the gcc compiler community proper is not that interested in this project or any project like it. It seems that many people here have sympathy for the idea of extracting meta data from c/c++ programs. To answer your question, >>Is there any redundant information? yes there is much redundant information, for example, I have one output file per input c file from the compiler, plus one file per function that is compiled in each module, each time it is declared c file or (in lines appear all over the place). >>Could the information be put into a CDB file and Ciao's memory be >>used as a cache? I was hoping that that would work. The set of all global information for a c program is not that large, types and functions, these should be compressed down. The files that I have are around 10-20 MB per source file for the translations of the gcc sources themselves. My original memory limitations were with gnu prolog, I must admit I have not tried with ciao yet :(. >>Is there information which is seldom needed, so it could be loaded on >>demand? the bodies of the functions can be loaded on demand, the usage information of data types is not always needed. I have switched the processing to Perl for a while, but I really did like working with prolog, also because of the ability of querying. This weekend I will send out and update on the project with all newer source code and example XML files to the project page at http://sourceforge.net/projects/introspector/ Mike I will be working from my mdupont777(a)yahoo.com account this weekend. -----Original Message----- From: Richard A. O'Keefe [mailto:ok(a)atlas.otago.ac.nz] Sent: Donnerstag, 13. Dezember 2001 17:18 To: ciao-users(a)clip.dia.fi.upm.es Subject: Re: Database and memory limitations Manuel Carro <boris(a)aaron.ls.fi.upm.es> wrote: I find it of interest that you are transforming xml datasets into prolog with xsl... specifically the reason your snippet caught my eye is I'm about to try out some previous work with Topic Navigation Maps with Prolog (which is new to me), well basically to see what fits well and what doesn't. I note that SWI Prolog comes with an SGML parser which supports XML, including XML namespaces. This package has particular support for RDF. I don't know whether Ciao's and SWI's licences are compatible, but it might be worth looking into. I'm told that SWI Prolog is being used to process 90MB RDF files. I also note that Prolog is vastly more convenient for XML processing than XSLT is. Prolog "Document Value Model" data structures for representing XML are pretty much bound to be much cheaper than the "Document Object Model" data structures used by most XSLT processors, if you have a reasonably compact representation for text. (SWI Prolog uses garbage-collected atoms for this.) My own experience is that having Prolog, Scheme, and Haskell available it'll take a gun pointed at my head or an extremely large bribe to make me use XSLT for anything. I suspect that the fundamental problem is with the representation that is being generated as the output of the XSLT processing step. Is there any redundant information? Is there information which is seldom needed, so it could be loaded on demand? Could the information be put into a CDB file and Ciao's memory be used as a cache?

1 0

Re: Database and memory limitations
by John Fletcher 14 Dec '01

14 Dec '01

----- Original Message ----- From: "Richard A. O'Keefe" <ok(a)atlas.otago.ac.nz> To: <ciao-users(a)clip.dia.fi.upm.es> Sent: Thursday, December 13, 2001 4:17 PM Subject: Re: Database and memory limitations > Manuel Carro <boris(a)aaron.ls.fi.upm.es> wrote: > I find it of interest that you are transforming xml datasets into prolog > with xsl... specifically the reason your snippet caught my eye is I'm about > to try out some previous work with Topic Navigation Maps with Prolog (which > is new to me), well basically to see what fits well and what doesn't. > > I note that SWI Prolog comes with an SGML parser which supports XML, > including XML namespaces. This package has particular support for RDF. > I don't know whether Ciao's and SWI's licences are compatible, but it > might be worth looking into. I'm told that SWI Prolog is being used > to process 90MB RDF files. > > I also note that Prolog is vastly more convenient for XML processing > than XSLT is. Prolog "Document Value Model" data structures for > representing XML are pretty much bound to be much cheaper than the > "Document Object Model" data structures used by most XSLT processors, > if you have a reasonably compact representation for text. (SWI Prolog > uses garbage-collected atoms for this.) > TBH, I'm not convinced by SWI's approach of encoding text nodes (CDATA and PCDATA) as atoms. It is a compact representation, but often the content of the text nodes (including attribute values) has some internal structure that needs to be "micro-parsed". I've been using Prolog to read and write XML quite a lot, and I found the "text as atoms" approach meant having to convert between atoms and chars far too often. It might be that the XML applications I've been dealing with: XHTML, SVG and SMIL, are especially prone to this, but leaving the text nodes as chars has worked to advantage (see http://www.binding-time.co.uk/xml.pl.shtml ). The files themselves tend to be quite small, (average 10k maximum 250k, they are intended for transmission after all), so memory usage hasn't been a problem. Perhaps RDF is different in this respect. AIUI, from http://www.xml.com/pub/a/2001/04/25/prologrdf/index.html , all of RDF's data values are URIs, so atom should be a good representation for them, unless you're interested in the URI's structure. Regards John Fletcher

1 0

Re: Constraints in Ciao Prolog
by Daniel Cabeza Gras 13 Dec '01

13 Dec '01

Dear Alexandre, No, CLP(Q/R) in Ciao is not the same as the IBM's clpr. The main difference is that in Ciao the constraints can only be imposed by a constraint builtin, there is no implicit constraint resolution in the unification. Just compare the Fibonacci function in IBM's clpr fib(0, 1). fib(1, 1). fib(N, X1 + X2) :- N > 1, fib(N - 1, X1), fib(N - 2, X2). and in Ciao's CLP(Q/R) fib(X,Y):- X .=. 0, Y .=. 0. fib(X,Y):- X .=. 1, Y .=. 1. fib(N,F) :- N .>. 1, N1 .=. N - 1, N2 .=. N - 2, fib(N1, F1), fib(N2, F2), F .=. F1+F2. The Ciao version is more "verbose", but in turn its semantics is well defined, can be extented to other constraint domains, and allows mixing traditional Prolog code with CLP code. This and other examples can be found in library/clpqr_src/examples from the Ciao main directory. The files ending in .clp are the original examples in IBM's clpr. Daniel

1 0

Re: Database and memory limitations
by Richard A. O'Keefe 13 Dec '01

13 Dec '01

Manuel Carro <boris(a)aaron.ls.fi.upm.es> wrote: I find it of interest that you are transforming xml datasets into prolog with xsl... specifically the reason your snippet caught my eye is I'm about to try out some previous work with Topic Navigation Maps with Prolog (which is new to me), well basically to see what fits well and what doesn't. I note that SWI Prolog comes with an SGML parser which supports XML, including XML namespaces. This package has particular support for RDF. I don't know whether Ciao's and SWI's licences are compatible, but it might be worth looking into. I'm told that SWI Prolog is being used to process 90MB RDF files. I also note that Prolog is vastly more convenient for XML processing than XSLT is. Prolog "Document Value Model" data structures for representing XML are pretty much bound to be much cheaper than the "Document Object Model" data structures used by most XSLT processors, if you have a reasonably compact representation for text. (SWI Prolog uses garbage-collected atoms for this.) My own experience is that having Prolog, Scheme, and Haskell available it'll take a gun pointed at my head or an extremely large bribe to make me use XSLT for anything. I suspect that the fundamental problem is with the representation that is being generated as the output of the XSLT processing step. Is there any redundant information? Is there information which is seldom needed, so it could be loaded on demand? Could the information be put into a CDB file and Ciao's memory be used as a cache?

1 0

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

Ciao-users December 2001