> I'll focus my answer on the problem of reading/writing binary data. Using plain text to transfer information between programs always strikes me as utterly inefficient anyway (even if it is often greatly appreciated in practice), and reading/writing from a user terminal doesn't need to be efficient.
Claims like "utterly inefficient" are so much marsh gas without actual numbers. Here's one example. I wrote a program that generates 2**22 bit strings of 1 to 32 bits each, stored in an array of unsigned long. The cost of generating them and writing them out 5 times was:

    text version:   3.95 user + 0.13 system = 4.08 total seconds
    binary version: 5.06 user + 2.08 system = 7.14 total seconds

This really doesn't look like a victory for binary, especially when you realise that the binary version was 1.6 times bigger. I don't call nearly twice as fast and nearly twice as small "utterly inefficient", and I don't believe anyone else would either.
I refuse to focus on reading/writing binary data. I have several reasons for this.
(1) That's just not what Prolog is about. Prolog's strengths are in the area of text and symbolic processing; these days the top Prolog systems are also good at some kinds of constraint logic programming.
If you want C, you know where to find it, and you can call it from Prolog.
(2) As it happens, I am supervising a 4th year student who _is_ very VERY interested in processing binary data. MIDI data, in fact. But this has to be grovelled over byte by byte; using a block level interface for reading and/or writing MIDI data would be about as sensible as using hedge clippers to trim your fingernails.
This leads to my major point, which is that if you are DOING something with the binary data, chances are that the overheads of moving one item at a time between the Prolog world and the C world are dwarfed by the processing costs.
(3) I have used binary data transmission between programs. And I have done it both before there was such a thing as the System V ABI (where processor supplements are supposed to spell out record layout for C) and between systems that had no interest in being in any way like System V. I have experienced UNIX systems where three different compilers laid simple structs out three different ways (not realised until _after_ the data failed to read). I have transmitted data from a 5-character-per-word machine to a 2-character-per-word machine via a 6-character-per-word machine. (I never want to do that again, especially if paper tape is involved.) I have spent hours sweating over how to convert between IBM mainframe, VAX, IEEE, and sort-of-like-IEEE-but-not-really floating point formats. I have found systems where one compiler could generate any of six different record layouts depending on compile-time options. I have had to deal with binary data containing addresses where a program was loaded at a different address each time it was run (and I see that some version of BSD is bringing this back in the interests of defeating the stack-smashing bandits). I've heard horror stories from friends about two C++ programs compiled with the same headers being unable to share data in a shared memory region (because the vtables were different). Then there's the fact that the processor-specific System V ABI supplements have nothing to say about record layout in Fortran or Pascal, let alone Ada or Eiffel.
Put it this way: using binary formats to transfer information between programs that differ in even the slightest respect (such as being two different processes running the *same* executable on the *same* machine at the *same* time) may be efficient, but it's crazy.
Here, of course, I'm referring to NATIVE binary formats. If you talk about things like ASN.1 or XDR, most of the difficulties I mentioned go away. But so, of course, does the claimed efficiency advantage of binary representations.
(4) There's a vast tide of information flowing over the net. A lot of it is pictures and movies, which Prolog should be delegating to other processors anyway. Almost all the rest is text.
> All other information sources (disk files, sockets, unix pipes...) can typically be accessed a block at a time. Actually, in Unix, the native OS call is the block-oriented 'read' and 'write' for all types of file descriptors, even for conceptually character-oriented pipes. This is highly relevant to how the bottom layer works.
It is highly IRRELEVANT to how the top layer works. This is, after all, the point of distinguishing layers. I have already shown in some detail how a single-byte-at-a-time interface (to the Prolog world) can be as efficient as anyone could wish for, built on a block-at-a-time C world layer.
There are three activities:

    *generation*              (the top layer's "writing" facilities)
    *transmission*            (the bottom layer's concern)
    *parsing and processing*  (the top layer's "reading" facilities)

It is important to decouple these activities.
I note, by the way, that one of the important points for the Quintus block layer was the fact that some important operating systems distinguish logical blocks and physical blocks, and what they actually _give_ you is physical blocks, which you have to reassemble yourself. If the top layer works in terms of blocks, there isn't the slightest reason to expect _those_ blocks to have any simple relationship to the blocks exchanged with the operating system.
Even in UNIX, it may be necessary to issue several read() or write() requests to transmit a single logical block, and you don't know until you try whether a single read() or write() request will actually suffice.
> OK, what are the type and mode of Block? As you point out, it has to be -, and the only reasonable data structure it can return is some kind of native packed string (like in SWI Prolog).
SWI Prolog programmers routinely use atoms where other Prolog programmers might have used character lists. SWI atoms are compact, and garbage collected. They are well suited to holding text that you are just going to pass on. They are NOT convenient for generation or parsing and processing.
> But the overhead of having to allocate that memory is minimal.
Well, no. On a 250MHz machine, I measured the cost of creating a 5-character atom as 33 microseconds. The cost of calling get_byte/1 5 times was measured as 17.5 microseconds. The cost you dismiss as "minimal" is in fact nearly twice the cost of single-byte input.
> It is an amortized constant.
It is not amortized and it is not constant. A new object has to be created *EACH* time. This is Prolog, not C.
> Actually reading data into that buffer is a single system call. There is no need for any loop in your program, the OS kernel handles all details.
This has never been true in UNIX. If you want to read N bytes, where N > 1, you MUST have a loop, because the operating system is under no obligation whatsoever to deliver all N bytes at once. Indeed, it may be unable to. (The same applies to write().)
> You just cannot make it more efficient on a Unix architecture (disregarding mmap(), which I concede has strong limitations).
I _agree_ that reading and writing blocks is the right way to go IN THE BOTTOM LAYER. That's why Quintus did it that way; layering get0/1 on top of glue code on top of getc() on top of RMS on top of QIO was seriously inefficient (almost, but not quite, as bad as Java). Moving to a block-oriented bottom layer with "universal" mapping above meant that we just had get0/1 on top of glue code on top of QIO. As I've already shown, the fast path for the glue code can be EXTREMELY fast, although the VAX, M68k, and /370 were sufficiently register poor that it wasn't practical at the time to make it _quite_ that fast.
> On Unix applications, implementing block-reading in terms of character-reading is an abstraction inversion (http://cliki.tunes.org/Abstraction%20Inversion) because the kernel natively provides block-reading.

This is argument by alluding to intimidating authority, and as such uncompelling.
It's basically uncompelling for three reasons: (1) I am not at all arguing that block reading should be implemented in terms of character reading. I'm arguing precisely the opposite: that character reading is practically always the right abstraction for Prolog, and that it should be implemented in terms of block reading.
(2) the abstraction that matters is the abstraction your program NEEDS not the abstraction your operating system provides. I keep on trying to make the point that it is a waste of time worrying about the raw transput costs if you don't worry about the generating and processing costs. A program whose cost is dominated by raw transput is a program which isn't _doing_ anything much with its data, and about the only useful program of that kind is 'cat'.
(3) The block abstraction provided by UNIX is ***not*** "the blocks you find convenient in your source code" but "the block fragments that the kernel finds convenient to accept or return". In the plainest possible terms, the block transput abstraction provided by UNIX is *not* the abstraction any program which generates or processes data wants; only a program whose task is simply to *route* data can be satisfied with it directly.
As witness: fread() and fwrite(), which do NOT map onto single read() and write() calls.
>> To wrap up, there is a logic programming language with Prolog-like
>> syntax in which it *is* possible to do byte-block-at-a-time I/O
>> without breaking the language's semantics, and that is Mercury.
>
> You are too concerned about the allocation cost. A lot of high-level
> languages more dynamic than Mercury have immutable string-like buffers
> and very successfully return them as a result of a block-reading
> operation (http://www.python.org).

I am concerned about the allocation cost because I have *measured* it, it is an *avoidable* cost, and because it is the cost of making most Prolog I/O *harder* to write correctly, not easier.

==============================================================================
Message:      Address:                                Action:
help          majordomo(a)clip.dia.fi.upm.es           Info. on useful commands
subscribe     ciao-users-request(a)clip.dia.fi.upm.es  Subscribe to this list
unsubscribe   ciao-users-request(a)clip.dia.fi.upm.es  Unsubscribe from this list
<whatever>    ciao-users(a)clip.dia.fi.upm.es          Send message to list
-----------------------------------------------------------------------------
Archived messages: http://www.clip.dia.fi.upm.es/Mail/ciao-users/
-----------------------------------------------------------------------------
Hello Richard,
On Mon, May 19, 2003 at 10:54:18AM +0200, Richard A. O'Keefe wrote:
> text version:   3.95 user + 0.13 system = 4.08 total seconds
> binary version: 5.06 user + 2.08 system = 7.14 total seconds
By my definition, 'text' streams are understood to be a subset of all 'binary' streams. I cannot comment on these numbers any further without some clue about how the files were actually encoded.
> If you want C, you know where to find it, and you can call it from Prolog.
How then do I efficiently delegate to my C snippet a block of data that my Prolog program should acquire from some external source?
> (3) I have used binary data transmission between programs. (...)
I understand your grief very well. That's why my original statement contained the note "even if [text] is often greatly appreciated in practice". Any stream format is essentially a convention between the producing and the receiving ends, and today's computer architecture and the available tools tend to favour conventions that are immediately human-readable.
> Here, of course, I'm referring to NATIVE binary formats. If you talk about things like ASN.1 or XDR, most of the difficulties I mentioned go away. But so, of course, does the claimed efficiency advantage of binary representations.
I am not particularly talking about 'native' binary, if you mean by that a processor- or otherwise platform-dependent format. I am not particularly talking about standards like ASN.1, either. There is plenty of room in between, where you assign a well-defined but custom meaning to a sequence of bytes (as opposed to a sequence of words or lines).
> SWI Prolog programmers routinely use atoms where other Prolog programmers might have used character lists.
I am talking about 'SWI strings': atom-like objects that are not guaranteed to be unique. These (not atoms) have a minimal allocation overhead which is 'amortized constant' with respect to their length (where 'amortized constant' has the precise complexity theory meaning).
> Actually reading data into that buffer is a single system call. There is no need for any loop in your program, the OS kernel handles all details.
My mistake, there is indeed a loop needed here. However this loop will only ever execute a few iterations; the user time spent reading a block is, again, amortized constant in the length of the block on reasonable Unix implementations.
> abstraction inversion (http://cliki.tunes.org/Abstraction%20Inversion)
The pointer was meant to clarify the precise definition in which I wanted to use the term.
Armin