If you thought the one-line definition of the Fibonacci sequence in Haskell was beautiful, try this:
import Control.Monad powerset :: [a] -> [[a]] powerset = filterM (const [True, False])
A black box for sure, unless you have "internalized the list monad". But it does show how stonkingly powerful monads are.
(In response to Gimbo.)
Speaking of Unix, I have thought that there is a pretty close connection between lazy evaluation and Unix pipelines. On the xmonad home page there's an example for using dzen that looks like this:
while true ; do
date +"%H.%M %a %b %d"
sleep 60
done | dzen2 -ta r -fg '#a8a3f7' -bg '#3f3c6d'
Of course, if that was strictly/eagerly evaluated, dzen would never run, because of the infinite loop. But what actually happens is that the two commands, joined together by a pipe (indicated by the | character), are executed in parallel; the while loop runs date, producing some output, which it writes to a pipe (just a first-in-first-out buffer shared by the two processes); it sleeps for 60 seconds; then it runs date again; sleeps again; ..., ad infinitum. Meanwhile, dzen reads its output, until there is none left, then blocks until the loop provides more.
But that's just like a lazily evaluated stream. Suppose we had an IO action getTimeString, then we could do:
statusBarContents :: IO [String]
statusBarContents = do
sleep 60
now <- getTimeString
then <- unsafeInterleaveIO statusBarContents
return (now:then)
statusBar = dzen statusBarContents
Again, it looks like statusBarContents would never terminate because it calls itself, and not even in an inductive manner with a base case where it bottoms out. In contrast to the above, this is all done in one thread; but the unsafeInterleaveIO says "only evaluate this when it's asked for" - so it manages to produce one time string, which dzen uses; then dzen asks for the next one, and the sleep in statusBarContents makes it block.
(For the mathematically minded, this kind of non-inductive recursion producing infinte amounts of data is called corecursion. I intend to do a SUCS talk on the topic someday.)
I just wrote the following snippet of code for my dissertation:
intNatI :: Int :~= Nat
intNatI = Iso {
to = fromInteger . toInteger,
from = fromInteger . toInteger }The components to and from have totally different types (to is Int -> Nat, from is Nat -> Int). Yay for type inference! :)
Quoth Doaitse Swierstra, nevermore, > digest.chew.eat.serve.cook.chop.pluck.kill $ chicken > > we all have a definite feeling that after applying the functions, the > original object is no longer available, and the FP view does not feel > entirely natural. Yes, that is true. I can sort of see why people might object to the IO monad in this case, since it explicitly prevents the programmer from reusing old data. Of course, one might argue that it's not really the *monad* that prevents it, but the construction of the physical world. The monad merely prevents us from trying to eat our cake and have it.
Dougal Stanton on haskell-cafe (archive).
For the uninitiated, the dots refer to function composition; in OO notation it would look like chicken.kill().pluck().chop().cook().serve().eat().chew().digest(). The point is that that kind of composition in Haskell implies there are no side effects. Consider:
let chicken = getChicken
in (chicken,
(digest.chew.eat.serve.cook.chop.pluck.kill) chicken)
Here we've taken a chicken, eaten it, and returned the result along with the original chicken — i.e, we eat our chicken and have it. This is what the IO monad stops you from doing, and why composing functions that perform side effects is difficult.
This is perhaps a good introduction to a rather interesting video I also saw linked from haskell-cafe (in fact in the same thread). It's only available in WMV unfortunately (it does come from MS after all).
It's been known since the time of Turing that what's computable in one reasonably powerful programming language is computable in every other one. As a result, we can say that large classes of programming languages are 'equivalent' in the sense that they can emulate each other. However, not all programming languages have natural equivalents of all the others' features.
This is the idea behind Greenspun's Tenth Rule:
Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.
The usual corollary is "including Common Lisp". We could say the same for C: most C compilers are implemented in C and compile themselves. Similarly, the Glasgow Haskell Compiler is written (mostly) in Haskell. Actually, given that C is the implementation language for so many interpreters, it's easy to see that one could, in principle, write a program in language X by directly creating the abstract syntax tree and passing it to an "evaluate" function, all in C.
Changing tack slightly, let's take a brief look into the insides of Milliways III. (I hear some of you saying "Eugh!" :) Currently each user has their own process, and each process has a named pipe. To send a message to a user, another process (which must be owned by the same user, hence the executable has to be setuid) writes to this pipe with the message in a particular format. Currently there is no locking, but all writes are of sizes below the threshold which guarantees that the write is atomic, so there are no race conditions. Relying on such a specification is rather ugly, plus there are many different places in the source which perform this writing. It would also be interesting to make Milliways III a Marvin client, making it possible for everyone to use Marvin without forsaking the features of mw3.
To this end, I created the newipc branch, and set about designing a higher-level message-sending function. Most of the places that write to the pipe do so in exactly the same way; there is a lot of boilerplate code in there. So it is possible to replace these with one function that decides what to do according to the data sent to it by the calling function.
Here is the prototype of the main abstraction in ipc.h:
typedef char * (*msgtest)(struct person *, void *); void mesg_filter(msgtest test, void *testobj, enum mesgtype type);
The first line defines the function pointer type used in the second. In this way, mesg_filter is what's known in the functional programming world as a higher-order function: it takes a function as an argument. Recall the signature of qsort in the standard C library:
void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));
qsort takes a comparison function (compar) as an argument, to decide whether one element of the array is less than, equal to or greater than another. Similarly, when some part of Milliways wants to send a message to another user, it calls mesg_filter, passing in some arbitrary data along with a function which is given a struct person to decide whether or not to send a message to this person, and if so, what message. If it shouldn't send anything, it returns NULL. The arbitrary data typically includes a pointer to the message, so the function can vary its behaviour depending on the type of message being sent.
In C, functions are not first-class objects. Basically what this means is that you can't create functions on the fly - they have to be declared beforehand. "How do you create a function on fly?" I hear you ask. Basically you do this by creating a closure. Here is a trivial example in Python:
def mk_adder(n): def addn(m): return n + m return addn
Suppose we do fn = mk_adder(3). Then fn is a function which takes a number and returns the sum of it and 3. fn is a closure. What happens here is that the Python interpreter notices that addn (the inner function) uses the local variable n, and so it makes note of it before returning the created function. Actually, what it returns is not an ordinary function, but an object that can be called like a function, with some extra information specifying the value of n. This is a closure: basically a struct containing a) a function, and b) some data that the function uses.
In using mesg_filter, the function test that gets passed in also needs some extra information to do its job. Maybe we want to send a normal message, or maybe we want an emote. Additionally, people using mud-mode get different messages to those who aren't. The testobj pointer is a pointer to some arbitrary data structure containing this information. Together the function test and the black box data object testobj (note its type, void *, which means mesg_filter knows nothing about it except where it is in memory) constitute a kind of closure. Because C doesn't have the syntactic sugar to do this automatically, we have to do it manually instead, but in the process we've emulated part of a functional programming language in C.
You might be thinking that the concept of a function associated with some data sounds familiar. Consider the following C++ snippet:
class Msgtest { public: virtual char * test(struct person * p) = 0; }
Looks like just a function type with an abstract class wrapped around it, right? Well yes, but we can inherit from it and add our own data. (Equivalently, in Java we could define an interface with just this method and implement it.) With this abstract class, our "higher-order" message passing function gets this signature:
void mesg_filter(Msgtest test, enum mesgtype type);
All we've done is replace the void * pointer with an abstract class. The idea is that a calling function will define some class which inherits from Msgtest, and adds its own data fields. This is very similar to the idea of a closure. If we were really using closures, we wouldn't need to tell mesg_filter anything about extra data; we don't need the additional scaffolding of an abstract class definition or a pointer to a black box. Here's what the signature of mesg_filter would be in Haskell:
mesg_filter :: (Person -> Maybe String) -> Msgtype -> IO ()
The parenthesised part of the signature indicates that the function takes a function as an argument. All we need to note in the type of mesg_filter is that this function takes a Person, which is some data object containing information about the person we might be sending a message to, and possibly returns a string - if it doesn't, we don't send a message. No mention of the extra data, and no need even to acknowledge it, like we did in C using a void pointer or in C++ using an abstract class. (The Maybe even gives more information than the C and C++ signatures - it's obvious now that it might not return anything.) How, then, do we pass in the extra data we need to make decisions? Answer: closures.
sayto :: Person -> String -> IO () sayto person mesg = mesg_filter filter Text where filter :: Person -> Maybe String filter p = if p == person then Just mesg else Nothing
Similarly to the Python example, filter (a locally declared function) makes use of the local variables person and mesg. The function that gets passed to mesg_filter isn't an ordinary function: it's associated with these two extra data. From the point of mesg_filter, though, there's no difference.
Here's an interesting toy language/Turing tarpit for you: Lazy K.
You may already have seen Unlambda, which is essentially the lambda calculus in programming langauge form. (Actually, as its name suggests, it doesn't actually have any lambdas - it's really based on the equivalent SKI combinator calculus.) However, Unlambda is a horrible dirty - ahem, I mean impure functional language because 1) it isn't lazy and so programs in it can sometimes gratuitously fail to terminate, and 2) it has an exception to the evaluation rules (the d 'function') and side effects (the .x function prints the character x). Also, in Unlambda version 1 there is no way to do input.
Lazy K is no easier to write programs in than Unlambda (that's hardly the point, is it? :). But it solves all these problems: it is lazily evaluated; there are no side effects; and you can do input.
Wait a minute, I hear you cry, how can you do input with no side effects? Even Haskell, the paragon of freedom from side effects, has exceptions for I/O. Well Lazy K takes the common Haskell idiom of stream-based I/O and makes it mandatory. So a Lazy K program is just a function which operates on (infinite, lazily constructed) lists of natural numbers (i.e., Church numerals) to produce lists of natural numbers. This abstracts I/O out of the program, so rather than being side-effects, input is a front-effect and output a back-effect. And although it's in principle impossible to specify exactly when output happens, there are well documented ways of doing it in practice in the presence of lazy evaluation.
It's an interesting approach, and is exactly why Haskell programmers like the lazy stream I/O idiom so much; as the Lazy K page says:
Lazy K programs live in the same timeless Platonic realm as mathematical functions, what the Unlambda page calls "the blessed realm of the pure untyped lambda calculus." Just as garbage collection hides the process of memory management from the programmer, so referential transparency hides the process of evaluation. The fact that some calculation is necessary in order to view a picture of the Mandelbrot set, or in order to "run" a Lazy K program, is an implementation detail. That's the essence of functional programming.
There are an awful lot of monad tutorials out there. Monads as containers is one I saw a link to on haskell-cafe today, and it gives a rather amazing application of the list monad:
f x | x == '#' = "# #"
| otherwise = " "
"#" >>= f >>= f >>= f >>= f
= "# # # # # # # # # # # # # # # #"
And we have a fractal!
Gimbo, as many of you know, is learning Haskell, and just discovered the power and conciseness of lazy evaluation with a one-line definition of the entire Fibonacci sequence.
You can do a vaguely similar thing in Python. First we have a lazy list class:
class LazyList:
def __init__(self):
self._list = []
def __getitem__(self, key):
length = len(self._list)
if key.__class__ == int:
max = key
elif key.__class__ == slice:
max = key.stop
if max >= length:
for i in xrange(length-1,max):
self._list.append(self._gen.next())
return self._list[key]
To create the actual Fibonacci sequence, we define a subclass:
class FibList(LazyList):
def __init__(self):
def fibgen():
lastfib = 0
nextfib = 1
yield lastfib
yield nextfib
while True:
(lastfib, nextfib) = (nextfib, lastfib+nextfib)
yield nextfib
LazyList.__init__(self)
self._gen = fibgen()
Then a "list" (not really a list, but looks like one) of the Fibonacci numbers is
somefibs = FibList()
and we can look up the nth Fibonacci number, and take slices of it:
>>> somefibs[0]
0
>>> somefibs[3]
2
>>> somefibs[0:3]
[0, 1, 1]
>>> somefibs[2:20:4]
[1, 8, 55, 377, 2584]
But of course this isn't a one-liner. Plus we've fallen victim to a variant of Greenspun's Tenth Rule, substituting "Common Lisp" with "Haskell 98" :)
My 5 year old son writes proofs for very basic algebraic expressions at the ghci (Glasgow Haskell) prompt - with my assistance of course. He will not be introduced to imperative programming until he is "ready" - much like you wouldn't introduce the concept of war or politics to a small child.
— From a comment on an article comparing Scheme and Haskell as teaching languages.
A member of the role-playing society (who shall not be named, though some people reading will know who I am talking about) recently tried to set up a SUCS account for the role-playing society for the purposes of making a forum. I have expressed my opinions of forums before on this blog, and will be referring to them here. I would be interested in setting up a forum for the RP soc, except for my reservations about web forums.
Essentially, my major complaint about web forums is their interface. phpBB will email you to tell you of new posts, but these emails do not contain the post's content, and I assume they don't contain appropriate headers to let your email client thread them properly. More critically, however, it's not possible to post by email. Essentially, to interact with the forum I have to use a web browser, and web browsers are not well suited to content that updates irregularly. I want to do everything - read and post - within my email client. (A cynic would say that what I want is a mailing list. And they'd be right, except that I want people who want a forum to have one of those too.)
So I went googling for forum "mailing list" hoping to find a package that works as both. What I came across was WordPress.
WordPress (as many of my readers will know) is a sophisticated multiuser blogging system. It has multiple users and multiple levels of users, so you can let different users do different things. It lets you give a post a number of categories, or tags, which let you navigate old posts by topic. It seems WordPress has an option to make the front page of a blog present posts by topic instead of chronologically. So it would probably look very like a forum, as long as you categorise posts appropriately. You can even set up an email address that it checks periodically; any mail you send to this address will appear as a blog entry.
There are still a few essential features missing for it, but I think it would be possible to adapt it for use as a forum. In such a system "posting an entry" is isomorphic to "starting a new topic", and "posting a comment" is isomorphic to "replying to a topic/thread". With an email metaphor the only difference should be that in one case you post using "new email" and in the other using "reply".
For example, you would have tags for each game: "mage" for Mage, "vamp" or "requiem" for Vampire, "ce" for Cursed Empire, "iron" for Iron Kingdoms, etc. You have a tag for general discussions about the society, say "soc" or "rp". You have one for announcements, say "announce". You also have a "sticky" tag. You would be able to edit your WordPress templates so that the archive for each game tag arranges posts chronologically, but puts announcements and stickies first. The web interface, technically built on a date-sorted archive of posts which have no particularly firm connection other than what tags they have, need look no different to the traditional web forum.
There's a parallel here with Gmail, which assigns labels to emails, versus traditional email systems, which file them into folders. The system with tags has exactly the same functionality, but is far more flexible, because you can mix and match things as you want and customise what your client does with messages based on what combinations of tags they have.
Some of the features you would need are:
- The ability to read the blog, and comments, by email. This includes comment emails having the usual Message-ID: and In-Reply-To: headers that let your email client thread them.
- The ability to submit comments by email, ideally just by replying to the emails that represent posts and comments. You could probably use plus addressing for this purpose; e.g., to post a reply to http://sucs.org/blogs/pwb/entry/hello-world, you might send email to pwb+hello-world@blogs.sucs.org, and the received comment emails would have this address in the Reply-To: header.
- Some sort of auth system on these email addresses. Currently any email sent to the blog-by-email address, by anyone, becomes a post. A public key system might be a good approach, though just email addresses with appropriate forgery detection should be robust enough for some people's purposes.
- The ability to post comments that only a given level of user can read (for the RP soc, this would be posts that only the STs (GMs) can read, such as discussions of future plots). Currently you can lock away posts in WordPress, but you need to use a password, which is a rather awkward method. Anyone who manages teams of sysadmins will know this - giving out the root password means you have to change the password and distribute it, and have people accept and remember it, whenever someone's admin privileges are revoked. For a forum this also means you have to create a new password for each locked post, or agree on the same password for all such posts, which is even more awkward. Also bear in mind that people forget passwords sometimes. Better to base it on user level or groups.
Having discovered that Liferea displays web pages using Firefox's engine (and so you don't have to open all links in a separate program), I decided to re-add all the web comic feeds which don't embed the comics.
The last entry in the User Friendly feed was a feedback item about the "AJ in Nethack" story arc. Apparently someone has actually come up with a patch that implements the "Turbonerd" class which AJ played. Bizarre. No prizes for guessing what the deities are called :)
There was also a link to a server where you could play Nethack with this patch over telnet (nuclearwombats.net, port 20040). It's not all that mind-blowing (your starting weapon is a +0 cluebat :), but you can connect over IPv6. So now there are two useful (*cough*) things I can do with IPv6 (the other being connecting to the Freenode IRC network). Yay.
Edit: Damn, you can't connect over IPv6 after all. Seems I was just confused by it trying to connect, and didn't notice that it fails.
A couple of days ago I came across this quote: "Perl by itself is an OK OS, but it lacks a lightweight scripting language". I was immediately reminded of something else I'd read about Emacs, something to the effect that Emacs was a decent OS (or maybe shell) but lacked a good editor.
Now, before I start a flame war, I'll admit that Emacs is indeed a stonkingly powerful editor. The other major hacker's editor, Vim (a modern, more usable version of vi), is also programmable, but not nearly to the same extent that Emacs is - not to mention that Emacs Lisp is infinitely better documented than Vim's scripting language. There are lots of add-on modules to Emacs written by people who want it to do something it couldn't do before, and it's easy to change its behaviour if you want.
So before I learned to use Vim (currently the editor I use for literally all editing tasks), I had a go at learning Emacs. But I could never get used to the strange choices for navigation keys (up, down, left and right are (using Emacs notation) C-p, C-n, C-b and C-f respectively, which are allegedly mnemonic, at least to English speakers, but certainly not ergonomic) or the fact that you have to stretch your pinkie to get to the Ctrl key to use them (this reliance on shift keys being the source of one of the jocular expansions of EMACS: "Esc Meta Alt Ctrl Shift"). When I got round to learning to use Vim, I found that using Vim was simply much more comfortable - up, down, left and right are k, j, h and l respectively, all on the home row (assuming a qwerty-ish keyboard - mine was UK qwerty at the time, now a German qwertz) and unshifted.
But I have sometimes thought that Vim was not flexible enough, that I wanted to tweak some quirk of the current syntax highlighting or indentation mode, and was rather put off by the unique scripting language. So I made a mental note to learn Emacs, but never got round to it because I was so put off by the editing interface. Enter Viper.
Viper is a vi major mode for Emacs. A major mode determines the meaning of all the keystrokes - there are Emacs major modes for everything from text editing through reading mail and playing Nethack to consulting a robotic psychiatrist. (As in Eliza, not Susan Calvin. No, I'm not making this up.) It just so happens that some people who would like to use Emacs prefer the vi editing interface, and so they implemented a major mode to make editing in Emacs more like editing in vi. So I'm investigating it.
The most obvious differences I've noticed so far between Vim and Viper are undo (pressing u a second time in Vim undoes another action; in Viper it undoes the undo) and visual mode (in Vim; it's equivalent to setting the mark, moving somewhere and doing something with the region in Viper, except the region isn't obviously visible like in visual mode). Presumably it uses Emacs bindings for moving between windows too, instead of Vim ones (the latter being quite orthogonal - to move to the window above, you type C-w k). Time will tell whether I like it enough to switch.
Having found out that, unlike RSS, Atom allows embedded XHTML (and moreover is infinitely better defined), I've spent this evening converting my blog to emit it instead of RSS. So, if you're reading this at http://sucs.org/~pwb/blog/blog.py, you're looking at Atom with XSL instead of the old RSS with XSL. For some silly reason Atom still uses plain text dates instead of XML, but that's not actually a huge problem for the XSL as the format it does use, ISO 8601, can be manipulated using an XSL date-time library.
Annoyingly, it seems that although Planet understands Atom to a degree, it can only cope with content as CDATA (which ought not to be interpreted as XHTML elements, as Planet does, but rather as plain text). So my feed, whose content elements contain XHTML divs and are marked with the type="xhtml" attribute as the Atom spec dictates, are nonetheless stupidly interpreted as plain text. It's exactly the reverse of how it should be handled. So my old RSS feed is still there, at least until such time as Planet gets fixed, but is deprecated and will probably not get maintained any more.
I registered an account at VoIPuser on Saturday. I was under the impression that this would let me phone home 3 days later (a precaution to stop people abusing the ability to make free calls) but today I tried and SJPhone still said "Bad URL" (despite the URL being used, according to FireFury, being totally correct). I logged in to check that I hadn't forgotten something, and found a message saying: "You are not yet registered for a SIP account with VoIP User."
So I fill in the form that they are required to present me by law (name, address, etc.), hoping that since I opened my account 5 days ago, I should be able to dial out immediately. So I clicked submit, and what should I see but this message:
Your Free PSTN outbound access will be activated 3 days time.
Sigh, how fantastically useful. 3 days time is Christmas Day, and I will be back home by then :)
Just discovered Grip today, and I must say I was very impressed.
The very first thing I ripped with it was One More Mile by Muddy Waters, which is a double CD album. Obviously I'd like to listen to all of these in order without having to break to change CDs (or make a playlist especially for it), so I thought of editing the Vorbis comments after encoding to reflect this.
One of the great things about Grip is that it has space in several places for hooks, i.e. commands to run after ripping and encoding. My problem was solved in a very elegant way by writing a script that (with a little help from awk) runs vorbiscomment to add 22 to the TRACKNUMBER comment after encoding (22 being the track number of the last track on the first disc).
It strikes me that customisability of this sort is, in general, one of the big wins of the Unix toolbox approach. You might write a program for a specific purpose, but if you make it customisable enough, and especially if you give the user lots of hooks to run arbitrary programs, and your program uses open data formats, people can end up using it in very different ways. And that can only make your program more appealing and powerful.
