Drive ya Nuts
November 25th, 2010
Last time I was home I found my old "Drive Ya Nuts" puzzle. I don't remember ever solving it. Here it is, unsolved, in all its frustrating glory.
This Thanksgiving I'm on a long overnight in Nashville, TN so I wrote a prolog program to end this once and for all. New puzzles come solved in the retail box but there is no good reason to believe that this is the only solution.
use_module(library(lists)).
% Nuts.
nut(t, 1, 2, 3, 4, 5, 6).
nut(u, 1, 2, 5, 6, 3, 4).
nut(v, 1, 3, 5, 2, 4, 6).
nut(w, 1, 3, 5, 4, 2, 6).
nut(x, 1, 4, 2, 3, 5, 6).
nut(y, 1, 5, 3, 2, 6, 4).
nut(z, 1, 6, 5, 4, 3, 2).
% Rotation successors.
succ(r0, r1).
succ(r1, r2).
succ(r2, r3).
succ(r3, r4).
succ(r4, r5).
% A nut is zero rotated if marks match
% a nut definition.
rotate(r0, nut(Name,A,B,C,D,E,F)) :-
nut(Name,A,B,C,D,E,F).
% A nut is rotated R if there is a
% sequence of preceding rotations.
rotate(R, nut(Name,A,B,C,D,E,F)) :-
succ(PrevR, R),
rotate(PrevR, nut(Name,F,A,B,C,D,E)).
% Solution for the seven (*) pegs.
% Letter variables are matching edges.
% * I *
% J H
% C B
% * D * A *
% E F
% K G
% * L *
% Nuts are marked CCW starting from the
% right edge.
solution(Center, Right, TopRight, TopLeft, Left, BottomLeft, BottomRight) :-
Center = nut(N1, A,B,C,D,E,F),
Right = nut(N2, _,_,H,A,G,_),
TopRight = nut(N3, _,_,_,I,B,H),
TopLeft = nut(N4, I,_,_,_,J,C),
Left = nut(N5, D,J,_,_,_,K),
BottomLeft = nut(N6, L,E,K,_,_,_),
BottomRight = nut(N7, _,G,F,L,_,_),
rotate(r0, Center),
rotate(_, Right),
rotate(_, TopRight),
rotate(_, TopLeft),
rotate(_, Left),
rotate(_, BottomLeft),
rotate(_, BottomRight),
permutation([t,u,v,w,x,y,z], [N1,N2,N3,N4,N5,N6,N7]).
So....
?- [drive_ya_nuts].
% drive_ya_nuts compiled 0.00 sec, -48 bytes
true.
?- solution(A,B,C,D,E,F,G).
A = nut(w, 1, 3, 5, 4, 2, 6),
B = nut(y, 2, 6, 4, 1, 5, 3),
C = nut(t, 5, 6, 1, 2, 3, 4),
D = nut(v, 2, 4, 6, 1, 3, 5),
E = nut(z, 4, 3, 2, 1, 6, 5),
F = nut(u, 1, 2, 5, 6, 3, 4),
G = nut(x, 3, 5, 6, 1, 4, 2) ;
false.
?-
Yup, only one solution. I bought this puzzle used at a garage sale around 1990 so it hasn't been arranged for at least 20 years. I'd love to get some insight that makes the computer unnecessary but this is good enough for now. Happy Thanksgiving!
Named and optional parameters in C
July 20th, 2010
/*
This might be language abuse. Variadic macros, designated
initializers, structs, and compound literals makes it possible to
have named and optional parameters. Tested in gcc 4.5 and clang 1.1.
*/
#include <stdio.h>
#define foo(...) foo_((struct foo){__VA_ARGS__})
struct foo { const char *name; int number; };
void foo_(struct foo foo)
{
printf("%s: %d\n", foo.name ?: "untitled", foo.number);
}
int main(int argc, char **argv)
{
foo("three", 3);
foo("hello");
foo(.name = "zero");
foo(.number = argc, .name = "argc",);
foo(.number = 42);
return 0;
}
/* Output
three: 3
hello: 0
zero: 0
argc: 1
untitled: 42
*/
Tagged pointers: a fresh look at runtime dispatch in C
July 20th, 2010
I stumbled upon a technique that I think is quite good. It requires no casts, allows untagged structures and primitive types to be polymorphic variants, does not depend on structural inheritance, can statically overload a function name to operate on multiple types, can inline if the type is known at compile time, and will warn during compilation if a case is missing for a particular type.
Here's an example of a graph with different node types. I left out the obvious details for clarity.
/* The type tags */
typedef enum {
NODE_GEOMETRY,
NODE_TRANSFORM,
NODE_GROUP
} node_type;
typedef struct node_geometry node_geometry;
typedef struct node_transform node_transform;
typedef struct node_group node_group;
/* This is the tagged pointer representation */
typedef struct {
node_type type;
union {
node_geometry *geometry;
node_transform *transform;
node_group *group;
} raw_pointer;
} node_pointer;
/*
These structs are disjoint plain old data.
No structural inheritance. Yay.
*/
typedef struct {
mat4 matrix;
} node_transform;
typedef struct {
geometry_model *model;
} node_geometry;
typedef struct {
int member_count;
node_pointer members[NODE_GROUP_SIZE];
} node_group;
/* Regular monomorphic functions accepting a plain old pointer */
static void node_geometry_draw(node_geometry *);
static void node_transform_draw(node_transform *);
static void node_group_draw(node_group *);
/* Dispatching on the tagged pointer */
void node_draw(node_ptr node)
{
switch (node.type) {
case NODE_GEOMETRY:
node_geometry_draw(node.raw_pointer.geometry);
break;
case NODE_TRANSFORM:
node_transform_draw(node.raw_pointer.transform);
break;
case NODE_GROUP:
/* Notice the error: the type signature of node_group_draw() will catch it. */
node_group_draw(node.raw_pointer.transform);
break;
}
};
We typically tag an object for runtime polymorphism. C++ adds a vtable. GObject copies C++. Numerous tutorials on the web store type information with the object or struct.
But type information isn't always needed in a struct because the struct's complete type (including size and member offsets) is known by the compiler. Type information is only needed when an unknown structure is accessed at runtime through a pointer. From this perspective it makes at least equal sense to store type information with pointers and leave structures uncluttered.
This technique uses a tagged union to represent a pointer to plain old data. Functions are dispatched on the type contained in the pointer representation. Dispatching with a switch statement gives us the benefit of some compiler optimizations and heuristics. If the pointer's type tag is proved to be constant the switch can be completely elided and the case statement inlined. If there are a large number of type tags or they are numerically all over the place, the compiler can emit the best code for that situation.
This is a multimethod approach to polymorphism. It can result in a more fragile make tree, where changes to a single structure will cause more recompilation. That is also a result of code organization so I might be a tad hasty with that criticism. On the upside, dispatching on multiple types is obvious and I get warnings when there are missing cases.
Because runtime type is stored externally in pointers, data objects can take any form in memory. This is advantages if they are members in other structures since the containing structure does not need the type tag. It also allows for representations defined in other libraries or having constraints like vertex and connectivity data for OpenGL.
Lastly, structural inheritance is overrated. I like different structural types that can share an interface. This technique does that efficiently while using the language instead of abusing it.
Ruby on Rails 3.0.beta Knows Cows
February 6th, 2010
>> RAILS_GEM_VERSION
=> "2.3.5"
>> "cow".pluralize
=> "cows"
>> Rails.version
=> "3.0.0.beta"
>> "cow".pluralize
=> "kine"
>> # Huh?
?> "kine".singularize
=> "cow"
Irb is a Text Adventure
December 16th, 2009
Ruby's flavor of classed OO offers a thoughtful balance of rigidity and flexibility along with plenty of introspection and reflection. Over time I have come to experience this model less as a framework for thought and more as a living thing with which to converse. Dave Thomas brought up the prospect of a Ruby object browser over nine years ago. That thread raised questions that still haven't been answered.
Unfortunately for a browser the rigidity offered by Ruby is optional. I personally feel that HTML with links is the only graphical form that comes close to capturing the needed range. I just described a blank slate for a digraph so feel free to think it's lame. The object browser is possible and desirable but after all these years Irb is still the Ruby browser of choice.
My take is that every object in Irb is like a noun in a text adventure, and likewise every object is different. There are common verbs and uncommon verbs, and not every noun supports every verb. The components of trial, error, failure, and reward are the same. Nothing ruins my suspension of disbelief like having to consult documentation, and I like to believe that I am programming for as long as possible. The notable missing link to the underground world of Zork is "look". Here is "look". It is nothing more than a mashup of introspection methods but I've found it indispensable.
# Examples:
# foo.look
# foo.class.look
# foo.method(:meth).look
# foo.method(:meth).source_context
# foo.instance_variable_hash
Object.class_eval do
def instance_variable_hash
ret = {}
instance_variables.each { |v|
ret[v] = instance_variable_get(v)
}
ret
end
def look
{
:class => self.class,
:interesting_methods => (methods - Object.new.methods).sort,
:instance_variables => instance_variables.sort,
}
end
end
Class.class_eval do
def look
{
:class => self.class,
:ancestors => ancestors,
:included_modules => included_modules,
:interesting_methods => methods - Class.new.methods,
:instance_methods => instance_methods(false),
:class_variables => class_variables,
:instance_variables => instance_variables,
:constants => constants,
:source_files => methods(false).collect { |m| method(m).source_location.try(:first) }.compact.sort.uniq,
}
end
end
Module.class_eval do
def look
{
:class => self.class,
:ancestors => ancestors,
:included_modules => included_modules,
:class_variables => class_variables,
:interesting_methods => methods - Class.new.methods,
:instance_methods => instance_methods(false),
:instance_variables => instance_variables,
:constants => constants,
}
end
end
Method.class_eval do
if RUBY_VERSION < "1.9"
def source_location
end
end
def look
{
:class => self.class,
:arity => arity,
:name => name,
:owner => owner,
:receiver => receiver,
:source_location => source_location,
}
end
def source_context
return nil unless source_location
line_number = source_location.second
file = File.open(source_location.first, "r")
lines = file.lines.drop(line_number - 1)
indent = lines.first.scan(/^ */).first.length
end_marker = Regexp.new("^" << " " * indent << "end")
interesting_lines = []
begin
line = lines.shift
interesting_lines << line
end until line =~ end_marker or interesting_lines.count > 20 or line.nil?
file.close
source_location << interesting_lines.join
end
end
Plays well with 1. Wirble, 2. pretty printing, 3. Ruby 1.8, 4. Ruby 1.9, and OF COURSE your .irbrc.
GLib Interface Performance with Vala
September 7th, 2009
As I factored more class commonalities into interfaces I began to wonder about their runtime cost. A study of gtype.c and gtype.h revealed that glib appears to be doing quite a bit of work to find an interface method so I decided to benchmark them.
Beginning with a type instance glib must lookup the class, then binary search an array of interface entries for the id of the wanted interface. Once it finds a match it can finally dispatch via the vtable that is part of the interface belonging to that entry. This is much more work than a virtual call but it can be well worth the trouble. I should add that glib isn't a stupid implementation. It must do all this work because of design compromises. For comparison, the C++ version of an interface bloats each instance with a vtable pointer in exchange for fast method lookup. Calling a C++ interface method shouldn't be noticeably slower than any other virtual function since that's exactly what it is. There is some fixup arithmetic for multiple bases classes but arithmetic is pretty cheap. In contrast, an instance of a GType with interfaces does not own any interface vtables. Those belong to the class, the concrete type.
Assume only the interface type of an instance is known at compile time and the concrete type is unknown. Different concrete types are free to implement any number of interfaces and therefore an interface entry may have different offsets within each class. Given an interface id and an unknown instance, the only way to find the interface entry that corresponds to both that instance and the interface id is to search the interface entries in the instance's class for the id. If the id is found, bingo, the entry's interface's vtable is used for dispatch. In the classic space/time trade we spend time to buy space.
Here's a quick interface benchmark written in Vala.
using GLib;
// Our interface
public interface Foo : Object
{
public abstract int ifrob ();
}
// And a factory method so clients never touch
// the concrete type.
public Foo makeFoo ()
{
return new FooC ();
}
// Finally a concrete type.
public class FooC : Object, Foo
{
// Foo interface method
public int ifrob () {
return 1;
}
// virtual method
public virtual int vfrob () {
return 1;
}
// regular method
public int frob () {
return 1;
}
}
// Number of method calls
const long n = 10000000;
void main ()
{
// Interface var
var fi = makeFoo ();
// Concrete type
var fc = fi as FooC;
var timer = new Timer ();
print ("GLib Dispatch Mechanisms\n");
print ("%ld iterations\nTimes in seconds\n\n", n);
timer.start ();
for (long i = 0; i < n; ++i) {
fc.frob ();
}
print ("direct: %f\n", timer.elapsed ());
timer.start ();
for (long i = 0; i < n; ++i) {
fc.vfrob ();
}
print ("virtual: %f\n", timer.elapsed ());
timer.start ();
for (long i = 0; i < n; ++i) {
fi.ifrob ();
}
print ("interface: %f\n", timer.elapsed ());
}
Here's the output, compiled with --Xcc='-O3'. GLib is compiled with '-O2'.
GLib Dispatch Mechanisms 10000000 iterations Times in seconds direct: 0.000001 virtual: 0.060368 interface: 0.450858
See how -O3 inlines the direct call and then removes the useless loop. You gotta love C compilers, gcc 4.4.1 in this case. Direct calls take about 0.075 seconds with optimizations off. But look at the virtual vs. the interface call. It's over seven times slower. I added some dummy interfaces to FooC in different orders but there was virtually no effect compared to the hit for using an interface in the first place. The disassembler revealed a call to fixed offsets of eax in both virtual and interface cases so the only thing getting optimized out by the compiler is vala boilerplate. The real work is still there. '-O2' yielded interface calls approximately six times more costly than their virtual cousins.
In summary, expect interface calls to be six or seven times slower than a regular virtual one. Interfaces are still sort of new in GLib. Lets hope they get faster because I really like using them.
Comparing Schemes of Dynamic Dispatch in C
May 4th, 2009
Schemes of Dynamic Dispatch under Study
The performance of three dynamic dispatch schemes were examined. I also pontificate on the merits of my favorite schemes.
Virtual-table: Objects contain a pointer to a function table. Each method has a fixed offset within the table.
Function-table: There is one table for each method. Objects contain an integer corresponding to the object type that is used as an index in the table.
Switch-statement: Objects contain a type-integer like they do in the function-table scheme. The switch executes the correct function depending on the integer key.
Virtual-table dispatch is used in classed OO languages because the vtable corresponds to a class. It is a data-centric approach because the dispatch mechanism is associated with the data type and not with any specific function type. Method names in the vtable can be overloaded in different struct definitions of the vtable. Extending a copy of the table corresponds well to single inheritance.
The latter two schemes are more similar to each other than they are to the first. They are function-centric because the dispatch mechanism is associated with a function type rather than a data type. An object passed to these dispatchers contains an integer type-id that serves as an index in a dispatch table or a key in a switch statement. Objects in this scheme are unclassed, though there is nothing prohibiting an association on the type integer with a class-like structure. This class-like structure is actually one interpretation of the dispatch table.
Variations tested:
- Virtual-table.
- Function-table at a static address.
- Function-table at a dynamic address.
- Function-table with fix-up subtraction.
- Switch-statment with inline method definitions.
- Switch-statement with few, ordered keys.
- Switch-statement with few, dispersed keys.
- Switch-statement with many, dispersed keys.
- Direct call (for comparison).
The essence of each scheme is in the call instruction. '-S -fverbose-asm' is kind enough to show the names of static addresses.
Virtual table
| call *16(%edx)
The constant 16 corresponds to the offset of the method in the table. %edx holds the address of the table.
Static function table
| call *func_table-3996(,%edx,4)
The constant *func_table is the static offset of the function table. %edx contains the integer corresponding to the object's type. The constant 4 is the size of a pointer on my machine. Finally, 3996 is the index offset multiplied by 4. Index offset would be 0 for a table beginning at type 0, but a subtype might be found in a sub-range of indexes. Rather than build a sparsely populated table for this subtype, a smaller table can be used with a fix-up subtraction computed at compile time.
Dynamic function table
| call *(%edx,%eax,4)
%edx contains the address of the function table. %eax again contains the object's type-id. Fix-up subtractions would obviously need to be computed at run-time. I didn't test this combined with a fix-up subtraction because I consider it pretty useless anyway. I included this test for completeness since a virtual-table is inherently dynamic. Only the method offset is fixed in the virtual-table.
Switch statements
| call compiler black magic
It should be noted that this scheme semantically performs two direct calls for each dynamic method call. The implementation could be placed in the switch to eliminate function call overhead but it may be beneficial in some cases to have separate implementations and treat the switch statement like an interface declaration. In-lining removes the double-call overhead and can be counted on within a compilation unit.
Benchmark Implementation
Most popular benchmark algorithms do not depend on dynamic dispatch. I just did the naive thing and called a function recursively. It can be tricky to fool gcc into making a wheel-spinner that does no real work. For example, it is capable of rearranging the statements of a direct call that is not tail recursive into one that is, then unrolling it, then placing it in-line in the caller.
Timing points are in the code because instrumenting the executable for profiling caused the execution time to double. The key to using gprof might be to turn off instrumentation of the function calls under test and just time the top-level. However, measuring with the real-time clock gives results that are consistent enough to trust.
The Results
Times are in microseconds.
-O0 Classed virtual: 1404442 Function table: 1501014 Function table with subtraction: 1559695 Function table dynamic: 1607794 Switch inline code: 1823262 Function switch: 2612194 Function switch large: 3633052 Function switch dispersed: 3003073 Direct call: 1348344 -O1 Classed virtual: 1506692 Function table: 1506174 Function table with subtraction: 1506311 Function table dynamic: 1506651 Switch inline code: 1311783 Function switch: 2588107 Function switch large: 3026327 Function switch dispersed: 2549603 Direct call: 1368460 -O2 Classed virtual: 1505832 Function table: 1505468 Function table with subtraction: 1505384 Function table dynamic: 1506625 Switch inline code: 1376533 Function switch: 1982367 Function switch large: 2518238 Function switch dispersed: 2046317 Direct call: 1366636 -O3 Classed virtual: 1505480 Function table: 1506294 Function table with subtraction: 1505536 Function table dynamic: 1506019 Switch inline code: 1476872 Function switch: 1476428 Function switch large: 2517948 Function switch dispersed: 1647105 Direct call: 516581 -O3 -fomit-frame-pointer -mtune=native Classed virtual: 1379984 Function table: 1384040 Function table with subtraction: 1380380 Function table dynamic: 1505872 Switch inline code: 1439144 Function switch: 1438757 Function switch large: 2248245 Function switch dispersed: 1506101 Direct call: 509054
These represent a single run. I computed no averages and made no attempt at statistics. However, on repeated runs the times were consistent to within two milliseconds, which is significantly smaller than than the differences between dispatch schemes.
Conclusion, extrapolation, even some guessing and recommendations
Small switch statements do better than large ones, and dispersion of keys has only a small negative effect. For this reason it may be beneficial to reserve switch statements for dispersed keys while using table dispatch for grouped keys. The switch can actually best the table for a small number of keys but then its advantage depends on compiler in-lining or placing implementation code inside the switch. Even if the compiler succeeds in translating the switch to a jump table the double call elimination still depends on in-lining. A macro definition for explicit table dispatch is short and has guaranteed pretty-good performance. It could easily be converted to a switch, or even a switch macro, for the absolute best performance.
One technique might be to treat switch statement/function-table dispatch as one would an interface. Methods like "show", "display", "draw" and other things that usually go in base classes as stubs are good candidates for an interface.
Methods that need to be overloaded for each type definitely benefit from the virtual-table since the virtual-table can contain different types for the method name across different class structures. The advantage of the virtual-table is overloading and inheritance, but if a function is overloaded at the call site then the object type is probably known, and if its type is known then it should be a direct call anyway. Virtual-tables may also have a performance advantage for types that are dynamically created since the table is found through a pointer regardless. The performance of type-indexed table dispatch theoretically relies on having a static table address but my data doesn't show any difference in performance with optimizations turned on. Only -O0 shows any significant difference between method-indexed virtual-tables and different schemes of type-indexed tables.
Large switch statements with dispersed keys perform poorly but this case should not often arise in practice since type-id integers can be assigned sequentially. It is notable that the widest range in performance occurs in switch statements because they give the compiler the most room for optimization. These numbers show that a switch on type-id is the fastest possible dispatch scheme if implementation code is placed in the switch rather than in a separate method. But that's messy. In-lining in -O3 is capable of removing some of the function call overhead and the direct call-switch-call performs slightly better than a function-pointer. The best number here is 200,000,000 calls in 1376533 microseconds for a switch with explicit in-line code compiled with -O2. -O3 brings the performance down to 1476872 microseconds, which is nearly equivalent to the switch that calls a function for each case. The performance is identical in -O3 because of gcc in-lining. With some different tuning parameters the virtual-table and function-table are just as fast.
Personally, I prefer the interface style that jives with the type-indexed schemes. They are interchangeable, don't need a class, and perform as well as virtual-tables. The style lends itself more to multiple interface inheritance, similar to Haskell type-classes with monomorphism turned off. If a type (which is an integer index) has interfaces A and B (which are table or switch dispatched functions), then it may implement C (which is dispatched the same way).
This excellent style of multiple inheritance can also be done in C++ with one pure virtual base class for each set of functions that comprise an interface. But because C++ uses vtables, each object must keep a pointer for every interface from which it derives. I can't help but think a type-id implementation would make more sense in this case too. Tnen each object would have exactly one type-id and every pure virtual base class would be a table of implementations for those type-ids. Maybe I'm ignoring practical issues like linking and horribly sparse tables, but what about the practical issue of keeping objects small? Virtual tables yield big objects and small tables. Type-ids yield big tables and small objects. I ran this same test with C++ and the performance matched the vtable here. Automatic casting to the interface reference didn't appear to affect performance.
Vala is Awesome
December 11th, 2008
The GUI Dilemma
Linux has many language implementations but I keep finding myself driven to C by quality documentation and faithful implementations. Python, Ruby, and Scheme are all great but docs for bindings are second-hand and I usually need to pull out the original library reference or look at C headers and test-cases to accomplish anything.
If I want to port to the maemo platform I have to port the bindings too. Just more version hell.
On top of that there is interpreter lock with internal threads and that isn't good for the GUI. Loading a GtkListStore is already slow enough when calling from C but doing it in an interpreted language means an inner loop with additional marshaling. Database extensions should also be written as efficiently as possible because that makes them more flexible as I don't have to tiptoe around them in SQL; worrying whether my slowness is getting called 10 or 10e6 times per query. Then there's callbacks, like for sorting, which are called nlogn times or worse. If I'm writing asynchronous IO, ListModel loading, database extensions, and sort callbacks in C then there's very little left to do in an interpreted HLL.
I hate GObject C but I like GObject enough to wish I had used it for some things that are slowly turning into their own private spaghetti hell. Refactoring creates even more opportunities for error and is time consuming. C is hard, let's go shopping :)
C++ would be a good option if I had a much higher IQ. I can use existing classes but writing a new class without memory leaks, using it properly with exceptions, and looking out for excessive copying is a mental exercise in itself. That's actually a languages strength, lots of meaning in a line, but it's not an abstraction as much as it is a complex macro expansion. It's a fun language but it doesn't offer the blissful ignorance I often want.
Introducing Vala
With Vala the blissful ignorance is there for the taking, up to a point of course. It is a modern but slightly crippled language that steals as much as possible from C#. It has all the OO features of GObject, including singe inheritance, multiple interfaces, runtime binding, and dynamic properties. Destructors and constructors are supported along with private/protected/public members. There is also a limited "compact" class that avoids GObject for better performance. Structs and types built on machine words like int, float, and bool do not have OO features and have an exact correspondence to C. There is a string type built on GString that supports some useful operators like + and +=. Type inference means variables can be declared simply with the "var" keyword. The type-system also differentiates between nullable and non-nullable types and checks GError exceptions at runtime. Garbage collection is accomplished via reference counting, which is good enough for most things. Vala supports generics with the familiar template-like syntax. All generics appear to be implemented with gpointer, i.e. (void *), instead of creating new code and structures for each instantiated type. For example a Point2<int> p = {1,2} would create a struct containing two gpointers. Function delegates can be defined inline and look like a lambda. Currently, only formal parameters of the lambda may appear in the body. Vala accepts full lexical closures syntactically but the generated C contains undeclared identifiers and does not compile. Vala will most likely implement ref-counted closures in the future via GClosure, or so I hope.
The language compiles through C and anyone familiar with GObject should be able to read the C output to easily figure out what's going on. I wouldn't hesitate to use a Vala library from C because it's really just GObject under the hood. Bindings are easily implemented in a separate "vapi" file. There is no documentation for vapi at the moment but it really doesn't need much. Marshaling is delegated to GObject at runtime and calling GObject code from Vala doesn't cost any more than calling from C. The whole language is actually syntax sugar for C. It's C++ reinvented with the benefit of C# and Java in the rear-view mirror.
GObject has a high runtime cost but the Vala designers have created some very useful pockets in the type system to mitigate that issue. Vtable lookup always occurs for interface methods, unfortunately even when the concrete type is known at compile time. Inherited methods are called statically through upcasting so better performance is possible by restricting OO usage to compact classes and inheritance-only method resolution. The pointer overhead in generic structures would probably offset any performance increase from static method binding unless it is used carefully. Unrestricted use of generics would imply using them in interface constraints and those entail virtual method calls anyway. The language also doesn't leverage stack allocation for objects like C++ so instantiation requires heap allocation. Structs have access-controlled members and are normally stack allocated but copy constructors and destructors are still generated for structs to support full-object members. The compiler currently accepts inheritance and interface syntax for struct declarations but ignores it when generating code.
The roadmap is set for a 1.0 release in March 2009; fingers crossed. Vala is promising because it's conceptually simple, as portable as glib, has decent performance, and plays well with C in both directions.
PS. I should warn the reader that I just discovered Vala yesterday and my knowledge of it is limited to examining the C output of the version 0.5.2 compiler. Some of this info is probably wrong so this article is As-Is, No Warranty. However, I'd appreciate a comment if I failed to understand something about Vala.
Haskell Music
January 22nd, 2008
As part of my effort to seek enlightenment through Haskell I’ve made some music. It’s actually sort of neat creating a waveform that when played through a DA resembles something other than white noise. This isn’t controlling a sequencer – it’s just composing functions to make make a sound representation. It’s a lot like naive ray tracing where all objects are considered for each pixel in that for each sample I sum the waveforms for all notes, many of which are zero since a note only plays for a short duration and is otherwise silent.
Notes are made by transforming a sound frequency Signal (Double -> Double) with pitch and amplitude Shifters (Signal -> Signal). For the note envelope I create a signal that represents amplitude and modulate the sound signal with the amp combinator, which is a Modulator (Signal -> Signal -> Signal).
One thing missing from this implementation is the ability to write (signal = signal1 + signal2). I now would write (signal t = signal1 t + signal2 t,) since a signal is a function over time. mauke in #haskell (irc.freenode.net) was helpful but I still don’t understand why some things work the way they do so I chose not to write a (+) instance for signals until I understand it better. The following will play a few measures of Mozart’s something something as I remember it. I forgot what it’s called but it’s in C and it’s popular.
Here’s an ogg vorbis of the output: mozart.ogg
And the illiterate Haskell.
-- Compile and listen as follows
-- ghc -o mozart mozart.hs && ./mozart | mplayer -v -rawaudio rate=16000:channels=1:samplesize=1 -demuxer rawaudio -
import System.IO
import Data.Char
type Signal = Double -> Double
type Shifter = Signal -> Signal
type Modulator = Signal -> Signal -> Signal
main :: IO ()
main = putStr . toBytes . take 200000 . sample 16000 $ sound
sound :: Signal
sound = ampc 0.35 . shifterSum song . pitchc (440 * hstep ** 3 * 2 * pi) $ wave
song :: [Shifter]
song = zipWith (\e p -> amp e . pitchc p) (lhEnvs ++ rhEnvs) (lhPitches ++ rhPitches)
wave :: Signal
wave = flange (\x -> sin (x * 2)) sin
lhPitches :: [Double]
lhPitches = map (* (1/2)) $ [du,so,me,so,du,so,me,so,re,so,fa,so,du,so,me,so] ++
[du,la,fa,la,du,so,me,so,re,so,fa,so,du,so,me,so]
lhBegins :: [Double]
lhBegins = take 32 $ toTempo [1..]
-- Applies 0.1 second duration to left hand begin times
-- resulting in a list of complete amplitude envelopes
-- for each note.
lhEnvs :: [Signal]
lhEnvs = map (pianoEnv 0.1) lhBegins
rhPitches :: [Double]
rhPitches = [du,me,so,ti/2,du,re,du, la,so,du*2,so,fa,me]
rhBegins :: [Double]
rhBegins = toTempo [1,5,7,9,12,12.5,13, 17,21,23,25,28,29]
rhEnvs :: [Signal]
rhEnvs = map (pianoEnv 0.5) rhBegins
-- attack decay sustain release to resemble a piano
-- apply duration and begin for complete envelope signal
pianoEnv :: Double -> Double -> Signal
pianoEnv = env 0.005 0.05 0.4 0.7
-- eigth note tempo
tempo = 160
toTempo :: [Double] -> [Double]
toTempo ts = map (\t -> 60 * t / tempo) ts
-- Sums list of signal shifters and mutes the original signal
shifterSum :: [Shifter] -> Shifter
shifterSum xs = foldr (shifterAdd) (const . const 0) xs
shifterAdd :: Shifter -> Shifter -> Shifter
shifterAdd a b input t = a input t + b input t
-- 1 b
-- / \c_____d
-- / \
-- 0 ___a/ \e___ Makes an amplitude envelope for a note
--
-- Will div by zero in some cases.
env :: Double -> Double -> Double -> Double -> Double -> Double -> Signal
env attack decay sustain release duration begin t
| t < a || t >= e = 0
| t < b = (t - a) / attack
| t < c = 1 - (1 - sustain) * (t - b) / decay
| t < d = sustain
| t < e = sustain - sustain * (t - d) / release
where a = begin
b = a + attack
c = b + decay
d = c + duration
e = d + release
sample :: Double -> Signal -> [Double]
sample rate signal = [signal (t / rate) | t <- [0..]]
-- yuck
toBytes :: [Double] -> [Char]
toBytes = fmap $ chr . clamp 0 255 . (\n -> truncate $ (n + 1) * 127)
clamp min max x
| x < min = min
| x > max = max
| otherwise = x
pitchc :: Double -> Signal -> Signal
pitchc mult input t = input (t * mult)
ampc :: Double -> Signal -> Signal
ampc mult input t = mult * (input t)
amp :: Modulator
amp control input t = (control t) * (input t)
-- only works for control where d/dt = 0
pitch :: Modulator
pitch control input t = input (t * control t)
flange :: Modulator
flange ff w t = w (t + ff t)
hstep = 2 ** (1/12)
du = hstep ** 0
re = hstep ** 2
me = hstep ** 4
fa = hstep ** 5
so = hstep ** 7
la = hstep ** 9
ti = hstep ** 11
The Best Tutorial - Write Yourself a Scheme in 48 Hours
December 26th, 2007
Write yourself a Scheme in 48 Hours
This is the most excited I’ve been since I learned how to make a character bounce around the screen on my C64! Learning a new programming style is hard: big programs are difficult to understand and small programs have too little to tweak. How do I learn from a ten line tutorial or a one-hundred line program? The ten line tutorial doesn’t do anything interesting and forces me to write new code to grow. OTOH, one-hundred lines of Haskell usually do something neat but I can’t begin to untangle how. For everyone there is a threshold of complexity, the understanding of which enables creating more complex things. Once this threshold is achieved a person can bootstrap their learning as an autodidact. Below it there is neither enough knowledge to guide their study or test their knowledge for correctness.
My capacity for understanding is often far lower than the bootstrapping threshold and my learning stalls as I fumble through tutorials gaining one small bit of knowledge without correlation until one day things begin to click into place. I then wonder how I learned in eight hours what had stumped me for eight months and feel profound appreciation for the wizards that understood it first. It’s like I was blind and they could see, and now through some miracle I also see. It’s real sight too, not just a description in words but the real sunrise seen for the first time by the formerly blind. In this case the misfitting pieces were the static algebraic types, monads, type classes, currying, and lazy evaluation of Haskell. It’s quite a bit to swallow just as the more familiar loops, conditional statements, functions, and variables once were. Many of these things are interdependent and cannot be understood separately. Jonathan Tang begins small and works from the ground up to a larger program, explaining each change so every line’s purpose is thoroughly understood. The culmination is a very small but capable scheme interpreter that is larger than what I would have understood without my building it. It is a perfect demonstration of the building-block technique of instruction I learned and used as a flight instructor.
Mr. Tang put the pieces together concisely. Wow.
Xmonad
December 13th, 2007
Xmonad is an awesome window manager and these are my complements as a fanatical user. The thing I love about Xmonad is that it just seems to get everything right. Wmii, my first favorite tiling WM, really got me interested and I found no fault with it until I used Xmonad.
Internally Xmonad manages focus with a zipper data structure. Though there are more naive approaches used in early versions it’s clear from maintainer Don Stewart’s blog that a lot of thought went in to my sanity as a user. The sorted order provided by the zipper allows for completely automatic management of my windows. For me, the order reduces my workspace to essentially one dimension. In Wmii I used vim key-bindings to move focus up/down/left/right. In Xmonad I use only two keys for this. In every other window manager Alt+Tab is a total pain due to the random order of the windows or the fact that they need to be manually sorted. Xmonad always inserts and deletes windows in a way which preserves order as if I were inserting or deleting lines from a text file. Better, Xmonad can be extended with, among other things, algorithms to arrange windows based on this order. There are tabbed, grid, and circular layouts to name a few. One interesting extension even arranges windows in a Fibonacci spiral.
There’s plenty of fud floating around about the inextensibility of purely functional programs but Xmonad proves otherwise, at least for Haskell’s case. I feel like a dog watching TV when viewing the source but I do know this; the extensions manage a variety of state and the code is concise. Oh, and the whole thing runs in a memory image a little larger than an instance of bash.
Special thanks to Spencer Janssen for authoring the project. Xmonad is doing everything just right so my Archlinux machine is polished like a Macintosh, only more useful.
Arch Linux vs Ubuntu
November 17th, 2007
Yes! Arch is great. For more detail read on.
Arch's Appeal
The main difference between Arch and Unbuntu is that Arch is designed to be molded whereas Ubuntu is a complete desktop/server distro. Like Ubuntu, most things were configured and worked upon install. Unlike Ubuntu, Arch does not have a default desktop environment although XFCE4 could be considered well supported. Arch will appeal to users who wish to start with a minimal system and a package manager, adding bit by bit until the system is tweaked to their desire.
Ubuntu's Appeal
Ubuntu is for those who just don't want to mess with it - giving users the choice of a server, or a Gnome / KDE / XFCE desktop. There is also a minimal (alternate) Ubuntu installer for those who want more control. I wont say Ubuntu 'wins' here or anywhere because it's apples and oranges. Yes - Arch vs. Ubuntu is a stupid comparison but if you're like me and you can't help but type X vs. Y into Google before doing anything then maybe identifying some differences isn't so bad. In short, Arch maintains bleeding-edge, occasionally broken, packages with no release cycle whereas Ubuntu needs upgrading (breaking) every six months. Ubuntu is good software with target markets. Arch is good software.
Package Management
Arch uses a rolling release cycle where major releases consist of a snapshot. For a personal computer, avoiding Ubuntu's six month upgrade cycle is a definite benefit. On the other hand it could cause breakage occasionally, which is why I still prefer Ubuntu (LTS) for my server. Arch packages tend to be more bleeding-edge from what I can tell browsing the repository. For example, the packaged version of GHC in Ubuntu 7.10 is 6.6.1 while Arch will install 6.8.1.
Pacman, the Arch package manager, is similar to apt. One thing I really like is that headers are bundled with library binaries. This makes sense since headers have a tight coupling to the code they describe. They can also be used as documentation when man pages are weak. With Ubuntu I had the option of not installing headers but that proved to be sort of a PITA when compiling or building a new development system. Ruby gems also uses gcc occasionally and it's nice to avoid the search for *-dev packages. pacman -S imagemagick && gem install rmagick just worked.
Post Install
Everything is installed, configured, and working as per the normal Ubuntu setup. Arch requires a few extra steps to get a desktop environment installed. Here's a few examples.
Xorg
Unfortunately I had to configure X. I say, 'unfortunately', because I hate configuring X. Best to just install Ubuntu on another partition and copy the xorg.conf. Currently I'm using the open-source ATI drivers since I couldn't get the proprietary ones to install properly. I'll be working on that later.
Sound
Running alsaconf asked a few questions and my sound card worked. Easy.
Suspend and Resume
Suspend and resume with swap and ram work great using the pm-utils package. I had to configure the resume scripts to run alsactl restore to get sound working after a restore but other than that it was configured perfectly upon install.
CPU Frequency Scaling
Ubuntu Forums had a great post on this topic. Since it's dealing with the kernel it works on Arch too. http://ubuntuforums.org/showthread.php?t=248867
In Total
I'll be using Arch for the foreseeable future, maybe even contributing if I'm ever capable of fixing something. The documentation on their wiki is outstanding and the community has momentum. In general, Ubuntu is great for the automated installation and configuration of everything to include the kitchen sink. Arch is great for a lightweight and fast system with a rolling release cycle and bleeding-edge software.
Boehm Garbage Collector
November 10th, 2007
Toying with the Nokia 770 has been teaching me more than I ever wanted to know about C, gcc, and the other things we do to write software for handhelds. I had known about the existence of garbage collectors for C and C++ but had always assumed they worked similar to GObject. A GObject programmer must call g_object_ref() and g_object_unref() to manipulate the reference count of each object on the heap. When the count reaches zero the space is reclaimed. A Boehm GC, rather, can function as a drop-in replacement for malloc() and free(). To my amazement the free() procedure isn’t required at all and can be replaced with a noop. The word “automagically” doesn’t even annoy me in this case since the Boehm GC is quite a trick. It can determine when an object on the heap is no longer required by comparing places where pointers usually live with it’s internal book. For example, a function that calls malloc usually sets a pointer on the stack to the allocated memory. When the stack shrinks the reference is gone and the dead memory can be automatically freed.
The main benefit, for me anyway, is that I can write consing functions in C. Any linked list implementation has a foreach method. Foreach is easy because it doesn’t have to keep the return value of its lambda argument. Map, filter, and almost every other good thing that comes from functional programming requires consing up a new list. Allocating the right amount of memory for each return value is doable. The difficult part is the bookeeping required to later free all the unreferenced lists. A functional program might map, filter, then reduce in a single line as it takes the GC for granted. Manual allocation is a better fit for building fewer, less disposable structures. I’m not sure I’ll use map that much with C but it would be a good way to test-drive the Boehm GC.
There has to be a catch. I’m still wide eyed that the Boehm GC even works. It might be a sign that Mozilla and X, the two biggest hogs on my system, use it.
Free Computer Science Courses
October 23rd, 2007
This is awesome: aduni.org
There are twelve courses complete with video lectures, handouts, problems, and exams. So far I’ve had time to view the first six lectures in Theory of Computation covering regular grammars, finite state machines, context-free grammars, pushdown machines, Chomsky Normal Form, and a whole bunch of things I’ve forgotten. I can’t yet wield lex or yacc like a pro but understanding some of the rigor behind scanning and parsing really opens the door to some neat things. Other courses cover discrete math, probability, networks, algorithms, memory, pipelining, object orientation, web programming, artificial intelligence, and of course the “Structure and Interpretation of Computer Programs”.
Shai Simonson is an excellent lecturer. In a way it’s like watching Jeopardy but instead of trying to guess the correct answer-question, I try to guess what Shai will write next on the blackboard. I’m sure the other courses are just as interesting but I’ve only skimmed their contents as of yet. The materials are licensed under Creative Commons.
Templates are Pure Functions
September 11th, 2007
It is often stated that it is a good idea to exclude code or functionality from templates. I have seen more than a few articles on the web entirely about how to approach this goal and the solutions are usually somewhat ad-hoc or arbitrarily restrictive. Some projects even seek to build a template language on top of the base language. We are typically left with the impression that there are no rules, or even good guidelines, by which to judge the amount of functionality performed by the template. Code seems like a necessary evil. We need it, but too much is bad and how much is a black art. Here is a simpler definition of a template: templates are pure functions.
A pure function obeys the following rules.- The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change as program execution proceeds, nor can it depend on any external input from I/O devices. This is known as referential transparency.
- Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.
The singular purpose of a template is to transform it’s input into something viewable. If output does not depend entirely on input then the template is doing something it shouldn’t. This is boring and obvious to experienced functional programmers but, hey, we’re not all experienced functional programmers. Despite the theoretical simplicity there’s always a catch in the real world.
Automatic loading of composed objects muddies the pure function perspective. In this contrived example there is a user model composed of an address model with a street attribute e.g. user.address.street. The common practice, at least in rails, is to build a template which accepts the user model as input and chains to attributes of associated models. Does the output of a template using object chaining depend only on input? That depends on whether or not the requisite models were loaded before evaluating the template. If template evaluation caused a database hit then the output would partially depend on what was in the database at that time, breaking the pureness of the template. Automatic object loading is a hidden complexity that speeds development. Associations should be preloaded during production with a database join to prevent superfluous queries, a practice consistent with templates as pure functions.
There is no need to create a non Turing-complete subset of a language to build a proper template, nor is there a need to eliminate code from a template. The mantra that code doesn’t belong in templates is harmful and wrong. Templates may contain any code so long as the template remains a pure function.