Pike Conference 2011

This year's Pike conference was held at Roxen's offices in Linköping, Sweden, from November 3 to November 5. Participants joined from Europe and North America. Among the topics discussed were a new maintainer for Pike releases, a change in the release versioning, and the new garbage collector that is part of the multi-cpu support being worked on.
This is a report of the conference by Martin Stjernholm.

Grubba guided us through the list of changes in 7.9 since the last conference.
Tobias and Arne talked about the need for a preprocessor pass before the precompiler translates .cmod files to C.
After some discussion we arrived at the idea of making it possible to specify a prefix to the pike cpp() preprocessor. It would then recognize directives like #cmod_include, #cmod_define, #cmod_if, etc, so that it's easy to tell the two preprocessor passes apart while still keeping to the classic C syntax.

It must also be possible to disable builtin macros like __FILE__, __NT__, etc, so that they aren't expanded in the wrong pass. Also, some kind of magic define is needed to "quote" identifiers to keep them from expanding in the first preprocessor pass (had no good ideas for a name for that one, though).

Arne and/or Tobias will make an attempt at this, which is what remains to be solved before they can merge in their new CritBit module.

(One more thing that I thought about only now is that the pikedoc extractor ought to do the same cpp pass as well, and then also make it possible to get macro expansions inside the documentation somehow.)
Tobias and Arne asked about the rebase policy in the git repo, and it was clarified that although rebasing rather than merging is the usual everyday method, it is not an absolute requirement. Branches and merges are fine when side projects are ongoing for some time and/or involve more than one person.
Bill talked about his attempts to separate resolver contexts, to be able to run different versions of an application in the same pike process. He would try to solve this with separate compilation handlers, similar to how the #pike compat system works.
At eMBee's request we discussed the third party modules on gotpike etc, and how they could be disseminated better. Bill had some concerns about whether many of those modules are actually of a quality that merits wider distribution. I said that the modules that are decent should make it into the core dist, where they would be much more accessible.
Zino talked about the release process and the need for a replacement for himself. Bill agreed to be the new release coordinator while I took over the task of fixing Windows builds.
We hope that releases will get more frequent again now, with a new 7.8 release the short-term priority.
As for releases, we also talked about stepping up the versioning speed, so that majors, minors and builds again work as originally intended, and not like today when the minor is practically a major, and the major never changes at all. That means every release would increment the minor version, and there should be around 2-4 per year.
Even so, we're satisfied with having only two branches like today, so with this thinking the current branches would become simply 7 and 8 for the major versions, and on 7 the minor versions for each release would be part of the tags.

There can still be a need to make small fairly short-lived branches after a release, to fix serious bugs while avoiding new features already on the main branch. They can be forked afterwards as necessary, but the dist building tools may need some tweaking to make it easy to build dists from other branches (or really any commit).

The same goes for Xenofarm as well - it would be very useful to request a Xenofarm run on any commit, even if it isn't the HEAD of one of the major branches, or maybe not part of them at all. That would allow (trusted) people to try out their topic branches with Xenofarm.
Zino talked about what should be done to improve the pike packaging. The packaging in Debian and derivatives is mostly fine, but we need to work on getting packages into the rpm-based dists as well. No one stepped forward to work on that, though.
We also talked about moving to github.com. The problem with that is that they do not permit commit hooks. We should try to improve the Pike presence there anyway. There's also the problem of finding a decent issue tracker to replace the old one at community.roxen.com. Launchpad and Trac were discussed.
Per showed a small but nifty addition that hooks in lambdas as defines in the preprocessor, which makes it possible to implement "magic" defines from Pike.
Per also mentioned his work on using builtin crc instructions on modern x86 cpus to speed up the string hashing. Would be a useful addition as well.
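As a rough illustration of the idea (not Per's actual code), the sketch below hashes a byte string with the CRC32C instruction when the compiler targets SSE4.2, and falls back to a bitwise software loop that computes the same value otherwise. The function name crc32c_hash and the choice of the conventional all-ones initial value and final inversion are assumptions made for this example; how Pike would seed or combine the hash is not covered here.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE4_2__)
#include <nmmintrin.h>  /* _mm_crc32_u8 */
#endif

/* Hypothetical helper for illustration; not Pike's string hash. */
static uint32_t crc32c_hash(const unsigned char *s, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;  /* conventional CRC32C start value */
  size_t i;

#if defined(__SSE4_2__)
  /* Hardware path: one crc32 instruction per input byte. */
  for (i = 0; i < len; i++)
    crc = _mm_crc32_u8(crc, s[i]);
#else
  /* Software path: same polynomial (Castagnoli, bit-reflected form). */
  for (i = 0; i < len; i++) {
    int bit;
    crc ^= s[i];
    for (bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
#endif

  return ~crc;  /* conventional final inversion */
}
```

A real implementation would presumably consume 4 or 8 bytes per instruction rather than one, which is where most of the speedup would come from.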
Grubba said a few words about his plans to condense the svalue struct into a single pointer, in which a bit flag would be used to tell pointers and integers apart. Floats would become structs passed by reference instead.
Per was concerned that the native integers get shortened by one bit, and we discussed an idea to introduce a struct for integers that are too large but still fit within the hardware size. That struct could then hold an unsigned integer so that full 32/64 bit unsigneds could be handled without resorting to full bignum objects. I am however not convinced that this intermediary form is worth the added complexity.
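To make the one-bit trade-off concrete, here is a minimal sketch of a single-word value where the lowest bit tells integers and pointers apart. All names (tagged_t, TAG_INT, tagged_from_int, and so on) are hypothetical and chosen for this example only; this is not Grubba's design, just the general technique under the assumption that heap pointers are at least 2-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-word value: low bit set means "integer". */
typedef uintptr_t tagged_t;

#define TAG_INT 1u

/* Floats no longer fit inline, so they would live in a boxed struct. */
struct boxed_float { double value; };

static int tagged_is_int(tagged_t v) { return (v & TAG_INT) != 0; }

/* One bit goes to the tag, so only half the native range fits inline. */
static int tagged_int_fits(intptr_t i)
{
  return i >= INTPTR_MIN / 2 && i <= INTPTR_MAX / 2;
}

static tagged_t tagged_from_int(intptr_t i)
{
  assert(tagged_int_fits(i));
  /* Shift as unsigned to avoid signed-overflow issues, then set the tag. */
  return ((uintptr_t)i << 1) | TAG_INT;
}

static intptr_t tagged_to_int(tagged_t v)
{
  assert(tagged_is_int(v));
  /* With the tag cleared the word holds exactly 2*i, so halving is exact. */
  return (intptr_t)(v & ~(uintptr_t)TAG_INT) / 2;
}

static tagged_t tagged_from_ptr(void *p)
{
  /* Assumes aligned pointers, so the low bit is free to use as the tag. */
  assert(((uintptr_t)p & TAG_INT) == 0);
  return (uintptr_t)p;
}

static void *tagged_to_ptr(tagged_t v)
{
  assert(!tagged_is_int(v));
  return (void *)v;
}
```

Values outside the range accepted by tagged_int_fits are exactly the ones that would need either the intermediate integer struct discussed above or a bignum.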
I talked about the new garbage collector that is part of my multi-cpu plan, what speedups it would bring (which I believe will be significant even without the multi-cpu change), and what problems there are in realizing it.
All this is basically as outlined in the repo, issues "Garbage collector", "Immediate destruct/free when refcount reaches zero", "Micro-gc", "Single-refcount optimizations", "Weak ref garbage collection", "Garbage collection and external references", and partially "C module interface".

We discussed in particular the problem with immediate destruct/free on function exit, i.e. the common idiom that code like this:
```
    void foo() {
      Thread.MutexKey my_lock = my_mutex->lock();
      ... do some work ...
    }
```
releases my_lock immediately on function return when it loses its stack reference.

There is still no really satisfactory solution for it; the GC overhead to handle it is significant (although not more so than with the current full refcounting approach), solving it through compiler analysis appears to be quite complicated, and leaving it as a change that pike programmers have to adapt their code to is of course also problematic.

The next steps to pave the way for a new GC are:
1. Add some macros that encapsulate the pointer and ref handling on the C level. All C and cmod code needs to adopt these to allow the necessary changes later on. (The same goes for the shortened svalues that Grubba works on.)
  To be clear: There will be C level changes. All third party module developers will need to adapt their code. Once adapted, the code will be source level compatible with both current pikes and future versions.
2. Add new array functions to allow changing the length of arrays destructively (like the push, pop and splice functions in JavaScript). Necessary since code like the following gets very hard to optimize to the destructive append it currently performs:
```
    array my_array = ({});
    while (...) my_array += ({another_element});
```
3. Similarly, string appending code gets hard to optimize in the destructive single-ref case. That's possible to work around with String.Buffer, but that's also slightly inefficient due to the extra function calls. It should be possible to fix the compiler to generate code that works on unfinished strings instead.
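The first step in the list above could look roughly like the following sketch. The names (struct thing, THING_ADD_REF, THING_FREE_REF) are invented for this example and are not the macros that will actually be added. The point is only that module code goes through the macros and never touches the refs field directly, so the implementation behind them can change later without another round of source changes.

```c
#include <stdlib.h>

/* Hypothetical refcounted object; stands in for any Pike heap struct. */
struct thing {
  int refs;
  int payload;
};

static int things_freed = 0;  /* only here to make the effect observable */

static void really_free_thing(struct thing *t)
{
  things_freed++;
  free(t);
}

/* The encapsulation layer: the only code that knows how refs are kept. */
#define THING_ADD_REF(T) ((void)((T)->refs++))
#define THING_FREE_REF(T) do {                  \
    struct thing *t_ = (T);                     \
    if (--t_->refs <= 0) really_free_thing(t_); \
  } while (0)

static struct thing *new_thing(int payload)
{
  struct thing *t = malloc(sizeof *t);
  if (!t) return NULL;
  t->refs = 1;  /* the caller owns the initial reference */
  t->payload = payload;
  return t;
}

/* Module-style code: holds its own reference while it works. */
static int use_thing_twice(struct thing *t)
{
  int result;
  THING_ADD_REF(t);
  result = t->payload * 2;
  THING_FREE_REF(t);
  return result;
}
```

With today's full refcounting the object is freed as soon as the last THING_FREE_REF runs; under a different GC scheme the same macros could expand to something else entirely.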
Comstedt and I looked into the problem with a buggy "@propagated_variables@" configure expansion that plagues some Monger modules. It turned out to be caused by too many build files coming as part of the module archives, which incorrectly override those from the installed pike.
One such file is aclocal.m4, which should not be distributed with the modules. In general, none of the files that are in the Pike include directory should be in the module tar.

Another problematic file is the generated configure script, which if sufficiently old may be inconsistent with the build files in the installed pike. Not distributing the configure script solves that, but on the other hand requires the user to have autoconf installed. This is probably mostly a problem with modules packaged for older pike versions.

Using the newfangled module packaging with completely standalone configure.in, Makefile.in, etc, should also avoid these problems.