I've been trying to steal time to do some long-promised work on dsource.org recently.  It was precipitated by stonecobra donating a server for our dedicated use.  It's a monster, with dual CPUs, 8GB of memory and seriously fast mirrored disks.  This will get us out of the virtual host situation that, while an improvement over the duct tape and toothpicks computer and crappy home DSL, still leaves us wanting more memory.

Along those lines, I would like to be able to not only weather a full site crawl by one of the search engines, but to thrive during it.  This means some heavy caching using Tango.  stonecobra has offered to help, so that's sweet.  We will probably make the interface look a whole lot like memcached so the Django portion of the site is relatively seamless, and then we have to crawl through Trac code for some caching love.
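
For anyone who hasn't poked at memcached, the interface in question is tiny: get, set with an expiry, and delete.  A minimal sketch of that shape in Python follows.  It is not the Tango code; the `TinyCache` class, its in-process dict, the `render_page` helper and the 300 second timeout are all made up for illustration.

```python
import time


class TinyCache(object):
    """Hypothetical in-process stand-in for a memcached-style cache."""

    def __init__(self, clock=time.time):
        self._store = {}      # key -> (value, absolute expiry or None)
        self._clock = clock   # injectable so expiry can be exercised in tests

    def set(self, key, value, timeout=0):
        # As in memcached itself, a timeout of 0 means "never expire".
        expires = self._clock() + timeout if timeout else None
        self._store[key] = (value, expires)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires = entry
        if expires is not None and self._clock() >= expires:
            # Stale entry: drop it and report a miss.
            del self._store[key]
            return default
        return value

    def delete(self, key):
        self._store.pop(key, None)


def render_page(cache, path, build):
    """Cache-aside: serve from the cache, otherwise build the page and store it."""
    page = cache.get(path)
    if page is None:
        page = build(path)
        cache.set(path, page, timeout=300)  # assumed 5 minute lifetime
    return page
```

Anything that speaks get/set/delete can be swapped in behind `render_page`, which is the appeal of copying the memcached shape.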

It seems like getting this server has triggered a series of dependencies, but I realize they are of my own doing.  For instance, I refuse to install php / mod_php / phpBB on this server...  That means that the following need to be in place:

1.  Django registration:  done
2.  Profile management:  almost done (could do it in Django or reuse TracForums)
3.  Replace forums, and port all old forums, topics, posts, watch data, etc.: almost done

Pragma has resurfaced with a bit of prodding from me, and picked up TracForums development again.  We're both working evenings trying to get it polished, and fit into the new dsource software.  Once we do that, we convert all of the existing phpBB forum data into the new db tables, and deploy.  /me drools...

Here's a teaser:

 
 

Here's a brief recap of the events, and, time permitting, I'll go into more detail on some of them.

Day 1

Walter & Andrei talked about the future of D.
 * Struct ctors/dtors
 * some functional paradigm support
 * pure functions, which don't modify external data and are thus parallelizable
 * macros
Um... bad-ass  :-D

Sean talked about modularization of D, overloading the GC, etc.  Clearly this was way above my paygrade, and my head almost exploded on concepts that he eats for breakfast.  However, I am hopeful that Walter took a bit of notice of the nice separation and will look to improve his own house on this front.

Kris gave a great talk about slicing: holding onto references (slices) into the original hunk of memory, instead of creating a whole bunch of temp vars, keeps the GC from thrashing and avoids the need to allocate new (heap?) memory.  His war stories about slow Java programs and how he improved them by avoiding memory allocation and GC thrashing were a great setup for a monster feature in D: slicing.  Text processing was an obvious example, but the slice usage in his http server and clustering work made for even more powerful examples of this technique.  Oh, btw, he did most of this work *three years ago* in Mango.  It's just sitting there, waiting for someone to snatch it up w/ all its goodness.  Nice to have someone so obsessed with performance on our side.
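
A rough feel for the idea, sketched in Python rather than D: `memoryview` is standing in for D's built-in slices here, and `split_fields` and the sample request line are hypothetical, not anything from Kris's talk or from Mango.  Each field handed back is a window onto the original buffer, so no field data gets copied.

```python
def split_fields(buf, sep=b" "):
    """Return zero-copy views of the sep-separated fields in buf."""
    view = memoryview(buf)
    fields = []
    start = 0
    while True:
        end = buf.find(sep, start)
        if end == -1:
            # Last field runs to the end of the buffer.
            fields.append(view[start:])
            break
        fields.append(view[start:end])
        start = end + len(sep)
    return fields


request = bytearray(b"GET /projects/tango HTTP/1.1")
method, path, version = split_fields(request)

# Each field is a window onto `request`, so a change to the buffer
# shows through the slice without any re-parsing or copying.
request[5] = ord("P")
print(path.tobytes())
```

The trade-off is the usual one with slices: the views keep the whole original buffer alive for as long as any of them is still referenced.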

Gregor and I talked about DSSS and DSource, and how the two will become quite close.  dsss net install packages will be created with a Trac admin page and test results for all installed packages.  My dsource part was quite fluffy compared to the deeper topics earlier in the day, so I attempted to give stats and status infused with a bit of humor.

In talks with others at the break, my stance on adding only one DSCM, git, was soundly trounced, and I will most likely be adding Mercurial as well.  I won't support more than two, but these are well-liked, come out favorably in a couple of comprehensive comparisons, and have Trac plugins.  Don't freak out:  svn is *not* going away.

Day 2

Kirk presented Pyd with a whole lot of code snippets.  He blew through the talk in 15 min, and then was peppered with questions on the concepts.  Of particular note, Walter seemed to be digging for compiler enhancements, __traits improvements, and whatnot, that would make Kirk's life easier.  FYI, Pyd is geared toward extending Python with D code when performance is needed.

Don presented some of his work on high performance code generation, akin to Fortran's BLAS lib for linear algebra.  Use of D's (admittedly dangerous) metaprogramming facilities, like string mixins, allowed him to get into the deepest loops of the algorithm.  The entire talk was accessible for a layperson like me.  I'm not so much a physicist, but I did stay at a Holiday Inn Express last night.

Cristian presented work on his debugger, which looked quite nice.  Being able to stop execution and back up a step or two and having the state at that time be preserved could prove useful. 

BCS fried even more braincells with his template talk.  When you have a slide called 'chained recursion' with code that uses mixins of itself to recurse down, I think it's safe to say there's a wee bit of perversion going on with BCS, DMD, and the lang in general.  As an 'app dev' I'm not even going to pretend to have understood slides after this one ;)

I sat out Bartosz's talk on Software Transactional Memory, because I'm currently a 'Smug Erlang Weenie' and all about the message passing.  Plus my brain was full.

Walter and Andrei hopped back on the stage last, talking further about the future of the language:
 * AST macros (as opposed to C/C++ text macros)
 * new string literal notation
 * ongoing final/const/invariant work that was distilled down to list processing-like head/tail or car/cdr or first/rest type coverage of all pointer combinations.  Given the more robust definition of the problem, the joke was that another year of discussion was needed to 'get it right.'  Admittedly, it's important to get it right for parallelism and D world domination in general.
 * struct inheritance - interface enforcing presence of members
 * more things that I'll need to be reminded of, when the presentation is posted by braddr.

Walter mentioned that all of this could take a year or more to develop, but string literals and const work were already in some semblance of completion.

While the talks were fun, interesting, and brain-taxing, the best part for me was meeting all of you criminals that I had only chatted with online or on the phone.  It's quite early (still) in this language's life, but a crazy number of really smart people are hanging around, and that can only be a good thing.

Finally, Brad Roberts is a rockstar in my book.  He organized the whole conference and herded us kittens over and over, leading up to, and during the conference.  He was also willing to foot the bill for two lunches, one dinner, and all drinks & snacks during the presentations.  Happily, the powers at Amazon.com were so impressed with the obviously "real" and compelling conference being conducted in their building, that they agreed to reimburse Brad and contribute even more than the facilities to the conference.  Solid.

Well, that's it for a brief blog post. :-D

 
 

Kirk McDonald has been doing a fine job of maintaining the D lexer in Pygments, which is used to highlight syntax in dsource's wiki and code-browser.  And he gets some insane turn-around times from the Pygments SVN committers, unless he's one already.

In any case, a mere few hours after Walter had announced the new Traits feature, bringing a bit of compile-time reflection to D, Kirk had the highlighter updated.  http://www.dsource.org has been updated with this new version, so start using traits in your code.

 
 

I'm planning on getting a plumbing overhaul completed before the D Conference at the end of August.  The particulars involve using Django to serve everything on the site that isn't an individual project.  This will include the Home, Site, and even Project List pages.

In the past, I've used Trac to do this, and it resulted in some substantial modifications to the Trac codebase.  As they try to move to 0.11 and Genshi, I would like to get as much of this custom code factored out as possible.  I believe it is my mods that expose the regexp bug in Python's _sre.c file and cause the dsource server to hang every now and then.  So the switch to Django should be fairly positive.

In working with Django again, I remembered what a joy their template language is to use.  It doesn't incur the overhead of an XML parser, it supports template inheritance, and you can't call Python code or have side-effects at all.  These are good things, imho, and in line with this article, and with StringTemplate & SGTE.

I also plan on getting rid of phpBB (finally) and moving to Eric Anderton's TracForums plugin in each project.  This means that I'll need a new Authn/Authz subsystem and I think I'm going to use Django's.  In fact, I'll be using Django's 'user' model for auth in the dsource.org Trac plugins, and I might as well use dsource's 'projects' model in those plugins as well.  I'm pretty sure it will piss me off when I have to use Genshi instead of Django to do some things in Trac, but whaddayagonnado?

My target is to get something onto the server by the end of July and then get some enhancements to the Projects List page by the conference.  We'll see how much the newborn lets me get accomplished ;)

 
 

There has been a long-standing issue with the dsource server beginning to eat all the resources.  I happen to believe it's in my modifications to Trac that allow dsource to host multiple projects and such, and will get on to fixing that real soon now.  Probably upgrade to Trac 0.11 and Genshi while I'm at it.

Maide and I used gdb to track it down to a probable bug in _sre.c, the Python regexp code.  It seems that an endless loop happens here, basically hosing the Apache / mod_python process (or if compiled with threads, the thread).

Still, even though we may have found a bug in Python (and we're looking to upgrade to Python 2.5 because the changelog says some work has been done on this front), it's not good to have introduced the code to Trac that exposed this.

So, until I can get some time away from ${dayjob} the issue remains.  What to do?  How about coming up with a brutal hack that works, but is embarrassing.  This script basically parses 'uptime' and if the short and medium term load averages are over their thresholds, we stop Apache, wait for it to die, and then restart it.  OMFG:

#!/usr/bin/env python

import commands
import os, sys
from time import localtime, sleep, strftime

DEV = False
MAX_ATTEMPTS = 24
LOGFILE = "/var/log/restarts.log"

def send_oh_shit_mail():
    SENDMAIL = "/usr/sbin/sendmail" # sendmail location
    p = os.popen("%s -t" % SENDMAIL, "w")
    p.write("To: admin@dsource.org\n")
    p.write("Subject: dsource screwed!\n")
    p.write("\n") # blank line separating headers from body
    p.write(":(\n")
    sts = p.close()
    # close() returns None when sendmail exits cleanly
    if sts is not None:
        print "Sendmail exit status", sts

def stop_apache():
    result = commands.getstatusoutput("/etc/init.d/apache2 stop")
    print result


def wait_for_apache_to_die():
    cmd = "ps -ef | grep apache | grep -v grep | wc -l"
    count = 2
    attempts = 0
    while count > 1 and attempts < MAX_ATTEMPTS:
        attempts += 1
        result = commands.getstatusoutput(cmd)
        count = int(result[1])
        print "%s - apache instances: %s" \
            % (strftime("%a, %d %b %Y %H:%M:%S", localtime()), count)
        sleep(5) # seconds

    if count > 1:
        send_oh_shit_mail()

def start_apache():
    result = commands.getstatusoutput("/etc/init.d/apache2 start")
    print result

def write_to_log(msg):
    f = open(LOGFILE, 'a')
    try:
        f.write(msg)
    finally:
        f.close() # flush the entry even if the write blows up

def get_nums():
    uptime = commands.getstatusoutput("uptime")[1]
    nums = uptime[uptime.find("load average: ")+14:].split(", ")
    return [float(num) for num in nums]


def main():
    try:
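        # 1, 5 and 15 minute load averages ('long' would shadow the builtin)
        short, medium, long_term = get_nums()
        print short, medium, long_term

        # restart only when both the short and medium term loads are too high
        if short > 3 or DEV:
            if medium > 2 or DEV:
                dt = strftime("%a, %d %b %Y %H:%M:%S", localtime())
                write_to_log("restarting Apache: %s %s %s - %s\n" \
                    % (short, medium, long_term, dt))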
                stop_apache()
                wait_for_apache_to_die()
                start_apache()

    except Exception, e:
        sys.stderr.write("error: %s\n" % str(e))
        sys.exit(1)

if __name__ == "__main__":
    main()
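
The decision at the heart of the script above boils down to two small functions.  Here is a sketch of just that part, written for a current Python 3 rather than the 2.x the script runs on.  The names `parse_load_averages`, `should_restart`, `SHORT_LIMIT` and `MEDIUM_LIMIT` are invented for the example; the thresholds are the 3 and 2 from the script, and `force` plays the role of `DEV`.

```python
import os

SHORT_LIMIT = 3.0   # 1-minute load average threshold, as in the script
MEDIUM_LIMIT = 2.0  # 5-minute load average threshold, as in the script


def parse_load_averages(uptime_output):
    """Pull the three load averages out of a line of `uptime` output."""
    marker = "load average: "
    tail = uptime_output[uptime_output.index(marker) + len(marker):]
    return [float(num) for num in tail.split(", ")]


def should_restart(short, medium, force=False):
    """True when both the 1- and 5-minute loads are over their limits."""
    return force or (short > SHORT_LIMIT and medium > MEDIUM_LIMIT)


if __name__ == "__main__":
    # os.getloadavg() (Unix only) returns the same three numbers
    # without spawning uptime and parsing its output.
    short, medium, long_term = os.getloadavg()
    print(short, medium, long_term, should_restart(short, medium))
```

Keeping the threshold check free of side effects means it can be tried against canned `uptime` lines without touching Apache.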

 
First Post! 06/12/2007
 

Um, hi.

So it's mid-2007 and a self-proclaimed geek like me is just getting a blog?  What's up with that?  What the hell have I been doing?  Well, there have been a few things...

At Mirus, and at Sankaty before it was acquired, we have been building a SaaS Business Intelligence company.  I'm not sure when you stop calling yourself a startup.  I guess you stop when you're making your own money.  We've been doing that (only took 8 years), but maybe you drop the startup tag when you're making enough to pay everyone closer to market value than the sweatshop wages you have been paying.  In any case, the experience has been phenomenal, from technology to corporate structure and finance.  I have a wee bit to offer the world now, and hope to share intermittently in this blog.

On the side, I have been investigating new programming languages.  I came from the VB world, and was elated to 'get out.'  We web-enabled the product above with Java, and that was okay for the most part.  I briefly studied K&R to learn C, but then started to hit some languages that I really enjoyed.

 * the D Programming Language - I liked it enough to start the language's version of sourceforge.  See http://www.dsource.org.  I will also be speaking at the first D Conference.  Gotta think of something interesting to say...

 * Lisp - quite the eye-opener.  We attempted to rewrite Mirus' BI query engine in Lisp, but my developers' minds weren't as twisted as mine.  They felt more productive in Java, and I got the usual pressure from the higher-ups that it's easier to bring people off the street with Java skills than Lisp.  Pfft.

 * Python - great language, and I've totally dissected the Trac codebase while learning it.  Dsource uses Trac and a lot of custom plugins to operate all the projects it hosts.  Django is pretty bad-ass as a web framework as well, and I hope to contribute some code (enhanced db backend) to that project real soon.

 * Erlang - wow, this language has just been sitting there, waiting patiently with all its goodness, for someone to come along and do amazing things with it.  That someone will hopefully be me ;)  Concurrent, fault tolerant, distributed, functional...  /me drools.  So far, there's one web server, Yaws, that can kick the crap out of Apache.  I'm wondering if Erlang is as well suited as it appears to be for a cometd server.  That'd help out some people with traffic between their servers and the browser, eh?  All while saving them hardware moolah because of the stupid number of processes Erlang can run.  Maybe if I do this for someone, they get to pay the developers more?  Maybe if I use it myself, I get to break even faster?

Finally, and most importantly, the family has been growing over the past few years.  Nora is almost three, and she gets a baby sister in about two weeks.  gulp...

Wow, so I'm new to blogging, and this could have been divided up into 18 different posts.  That's coming...  I just had to braindump to get started.

Cheers