All entries tagged "type=post".


PHP Snippet: Get all permutations

<?php

/**
 * Returns all permutations with repetition (n-tuples) that can be built from
 * $values, each containing $n elements, using a "draw and place back" algorithm
 *
 * The resulting array will always have pow(count($values), $n) entries.
 *
 * For
 *   $values = array('a', 'b') and $n = 2,
 * the result will contain:
 *   [aa, ab, ba, bb]
 *
 * @param array $values Vector to generate permutations of
 * @param int $n Elements per permutation
 * @return array Possible permutations
 */
function getPermutations(array $values, $n)
{
    $rec = function (array $values, &$ret, $n, array $cur = array()) use (&$rec)
    {
        if ($n > 0) {
            foreach ($values as $v) {
                $newCur = $cur;
                $newCur[] = $v;
                $rec($values, $ret, $n - 1, $newCur);
            }
        } else {
            $ret[] = $cur;
        }
    };

    $ret = array();
    $rec($values, $ret, $n);

    return $ret;
}



$values = range('a', 'd');
$n = 2;

$ret = getPermutations($values, $n);

foreach ($ret as $r) {
    foreach ($r as $d) {
        echo $d;
    }
    echo '<br>';
}
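
The recursion above can also be unrolled into a loop. The following is a minimal sketch and not part of the original snippet; the function name getPermutationsIterative is made up for illustration. It extends every partial tuple by every value, one position at a time, and should yield the same pow(count($values), $n) entries in the same order as the recursive version.

```php
<?php

/**
 * Iterative variant (hypothetical name): builds all n-tuples with repetition
 * by extending every partial tuple with every value, one position at a time.
 */
function getPermutationsIterative(array $values, $n)
{
    $result = array(array()); // start with a single empty tuple

    for ($i = 0; $i < $n; $i++) {
        $next = array();
        foreach ($result as $prefix) {
            foreach ($values as $v) {
                $next[] = array_merge($prefix, array($v));
            }
        }
        $result = $next;
    }

    return $result;
}

$tuples = getPermutationsIterative(array('a', 'b'), 2);

$strings = array();
foreach ($tuples as $tuple) {
    $strings[] = implode('', $tuple);
}

echo implode(', ', $strings), "\n"; // prints: aa, ab, ba, bb
```

For $n = 0, both variants return a single empty tuple, which is consistent with pow(count($values), 0) being 1.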

Nov 18 2010 • by Marc Ermshaus • type=post language=en php • 0 comments

php.de Mitmachquiz: Guestbook and contact form

Some time ago, the php.de forum ran an experiment called the "Mitmachquiz" (roughly: "join-in quiz") that was meant to give beginning programmers in particular a chance to implement a given mini project on their own. The idea was to discuss the various solutions publicly, in the style of code reviews, in order to work out a kind of "optimal" approach to a frequently requested topic on the one hand, and to spot and correct common mistakes on the other. This was supposed to help the participating users write better code and to create a resource that other users could be pointed to in the future.

The two join-in projects carried out so far are a guestbook and a contact form.

Unfortunately, the response was limited overall, and particularly so among beginners. As a result, almost all submitted solutions come from more advanced programmers and give correspondingly little cause for discussion, which of course defeats the original purpose.

Mainly to support the initiative, I took part in both projects and submitted one variant each. There is a separate discussion thread for the guestbook; the contact form submission is part of the corresponding main thread. In both cases, the source code is available as a repository on Bitbucket (details in the linked forum posts).

Neither project is a masterpiece and both leave plenty of room for improvement, but they might still be worth a look for one beginner or another. The same naturally goes for the two main threads on the project tasks and for the other participants' solutions.

For the contact form project, I additionally tried to write more detailed documentation for the code. It is available as a precompiled PDF document and is also included in the repository in source form (LaTeX). Content and source code may be freely reused under the CC BY-SA license. The current version is mainly aimed at interested beginners who have a solid grasp of the basics and now want to get into the more theoretical aspects of implementing a project.

All in all, it is a pity (though not particularly surprising) that the initiative was received rather poorly. Offers like this definitely deserve support.

Nov 12 2010 • by Marc Ermshaus • type=post language=de php • 0 comments

Developers Shame Day: You've Come a Long Way, Baby

For today, November 3, Cem Derin has scheduled the Developers Shame Day (or, as I call it, DSDS without the Superstar). On this day, PHP programmers get the chance to release a particularly gruesome piece of code from quarantine and present it publicly on the internet, for mutual amusement and, in a way, as a warning to future generations. Sounds tempting, doesn't it?

I am still wondering what he is trying to imply with this call.

Dramatic pause.

Well, anyway, in a reason-defying outburst of collective masochism, this completely absurd and far-fetched idea has nevertheless met with general approval, so code trees everywhere should be in full bloom with howlers today. How lovely.

I personally, having of course never written a single line of bad code since my QBasic days, had to go through (in all modesty) several days' worth of past commits to find candidates that could have joined this billowing November sea of blossoms. However, I don't want to present code from the day before yesterday, but parts of a class that was created in 2005 according to the repository and sometime in the late 1930s according to my gut feeling.

The job of this class, which bears the telling name CParser, is (of course) not to parse C code, but to convert a heavily BBCode-flavored markup language into neat HTML markup. For the time being, I would like to describe the exact workings of the class as the 700/0 algorithm, as in 700 lines of code and what feels like 0 lines of comments.

Before getting to the dirtier details of the implementation, though, I can offer a brief ray of hope in the form of an input/output example.

Input:

[h1]Hallo Welt[/h1]

Hier ist ein Absatz.[fn]

[fnt]Das hier ist eine Fußnote.[/fnt]

Hier ist noch ein [url=http://example.org]Absatz[/url].

Output:

<h1>Hallo Welt</h1>

<p>Hier ist ein Absatz.<a class="footnote" href="#fn0-1" title="Zu Fu&szlig;note 1 springen"><span class="hide">[</span>1<span class="hide">]</span></a></p>

<p>Hier ist noch ein <a href="http://example.org" title="Zur Seite &quot;http://example.org&quot; wechseln">Absatz</a>.</p>

<div class="footnotes">
  <ul>

    <li><a name="fn0-1" class="footnote" id="fn0-1"><span class="hide">[</span>1<span class="hide">]</span></a> Das hier ist eine Fußnote.</li>
  </ul>
</div>

Well, that makes the WordPresseseses of this world look old; we have certainly seen much worse. The parser class adds <p> tags on its own and features an exquisite footnote mechanism that allows the footnote position [fn] to be written at a different place in the source than the actual footnote text [fnt]. I, for one, am deeply impressed by my former self.

I would be even more impressed by that former self, however, if the whole thing also worked for those inputs for which it does, in fact, not work. Unfortunately, that is pretty much all of them. If the input does not follow a precisely balanced order, or if the BBCode structure is actually malformed, something goes wrong somewhere in those 700 uncommented lines of code and the generated HTML output becomes funny and artful to look at, but sadly not correct.

Three major failings can be derived from this.

  1. No error handling of any kind was written. The code simply contains no logic for it. There are no exceptions and no error return values. As soon as an unexpected state occurs, a clearly defined something-or-other happens.
  2. The code was also not structured in any meaningful way. The CParser class is basically a God object that does nothing more than wrap an essentially procedural programming style in an object. Strictly speaking, this object only serves as a namespace, not as part of an OOP hierarchy. Features such as the footnote logic were hacked into the existing code after the fact. They are poorly integrated accordingly and bloat the existing methods with special cases instead of conforming to any defined interface.
  3. All of this makes the class nearly untestable, because individual functions cannot be examined in isolation from the whole. In addition, some methods, at 80 or even 130 lines full of if constructs, are a clear indicator of where functionality should be abstracted into further classes.

These three points (four, counting the missing comments mentioned earlier) are a reliable way to drive any code into a wall at the design level once a certain degree of complexity is reached. No matter how good and bug-free the algorithms are, at some point the code structure collapses under its own weight, so to speak, because the overview is lost and because bugs become very hard to even locate, let alone fix.

To illustrate the misery, here is the ParseEx method, which forms the outer wrapper of the markup parser, in its original version from 1930. The rest of the code looks roughly the same.

/*
*/
function ParseEx($s) {
    $this->PrepareString($s);
    $ret = "";
    $i = 0;

    if ($s == "") return;

    $this->GetNextTag($s, $i, $j, $tag);

    // Add <p> if source contains no tags or does not start with outline tag
    if ((!($j == 0)) || ($j === FALSE) )$ret .= "\n\n<p>";
    else if ((!($j === FALSE)) && (array_search($this->GetTagName($tag), $this->m_outline_tags) === FALSE))
      $ret .= "\n\n<p>";

    while (!($j === FALSE)) {
        $tag_content = "";
        $has_content = FALSE;
        $previous_text = trim($this->FormatString(substr($s, $i, $j - $i)));
        $ret .= $this->FormatString(substr($s, $i, $j - $i));
        if ($this->GetIsTag($tag)) {
            if (!($this->GetIsClosingTag($tag))) {

                /*
                    Opening Tag
                */

                if (!(array_search($this->GetTagName($tag), $this->m_outline_tags) === FALSE)) {

                    // Add </p> if opening outline tag does not follow closing outline tag
                    if ((!($j == 0)) && ((array_search($previous_tag_name, $this->m_outline_tags) === FALSE)||(!($previous_text == ""))))
                    {
                        $ret = trim($ret);
                        $ret .= "</p>\n\n";
                    }

                    $tag_content = $this->GetTagContent($s, $j, $tag);
                    $has_content = TRUE;
                    $this->DebugAdd("Tag Content: $tag_content\n");
                }
                elseif ($this->GetTagName($tag) == "fnt") {
                    $tag_content = $this->GetTagContent($s, $j, $tag);
                    $has_content = TRUE;
                    $this->DebugAdd("Tag Content: $tag_content\n");
                }
                elseif (array_search($this->GetTagName($tag), $this->m_single_tags) === FALSE) {
                    $this->AddToStack($tag);
                }

                $ret .= $this->AddCode($tag, TRUE, $tag_content);

                if (($has_content) && (!($this->GetTagName($tag) == "fnt"))) {
                    /* Add Paragraph: Closing outline tag but not fnt */

                    // Get next tag
                    $this->GetNextTag($s, $j, $tag_pos, $next_tag, FALSE);

                    $next_tag_name = $this->GetTagName($next_tag);

                    if (!($tag_pos === FALSE))
                    {
                        $str_between = substr($s, $j, $tag_pos - $j);
                        $str_trimmed = ltrim($str_between);

                        if (  (!($str_trimmed == "")) || (array_search($next_tag_name, $this->m_outline_tags) === FALSE) )
                        {
                            $ret .= "\n\n<p>";
                            $j += strlen($str_between) - strlen($str_trimmed);
                        }

                    }
                    else
                    {
                        $str_trimmed = trim(substr($s, $j));
                        if (!($str_trimmed=="")) $ret .= "\n\n<p>";
                    }
                }

            } else {
                /*
                    Closing Tag (with closing tag behaviour)
                */

                $ret .= $this->AddCode($tag, FALSE);
                $this->RemoveFromStack($tag);
            }
            $previous_tag_name = $this->GetTagName($tag);
        }
        else
          $ret .= $this->FormatString($tag);

        if (!$has_content)
          $i = $j + strlen($tag);
        else
          $i = $j;

        $this->GetNextTag($s, $i, $j, $tag);
    }

    // Append the remaining text and, if necessary, close the last paragraph

    $str_end = substr($s, $i);
    if ((!($str_end == "")) && ($this->GetIsOutlineTag($previous_tag_name)))
    {
        $i = 0;
        while ((($str_end[$i] == "\n") || ($str_end[$i] == "\r")) && ($i < strlen($str_end)) )
          $i++;
        $str_end = substr($str_end, $i);
    }

    if (!trim(($str_end == "")))
    {
        $ret .= $this->FormatString($str_end);
        $ret = trim($ret);
        $ret .= "</p>\n\n";
    }
    else if (!$this->GetIsOutlineTag($previous_tag_name))
    {
        $ret = trim($ret);
        $ret .= "</p>\n\n";
    }

    /*
        BAD SOLUTION FOR PARAGRAPH PROBLEM
    */
    $ret = $this->SolveParagraphs($ret);

    return $ret;
}

Crystal clear, right?

Code that has grown organically like this usually resists any attempt at pruning very successfully and cheerfully keeps sprawling in all directions. Most of the time, the only remedy is the radical one: redesigning the affected part of the program from scratch and, hopefully, with more thought.

Some pointers:

  • A method should fulfill one clearly defined function. It should produce an unambiguous output for an unambiguous input and cause as few side effects (such as manipulating instance variables) as possible.
  • Any code that occurs more than once should be extracted into a method. Don't repeat yourself. The Software-Entwickler Blog recently published an article on this.
  • Functionality of the same kind should be moved into additional classes. The example code above, for instance, should contain a call like $tag->getName() instead of $this->GetTagName($tag).
  • Comments and, above all, tests designed early on help keep the scope of a method or class from getting out of hand, because they force the programmer to reflect on the code they have written.
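
To make the $tag->getName() idea a little more concrete, here is a minimal sketch of what such a class might look like. The class name Tag, its methods, and the regular expression are assumptions made up for illustration; none of it is taken from the original CParser code. The sketch also rejects malformed input with an exception instead of silently continuing.

```php
<?php

/**
 * Hypothetical value object for a single BBCode tag such as
 * "[url=http://example.org]" or "[/h1]".
 */
class Tag
{
    private $name;
    private $closing;

    public function __construct($raw)
    {
        // Optional slash, tag name, optional "=value" part
        if (!preg_match('~^\[(/?)([a-z0-9]+)(?:=[^\]]*)?\]$~i', $raw, $m)) {
            throw new InvalidArgumentException('Not a tag: ' . $raw);
        }

        $this->closing = ($m[1] === '/');
        $this->name    = strtolower($m[2]);
    }

    public function getName()
    {
        return $this->name;
    }

    public function isClosing()
    {
        return $this->closing;
    }
}

$tag = new Tag('[/h1]');
echo $tag->getName(), ' ', var_export($tag->isClosing(), true), "\n"; // prints: h1 true
```

Because the class has no dependencies on the rest of the parser, it can be tested in isolation, which is exactly what the original God object did not allow.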

Right, I am now going to punch this code into a punch card and send it to the Nixdorf museum or, alternatively, throw it into the nearest duck pond.

Nov 3 2010 • by Marc Ermshaus • type=post language=de php • 0 comments

PHP/XSL Snippet: Replacing line breaks with <br>

<?php

$xmlCode = <<<EOT
<test>A line break...
...and another one...
...and that's enough.</test>
EOT;

$xslCode = <<<'EOT'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">  

  <xsl:template name="replace">
    <xsl:param name="string"/>
    <xsl:param name="from"/>
    <xsl:param name="to"/>
    <xsl:choose>
      <xsl:when test="contains($string, $from)">
        <xsl:value-of select="substring-before($string, $from)"/>
        <xsl:copy-of select="$to"/>
        <xsl:call-template name="replace">
          <xsl:with-param name="string"
                          select="substring-after($string, $from)"/>
          <xsl:with-param name="from" select="$from"/>
          <xsl:with-param name="to" select="$to" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$string" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="test">
    <p>
      <xsl:call-template name="replace">
        <xsl:with-param name="string" select="."/>
        <xsl:with-param name="from" select="'&#xA;'"/>
        <xsl:with-param name="to"><br /></xsl:with-param>
      </xsl:call-template>
    </p>
  </xsl:template>
  
</xsl:stylesheet>
EOT;

$xmldoc = new DOMDocument();
$xmldoc->loadXML($xmlCode);
$xsldoc = new DOMDocument();
$xsldoc->loadXML($xslCode);

$proc = new XSLTProcessor();
$proc->importStyleSheet($xsldoc);
$tmp = $proc->transformToDoc($xmldoc);

#header('content-type: text/plain');
echo $tmp->saveXML($tmp->documentElement);

Output (HTML):

A line break...
...and another one...
...and that's enough.

Output (source):

<p>A line break...<br/>...and another one...<br/>...and that's enough.</p>

Oct 18 2010 • by Marc Ermshaus • type=post language=en php • 0 comments

PHP Snippet: Custom error handler

<?php

class ErrorController
{
    // No type hint: as of PHP 7, the handler may also receive Error objects
    public function errorAction($exception)
    {
        echo '<h2>There was an error</h2>'
           . '<pre>' . $exception . '</pre>';
    }
}

$err = new ErrorController();

// Reroute all errors to our controller
set_exception_handler(array($err, 'errorAction'));
set_error_handler(
    function ($errno, $errstr, $errfile, $errline)
    {
        throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
    });



// Let's test it out

error_reporting(-1);

try {
    throw new Exception('Gonna catch this');
} catch (Exception $e) {
    // Catching errors works in the normal way
}

// Test 1 (default exception, comment the following line to enable Test 2)
throw new Exception('Something went wrong');

// Test 2 (default warning, will be transformed to an Exception and handled by
//         the controller; note that as of PHP 8, 1/0 throws a
//         DivisionByZeroError directly instead of raising a warning)
echo 1/0;
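
Once warnings are converted into ErrorException instances, they can also be caught locally like any other exception. The following is a small self-contained sketch of that behaviour; it uses trigger_error() with a made-up message as a stand-in for a real warning and is not part of the original snippet.

```php
<?php

// Same conversion idea as above: turn every PHP error into an ErrorException
set_error_handler(function ($errno, $errstr, $errfile, $errline) {
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
});

$caught = null;

try {
    // Raises a user-level warning, which the handler turns into an exception
    trigger_error('Disk almost full', E_USER_WARNING);
} catch (ErrorException $e) {
    $caught = $e;
}

restore_error_handler();

echo get_class($caught), ': ', $caught->getMessage(), "\n";
// prints: ErrorException: Disk almost full
```

The original severity stays available through $caught->getSeverity(), so a controller could still treat notices and warnings differently if needed.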

Oct 8 2010 • by Marc Ermshaus • type=post language=en php • 0 comments

Greasemonkey: Auto-"read on one page" for news sites

Here's a very basic Greasemonkey script that enables the "read on one page" feature by default for a number of news sites. It currently works with nytimes.com and zeit.de. (Yes, that's two. Yes, that's not a huge number.)

// ==UserScript==
// @name           read on one page
// @namespace      http://ermshaus.org/
// @include        *
// ==/UserScript==

(function()
{
    function getParam(name)
    {
        var regexS = "[\\?&]" + name + "=([^&#]*)";
        var regex = new RegExp(regexS);
        var tmpURL = window.location.href;
        var results = regex.exec(tmpURL);

        if (results == null) {
            return "";
        }

        return results[1];
    }

    if (window.location.hostname === 'www.zeit.de') {
        if (getParam('page') === '') {
            var snapshot = document.evaluate(
                '//a[@href="?page=all"]',
                document,
                null,
                XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
                null
            );
            if (snapshot.snapshotLength >= 1) {
                window.location.href += '?page=all';
            }
        }
    } else if (window.location.hostname === 'www.nytimes.com') {
        if (getParam('pagewanted') === '') {
            var snapshot = document.evaluate(
                '//a[contains(@href, "pagewanted=all")]',
                document,
                null,
                XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
                null
            );
            if (snapshot.snapshotLength >= 1) {
                window.location.href += '&pagewanted=all';
            }
        }
    }
})();

Install via Greasemonkey auto-detection.

There's a lot of room for improvement. For instance, the redirect should be executed earlier than the time at which Greasemonkey runs (the "DOM ready" event). That would probably need to be done with a real Firefox extension. Besides that, the XPath expression that is used to detect whether there's a "read on one page" version is quite hacky and will break easily. There are probably much better scripts around; I didn't check.

Nevertheless, it's much more useful than I thought.

I might update the script and this post at some point in the future, but don't hold your breath.

Sep 24 2010 • by Marc Ermshaus • type=post language=en • 0 comments

PHP: A helper function for rendering calendar table layouts

/**
 * Returns the table layout of a given month as an array of rows.
 * 
 * <p>Padding will be added as necessary.</p>
 *
 * <p>Example: Return value for May 2010</p>
 *
 * <pre>
 * array(
 *     array(null, null, null, null, null,    1,    2),
 *     array(   3,    4,    5,    6,    7,    8,    9),
 *     array(  10,   11,   12,   13,   14,   15,   16),
 *     array(  17,   18,   19,   20,   21,   22,   23),
 *     array(  24,   25,   26,   27,   28,   29,   30),
 *     array(  31, null, null, null, null, null, null)
 * )
 * </pre>
 *
 * @param   int $year         Year to generate the layout of.
 * @param   int $month        Month to generate the layout of.
 * @param   int $firstWeekday Number of columns by which the Sunday-first layout
 *                            is shifted to the right (0 = Sunday first,
 *                            6 = Monday first). Defaults to Monday first.
 * @throws  InvalidArgumentException
 * @author  Marc Ermshaus <marc@ermshaus.org>
 * @version 2010-05-13
 * @example http://ermshaus.org/2010/05/php-a-helper-function-for-rendering-calendar-table-layouts
 * @return  array
 */
function getMonthLayout($year, $month, $firstWeekday = 6)
{
    $year         = (int) $year;
    $month        = (int) $month;
    $firstWeekday = (int) $firstWeekday;

    if ($month < 1 || $month > 12) {
        throw new InvalidArgumentException(
                '$month has to be between 1 and 12.');
    }

    if ($firstWeekday < 0 || $firstWeekday > 6) {
        throw new InvalidArgumentException(
                '$firstWeekday has to be between 0 and 6.');
    }

    $data = array();
    
    $dt = new DateTime($year . '-' . $month . '-01');

    $t = $dt->format('t');
    $w = $dt->format('w');
    $w = ($w + $firstWeekday) % 7;

    // Add padding values to first row
    if ($w == 0) {
        $row = array();
    } else {
        $row = array_fill(0, $w, null);
    }

    for ($i = 1; $i <= $t; $i++) {
        $row[] = $i;
        if (($i + $w) % 7 == 0) {
            $data[] = $row;
            $row = array();
        }
    }

    // Add padding values to last row
    $k = (7 - (($t + $w) % 7)) % 7;

    if ($k > 0) {
        $row =  array_merge($row, array_fill(0, $k, null));
        $data[] = $row;
    }

    return $data;
}
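
The padding arithmetic is the part of the function that is easiest to get wrong, so here it is in isolation. This is only a sketch: the helper name getLeadingPadding is hypothetical and does not exist in the original code. It computes the number of empty cells before day 1, using the same shift logic as getMonthLayout.

```php
<?php

/**
 * Hypothetical helper: number of empty cells before day 1 of a month,
 * using the same shift arithmetic as getMonthLayout().
 */
function getLeadingPadding($year, $month, $firstWeekday = 6)
{
    $dt = new DateTime(sprintf('%04d-%02d-01', $year, $month));

    // format('w') yields 0 for Sunday through 6 for Saturday
    return ((int) $dt->format('w') + $firstWeekday) % 7;
}

// May 1, 2010 was a Saturday
echo getLeadingPadding(2010, 5), "\n";    // prints: 5 (Monday-first layout)
echo getLeadingPadding(2010, 5, 0), "\n"; // prints: 6 (Sunday-first layout)
```

The first result matches the five null values in the first row of the May 2010 example from the docblock.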

The following example renders all months from the year 2010.

<?php

$year  = 2010;
$shift = 6; // Set first column to Monday

$days = array('SU', 'MO', 'TU', 'WE', 'TH', 'FR', 'SA');

// Ring shift weekday names according to $shift parameter
for ($i = 0; $i < $shift; $i++) {
    $popped = array_pop($days);
    array_unshift($days, $popped);
}

$date = new DateTime();

?>

<?php for ($month = 1; $month <= 12; $month++):

    $data = getMonthLayout($year, $month, $shift);
    $date->setDate($year, $month, 1);
    echo '<p>' . $date->format('F Y') . '</p>';
?>

<table border="1">
<tr>
    <th>
        <?php echo implode('</th><th>', $days); ?>
    </th>
</tr>
<?php foreach ($data as $row): ?>
<tr>
    <?php foreach ($row as $day): ?>
        <?php if ($day == null): ?>
            <td>&nbsp;</td>
        <?php else: ?>
            <td><?php echo $day; ?></td>
        <?php endif; ?>
    <?php endforeach; ?>
</tr>
<?php endforeach; ?>
</table>

<?php endfor; ?>

May 13 2010 • by Marc Ermshaus • type=post language=en programming php snippet • 0 comments

PHP: Using XML as a "lightweight" markup language

I have never been a particular fan of lightweight markup languages that introduce their own highly optimized syntax, like Markdown or Textile. Although they should be sufficient in most cases, some of their parsing rules always seemed a bit unstable and ambiguous to me. I would use them for smaller tasks (user comments or simple content editing), but I never found them to be flexible enough for anything more "sophisticated." My major quarrel has always been extensibility. I wanted to have a markup language that is 100 % extensible in an elegant way that seamlessly integrates with the existing syntax.

For quite some time, my answer was a modified version of BBCode that could be transformed into a tag-based tree structure by a custom parser. Using this approach, I was able to define a basic set of tags which could be extended dynamically by any number of new tags designed for different purposes. For instance, apart from the default HTML markup tags like [h1] or [ul], I created a plugin that added a [youtube] tag to the set of available tags. This tag took a YouTube video id as an attribute and was transformed into the corresponding code for YouTube video embedding during the rendering routine.

As an aside: a different, more business-oriented example would be markup like [article id="12345" mode="preview"] that might add a database-driven info box with a nice product image (à la Amazon) to the output. But for the sake of simplicity, we will stick with examples that are easier to implement.

This system worked quite well, but it always bothered me that I had to add a lot of tags to the markup that would be transformed to HTML output just by replacing the framing BBCode square brackets by HTML's angle ones. That felt rather pointless. So, during the last major overhaul of my website, I gave this some thought and finally, after a lengthy conversation with a friend, it became obvious to me that all I ever wanted as a markup language was indeed a custom version of XHTML. All I had to do was to write HTML in its XML-compliant syntax and add custom XML tags to the markup that would be transformed to standard HTML through rules defined in the parser.

The obvious way to perform the actual transformation from a custom XML markup dialect to HTML is via an XSLT stylesheet. Thankfully, this can be implemented pretty easily, because PHP's DOM extension offers a comprehensive set of classes for working with XML trees, for applying XSL transformations, or for running XPath queries. During the remainder of this article, I will give you a simple example on how it might be done.

Something to work with

Let us start with some rather self-explanatory front-end code (index.php):

<!DOCTYPE html>

<html>

    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <title>XML Markup</title>
    </head>

    <body>

        <?php
        if (isset($_POST['content'])) {
            $output = render($_POST['content']);
            echo '<pre>' . htmlspecialchars($output) . '</pre>';
            echo $output;
        }
        ?>

        <form method="post" action="">
            <textarea name="content" cols="80" rows="20"><?php
            if (isset($_POST['content'])) {
                echo htmlspecialchars($_POST['content']);
            } else {
                echo htmlspecialchars("<h1>Hello World!</h1>\n<p>Content goes here</p>");
            }
            ?></textarea>
            <p><input type="submit" value="Go" /></p>
        </form>

    </body>

</html>

The code creates a page containing a textarea which holds the custom XML code that should be rendered by clicking the submit button. Once that happens, the submitted XML code string will be transformed to HTML via the render function (which we will add in a second) and displayed both in rendered form and in source code form. For convenience, the XML input is again written into the textarea.

Regarding the XSLT stylesheet, the simplest version does nothing but transform the input to itself, i.e., it does not apply any modifications (transform.xsl):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:php="http://php.net/xsl">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

    <!-- The identity template -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

The XSL rule featured in the example is called the "identity template." As I do not intend to go into the details of XSL, please refer to a different resource if you have trouble understanding this or one of the following stylesheets.
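
To see the identity template at work without any surrounding page, the following sketch applies it to a small document and prints the result. It assumes that PHP's DOM and XSL extensions are available; the stylesheet is inlined here only to keep the example self-contained, and the sample markup is made up.

```php
<?php

$xsl = <<<'EOT'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
EOT;

$input = '<root><h1 class="title">Hello</h1><p>Content</p></root>';

$xmldoc = new DOMDocument();
$xmldoc->loadXML($input);
$xsldoc = new DOMDocument();
$xsldoc->loadXML($xsl);

$proc = new XSLTProcessor();
$proc->importStyleSheet($xsldoc);
$result = $proc->transformToDoc($xmldoc);

// Serialize only the root element, without the XML declaration
$output = $result->saveXML($result->documentElement);

echo $output, "\n";
```

Elements, attributes, and text are copied unchanged, so the printed markup should be the same as $input.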

The next step is to add the render function at the top of index.php:

<?php

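// Everybody loves magic quotes (the feature was removed in PHP 5.4 and
// get_magic_quotes_gpc() itself in PHP 8.0, hence the function_exists() guard)
if (isset($_POST['content']) && function_exists('get_magic_quotes_gpc')
        && get_magic_quotes_gpc()) {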
    $_POST['content'] = stripslashes($_POST['content']);
}

function render($xmlCode)
{
    // XML documents need one distinct root tag
    $xmlCode = '<root>' . $xmlCode . '</root>';

    $xmldoc = new DOMDocument();
    $xmldoc->loadXML($xmlCode);
    $xsldoc = new DOMDocument();
    $xsldoc->load('./transform.xsl');

    $proc = new XSLTProcessor();
    $proc->importStyleSheet($xsldoc);

    $tmp = $proc->transformToDoc($xmldoc);

    // Strip <root> tag and return processed XML
    return substr($tmp->saveXML($tmp->documentElement), 6, -7);
}

?><!DOCTYPE html>
...

Fire up the example in a browser, type in some HTML code (or leave the default content), and click the submit button. If your PHP distribution is configured correctly, you should see your input as processed by the XSLT stylesheet. For further explanations on how this code works, please consult the corresponding part of the official PHP documentation.
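
The render function above assumes well-formed input and strips the wrapper element by counting characters. As a hedged sketch of a slightly more defensive variant, the two helpers below (loadFragment and innerXml, both hypothetical names) collect libxml's parse errors into an exception and serialize the children of the wrapper element one by one. They only cover the parsing and serializing steps, not the XSL transformation itself.

```php
<?php

/**
 * Hypothetical helper: wraps a markup fragment in a root element and parses it.
 * Throws an exception listing libxml's messages if the input is not well-formed.
 */
function loadFragment($xmlCode)
{
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok  = $doc->loadXML('<root>' . $xmlCode . '</root>');

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = 'Line ' . $error->line . ': ' . trim($error->message);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new InvalidArgumentException(implode("\n", $messages));
    }

    return $doc;
}

/**
 * Hypothetical helper: serializes all children of a node, which avoids
 * relying on the exact string form of the wrapper element.
 */
function innerXml(DOMNode $node)
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveXML($child);
    }

    return $out;
}

$doc = loadFragment('<h1>Hi</h1><p>a &amp; b</p>');
echo innerXml($doc->documentElement), "\n";

try {
    loadFragment('<p>broken');
} catch (InvalidArgumentException $e) {
    echo "Invalid markup:\n", $e->getMessage(), "\n";
}
```

In the form example, the caught message could be shown next to the textarea instead of letting PHP print raw parser warnings.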

Simple XSL transformations (<youtube> tag)

Custom tags may now be added to the markup by simply appending corresponding transformation rules to the XSLT stylesheet.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:php="http://php.net/xsl">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

    <!-- YouTube tag -->
    <xsl:template match="youtube">
        <object type="application/x-shockwave-flash"
                width="425"
                height="350"
                data="http://www.youtube.com/v/{@id}"
        >
            <param name="movie"
                   value="http://www.youtube.com/v/{@id}&amp;hl=en&amp;fs=0"
            />
        </object>
    </xsl:template>

    <!-- The identity template -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

This rule introduces a <youtube id="xyz" /> tag that gets transformed to the correct HTML code for YouTube video embedding by the XSL processor.

Here is a snippet to try it out:

<h1>YouTube tag test</h1>

<p>
    <youtube id="4XpnKHJAok8" />
</p>

It should not be hard to see how powerful XSL transformations are even without additional back-end processing. But it gets even more interesting if XSL rules are connected with server-side PHP callbacks.

XSL transformations using PHP callbacks (<php> tag)

To illustrate the idea of PHP callbacks in XSL, we are going to create a <php> tag that is used to display PHP source code with proper syntax highlighting.

The additional "PHP tag" XSL rule is rather short. Here is the complete transformation stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:php="http://php.net/xsl">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

    <!-- YouTube tag -->
    <xsl:template match="youtube">
        <object type="application/x-shockwave-flash"
                width="425"
                height="350"
                data="http://www.youtube.com/v/{@id}"
        >
            <param name="movie"
                   value="http://www.youtube.com/v/{@id}&amp;hl=en&amp;fs=0"
            />
        </object>
    </xsl:template>

    <!-- PHP tag -->
    <xsl:template match="php">
        <pre>
        <xsl:copy-of select="php:function('hl', string(.))" />
        </pre>
    </xsl:template>

    <!-- The identity template -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

To allow the execution of PHP functions in the stylesheet, registerPHPFunctions needs to be called after the initialization of the XSLTProcessor instance. Additionally, the callback function, called hl (for "highlight"), needs to be defined.

<?php

// Everybody loves magic quotes (the function no longer exists as of PHP 8)
if (isset($_POST['content'])
    && function_exists('get_magic_quotes_gpc')
    && get_magic_quotes_gpc()
) {
    $_POST['content'] = stripslashes($_POST['content']);
}

function hl($s)
{
    $tmp = new DOMDocument();
    $code = highlight_string($s, true);
    // Ignore non-defined entity issues for now
    $code = str_replace('&nbsp;', ' ', $code);
    $tmp->loadXML($code);
    return $tmp;
}

function render($xmlCode)
{
    // XML documents need one distinct root tag
    $xmlCode = '<root>' . $xmlCode . '</root>';

    $xmldoc = new DOMDocument();
    $xmldoc->loadXML($xmlCode);
    $xsldoc = new DOMDocument();
    $xsldoc->load('./transform.xsl');

    $proc = new XSLTProcessor();
    $proc->registerPHPFunctions(); // NEW
    $proc->importStyleSheet($xsldoc);

    $tmp = $proc->transformToDoc($xmldoc);

    // Strip <root> tag and return processed XML
    return substr($tmp->saveXML($tmp->documentElement), 6, -7);
}

?><!DOCTYPE html>
...

The hl function uses PHP's built-in highlight_string function to do the actual highlighting. The return value has to be a DOM node instead of a simple string because it should be added as proper HTML code to the transformed output. Otherwise, tag delimiters would get escaped and the final output would contain the HTML source code used to do the highlighting instead of the rendered highlighting.
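The escaping effect can be reproduced outside of XSLT with plain DOM calls. The following is only a minimal sketch under that assumption: it does not involve the XSL processor at all, and the sample markup is made up for illustration.

```php
<?php

// Assumption: plain DOM calls stand in for the XSL processor here.
$doc = new DOMDocument();

// 1) Markup added as a string becomes a text node, so its tags get escaped
$asText = $doc->createElement('pre');
$asText->appendChild($doc->createTextNode('<span>echo</span>'));
$escaped = $doc->saveXML($asText);

// 2) Markup added as real nodes keeps its tags
$fragment = $doc->createDocumentFragment();
$fragment->appendXML('<span>echo</span>');
$asNodes = $doc->createElement('pre');
$asNodes->appendChild($fragment);
$kept = $doc->saveXML($asNodes);

echo $escaped, "\n"; // <pre>&lt;span&gt;echo&lt;/span&gt;</pre>
echo $kept, "\n";    // <pre><span>echo</span></pre>
```

The first variant corresponds to what a browser would show as visible HTML source, the second one to rendered highlighting.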

An example snippet:

<php><![CDATA[
<?php
function helloWorld()
{
    // Say hello
    echo 'Hello World!';
}
]]></php>

The example is wrapped in a <![CDATA[ ... ]]> container so that < and > can be used in their non-entity form in the source code. As the input has to be valid XML, they would otherwise need to be written as &lt; and &gt;.
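A small sketch of what the CDATA container buys us: both notations below should yield the same text content once parsed, but only the first one can be typed without escaping. The helper name readCode and the sample code line are assumptions made for this illustration.

```php
<?php

// Same code line, once inside CDATA and once with entities
$withCdata    = '<php><![CDATA[if ($a < $b && $b > 0) { echo "ok"; }]]></php>';
$withEntities = '<php>if ($a &lt; $b &amp;&amp; $b &gt; 0) { echo "ok"; }</php>';

// Hypothetical helper: parses the snippet and returns the text content
// of its root element
function readCode($xml)
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    return $doc->documentElement->textContent;
}

echo readCode($withCdata), "\n";
var_dump(readCode($withCdata) === readCode($withEntities)); // bool(true)
```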

Conclusion

I am quite satisfied with this approach to a custom markup language, although I admit that writing valid XML code can be a bit of a hassle. Nevertheless, XML is a very well-defined and widespread format that can be processed by a lot of existing tools. The syntax is 100 % non-ambiguous, transformable, seamlessly extensible, and rather easy to learn if your users have basic knowledge of HTML or a comparable markup dialect like BBCode. I also assume that some of the JavaScript-based HTML editors can be extended by custom XML tags so that UI-based editing should be a possibility.

Apr 19 2010 • by Marc Ermshaus • type=post language=en programming php xml dom • 0 comments

PHP: Adding grouping and extended sorting to SPL's ArrayObject

Judging by how often questions about it come up, one thing many programmers seem to have difficulties with is a technique called "control break". Basically, this is a way of displaying data grouped into visually separated hierarchical sections. Examples might be a list of employees grouped by the starting letter of their family names or a list of blog posts grouped first by year and second by month. Every time the index key of one of the sections changes, a visual mark, like a new heading, should be printed. Those changes are defined as control breaks.
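For reference, a plain control break over a flat list might look like the following sketch. The sample names are made up, and the list is assumed to be sorted by the grouping key already.

```php
<?php

// Hypothetical sample data, already sorted by the grouping key
$names = array('Adams', 'Allen', 'Baker', 'Brown', 'Clark');

$output = '';
$lastLetter = null;

foreach ($names as $name) {
    $letter = substr($name, 0, 1);

    // Control break: the grouping key differs from the previous entry's key
    if ($letter !== $lastLetter) {
        $output .= '== ' . $letter . " ==\n";
        $lastLetter = $letter;
    }

    $output .= $name . "\n";
}

echo $output;
```

This prints a heading line for A, B and C, each followed by the names that belong to it.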

Although it's not really a big deal to write a simple algorithm to achieve the desired effect, I thought it might be desirable to have a generic solution to the problem. So I created a helper class containing static methods that would transform an array of entries (homogeneous arrays of key/value pairs) into a grouped array using the unique values of one of the entries' fields as grouping key. That worked quite well but I am not a big fan of calling static class methods if feasible alternatives exist. A more object-centric approach in which the grouping code could be run as an instance method seemed to me to be the superior solution. Following this thought, I wrote an extended version of ArrayObject from the SPL which I'd like to introduce in this post.

Basic grouping (groupBy)

The class I came up with is named Kaloa_Spl_ArrayObject (you can view or download it here). It's designed to be an unobtrusive addition to ArrayObject. In the current version, the constructor from the parent class is overridden but everything else is left intact. In order to show how it works, I'll define some data that might roughly resemble a list of articles from a blogging application.

$items = array(
    array('year' => 2009, 'month' =>  9, 'title' => 'Hello World!'),
    array('year' => 2009, 'month' =>  9, 'title' => 'At the museum'),
    array('year' => 2009, 'month' =>  9, 'title' => 'Godspeed'),
    array('year' => 2009, 'month' =>  9, 'title' => '2010 Olympics'),
    array('year' => 2010, 'month' =>  1, 'title' => 'Tornado season'),
    array('year' => 2010, 'month' =>  1, 'title' => 'Bailout'),
    array('year' => 2010, 'month' =>  2, 'title' => 'Cheers, Ladies!'),
    array('year' => 2010, 'month' =>  2, 'title' => 'Neglected'),
    array('year' => 2009, 'month' => 11, 'title' => 'Ethics probe'),
    array('year' => 2010, 'month' =>  3, 'title' => 'Commitment to security'),
    array('year' => 2010, 'month' =>  3, 'title' => 'Election'),
    array('year' => 2009, 'month' => 10, 'title' => 'Same-sex couples'),
    array('year' => 2009, 'month' => 10, 'title' => 'Junkyard'),
);

The most interesting new method of Kaloa_Spl_ArrayObject, the grouping method, is groupBy. It takes a callback function as argument which is called once for every entry in the original array. The method might be used to group the example data by year and month.

$obj = new Kaloa_Spl_ArrayObject($items);

$obj->groupBy(
    create_function(
        '$item',
        'return array($item["year"], $item["month"]);'
    )
);

The return value of the callback function is the key of the group to which the corresponding entry will be assigned. If an array is returned, it will be treated as a multi-dimensional key which translates to a multi-level grouping.
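To make the multi-level key idea more tangible, here is a rough sketch that works on plain arrays. It is not the code of Kaloa_Spl_ArrayObject; the function name groupItems and the shortened sample data are assumptions for this illustration.

```php
<?php

/**
 * Hypothetical helper: groups $items into nested plain arrays using the
 * key (scalar or array of key parts) returned by $callback.
 */
function groupItems(array $items, $callback)
{
    $result = array();

    foreach ($items as $item) {
        // A scalar key is treated like a one-element array key
        $keys = (array) call_user_func($callback, $item);

        // Walk down (and create) one nesting level per key part
        $target = &$result;
        foreach ($keys as $key) {
            if (!isset($target[$key])) {
                $target[$key] = array();
            }
            $target = &$target[$key];
        }

        $target[] = $item;
        unset($target); // Break the reference before the next entry
    }

    return $result;
}

$posts = array(
    array('year' => 2009, 'month' => 9, 'title' => 'Hello World!'),
    array('year' => 2010, 'month' => 1, 'title' => 'Bailout'),
    array('year' => 2009, 'month' => 9, 'title' => 'Godspeed'),
);

$grouped = groupItems($posts, function ($item) {
    return array($item['year'], $item['month']);
});

echo count($grouped[2009][9]), "\n"; // 2
```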

Displaying the content of $obj using var_dump or print_r will result in an array structured like this:

Kaloa_Spl_ArrayObject Object(
    [2009] => Kaloa_Spl_ArrayObject Object(
        [9]  => Kaloa_Spl_ArrayObject Object(...),
        [11] => Kaloa_Spl_ArrayObject Object(...),
        [10] => Kaloa_Spl_ArrayObject Object(...)
    ),
    [2010] => Kaloa_Spl_ArrayObject Object(
        [1] => Kaloa_Spl_ArrayObject Object(...),
        [2] => Kaloa_Spl_ArrayObject Object(...),
        [3] => Kaloa_Spl_ArrayObject Object(...)
    )
)

The third dimension contains a numbered array with all of the original entries that are part of the corresponding group. For instance, the content of $obj[2009][9] would be an array with the four entries from September 2009:

0 => Kaloa_Spl_ArrayObject(
    'year' => 2009,
    'month' => 9,
    'title' => 'Hello World!'
),
1 => Kaloa_Spl_ArrayObject(
    'year' => 2009,
    'month' => 9,
    'title' => 'At the museum'
),
2 => Kaloa_Spl_ArrayObject(
    'year' => 2009,
    'month' => 9,
    'title' => 'Godspeed'
),
3 => Kaloa_Spl_ArrayObject(
    'year' => 2009,
    'month' => 9,
    'title' => '2010 Olympics'
)

As Kaloa_Spl_ArrayObject subclasses ArrayObject, it's already possible to print the data in the desired fashion using nested foreach-loops.

foreach ($obj as $year => $yearContent) {
    echo '<h1>' . $year . "</h1>\n";
    foreach ($yearContent as $month => $monthContent) {
        echo '  <h2>' . $month . "</h2>\n";
        echo "    <ul>\n";
        foreach ($monthContent as $entry => $entryContent) {
            echo '      <li>' . $entryContent['title'] . "</li>\n";
        }
        echo "    </ul>\n";
    }
}

The resulting HTML code:

<h1>2009</h1>
  <h2>9</h2>
    <ul>
      <li>Hello World!</li>
      <li>At the museum</li>
      <li>Godspeed</li>
      <li>2010 Olympics</li>
    </ul>
  <h2>11</h2>
    <ul>
      <li>Ethics probe</li>
    </ul>
  <h2>10</h2>
    <ul>
      <li>Same-sex couples</li>
      <li>Junkyard</li>
    </ul>
<h1>2010</h1>
  <h2>1</h2>
    <ul>
      <li>Tornado season</li>
      <li>Bailout</li>
    </ul>
  <h2>2</h2>
    <ul>
      <li>Cheers, Ladies!</li>
      <li>Neglected</li>
    </ul>
  <h2>3</h2>
    <ul>
      <li>Commitment to security</li>
      <li>Election</li>
    </ul>

Basically, that's all there is to it.

Advanced grouping

In some cases, it might be useful to modify entries before they are added to the resulting data structure. This can be achieved by simply editing or removing fields from the argument passed to the callback function. All arguments, including scalar values, are passed by reference.

This grouping function will remove the fields "year" and "month" from all entries of the resulting array and will change the content of the "title" field to all uppercase letters.

$obj->groupBy(
    create_function(
        '$item',
        '$ret = array($item["year"], $item["month"]);
         unset($item["year"]);
         unset($item["month"]);
         $item["title"] = strtoupper($item["title"]);
         return $ret;'
    )
);
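How such a modifying callback can work is sketched below with a plain function instead of the class. The helper name groupByRef is hypothetical, the grouping has only one level, and the closure declares its parameter as a reference explicitly.

```php
<?php

/**
 * Hypothetical one-level grouping helper. Every entry is handed to the
 * callback, which may modify it before it is stored.
 */
function groupByRef(array $items, $callback)
{
    $result = array();

    foreach ($items as $item) {
        // $item is a copy of the original entry; the callback may change it
        $key = $callback($item);
        $result[$key][] = $item;
    }

    return $result;
}

$posts = array(
    array('year' => 2009, 'title' => 'Godspeed'),
    array('year' => 2010, 'title' => 'Bailout'),
);

$grouped = groupByRef($posts, function (&$item) {
    $year = $item['year'];
    unset($item['year']);
    $item['title'] = strtoupper($item['title']);

    return $year;
});

print_r($grouped);
```

The input array $posts stays untouched because the loop works on copies of the entries.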

An example using scalar values that will be grouped by the first letter and changed to uppercase:

$items = array('Carl', 'Susan', 'Cindy', 'Peter', 'Steve', 'Patricia', 'Sam');

$obj = new Kaloa_Spl_ArrayObject($items);

$obj->groupBy(
    create_function(
        '$item',
        '$item = strtoupper($item);
         return substr($item, 0, 1);'
    )
);

var_dump($obj);

Output:

object(Kaloa_Spl_ArrayObject)#1 (3) {
  ["C"]=>
  object(Kaloa_Spl_ArrayObject)#4 (2) {
    [0]=>
    string(4) "CARL"
    [1]=>
    string(5) "CINDY"
  }
  ["S"]=>
  object(Kaloa_Spl_ArrayObject)#5 (3) {
    [0]=>
    string(5) "SUSAN"
    [1]=>
    string(5) "STEVE"
    [2]=>
    string(3) "SAM"
  }
  ["P"]=>
  object(Kaloa_Spl_ArrayObject)#6 (2) {
    [0]=>
    string(5) "PETER"
    [1]=>
    string(8) "PATRICIA"
  }
}

Sorting (usort, usortm, uasortm, uksortm)

By now, it might have become apparent that the groupBy method doesn't sort the resulting array in any way. Therefore, I made a second major addition to ArrayObject by adding more sophisticated sorting functionality that is able to realign one or more dimensions of the array. All three multi-dimensional sorting methods are based on the different flavours of PHP's built-in usort function. They each take sorting criteria specified by an anonymous function or an array of anonymous functions as arguments.

Here is an example to illustrate the usage. It works with the data defined in the "Basic grouping" section.

$obj->groupBy(
    create_function(    // Group by year and month
        '$item',
        'return array($item["year"], $item["month"]);'
    )
)->uksortm(
    array(
        create_function(    // Order first dimension descending
            '$a, $b',
            'return $b - $a;'
        ),
        create_function(    // Order second dimension ascending
            '$a, $b',
            'return $a - $b;'
        )
    )
)->usortm(
    array(
        null,    // Skip first and second dimensions, only realign third
        null,    //  (descending by length of an entry's title)
        create_function(
            '$a, $b',
            'return strlen($b["title"]) - strlen($a["title"]);'
        )
    )
);

This notation uses method chaining to hint at the fact that I implemented a fluent interface for all new methods (with the exception of the usort method, which I threw in because it was the only one missing). Of course, the chain might just as well be split into three separate statements, each starting with $obj->.

Besides the groupBy call, there are calls to both uksortm and usortm because the first two dimensions (years and months) have to be sorted by key whereas the third one (the entries) should be sorted by value. (By the way: usortm might be exchanged for uasortm here as well-formed keys are not an issue when iterating the array using foreach.) The differences between all of the usort-like functions are explained in the PHP documentation.

Each of the u*sortm ("m" standing for "multi-dimensional") methods recursively applies the passed functions to the corresponding dimension of the array. From an array of three functions, the first one would be used to sort the years (first dimension), the second one to sort the months (second dimension) and the third one to sort the entries (third dimension). If no function is needed for a specific dimension, null can be passed and the dimension is skipped.
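The recursion over dimensions can be sketched for plain nested arrays as follows. This is not the class's actual implementation; the function name uksortMulti and the sample data are assumptions, and the comparison functions return integers as PHP's sorting functions expect.

```php
<?php

/**
 * Hypothetical helper: sorts nested plain arrays by key, using one
 * comparison function per dimension. A null entry skips that dimension.
 */
function uksortMulti(array &$data, array $funcs)
{
    $func = array_shift($funcs);

    if ($func !== null) {
        uksort($data, $func);
    }

    // Apply the remaining functions to the next dimension
    if (count($funcs) > 0) {
        foreach ($data as &$child) {
            if (is_array($child)) {
                uksortMulti($child, $funcs);
            }
        }
        unset($child);
    }
}

$data = array(
    2009 => array(9 => 'a', 11 => 'b', 10 => 'c'),
    2010 => array(3 => 'd', 1 => 'e'),
);

uksortMulti($data, array(
    function ($a, $b) { return $b - $a; }, // Years descending
    function ($a, $b) { return $a - $b; }, // Months ascending
));

print_r($data);
```

After the call, 2010 comes before 2009, and the months within each year are in ascending order.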

Further documentation about the class may be found in the inline DocBlock comments of the source file. If you try it out and have questions or any remarks or bug reports, please contact me.

Mar 10 2010 • by Marc Ermshaus • type=post language=en programming php spl arrayobject • 0 comments

Songs I've enjoyed recently

  • Alanis Morissette – Hands Clean. "We'll forward to a few years later." What is wrong with me?
  • Razorlight – Wire to Wire. "What is love but the strangest of feelings? / A sin you swallow for the rest of your life? / You've been looking for someone to believe in / To love you, until your eyes run dry."
  • Coldplay – Fix You. Sane people like those at Pitchfork say: "'Lights will guide you home / And ignite your bones / And I will try to fix you,' sang Martin on X&Y's "Fix You", a gag-inducing bit of motivational flotsam that came off like self-parody." Coldplay might be the musical equivalent to a chick flick, but you know what? Don't worry about it.
  • Morrissey – First of the Gang to Die. "You have never been in love, / Until you've seen the stars, / reflect in the reservoirs. // And you have never been in love, / Until you've seen the dawn rise, / behind the home for the blind."
  • Milow – Ayo Technology. This is a cover of a song by 50 Cent and Justin Timberlake, so expect a certain amount of metaphorical explicitness. The "I'm tired of using technology" line is great.
  • Bob Dylan – Girl from the North Country. "I'm a-wonderin' if she remembers me at all. / Many times I've often prayed / In the darkness of my night, / In the brightness of my day."
  • David Bowie – Heroes. "Though nothing, nothing will keep us together / We can beat them, forever and ever / Oh, we can be heroes just for one day."
  • Virginie Ledoyen – Mon Amour, Mon Ami. "Toi mon amour, mon ami / Quand je rêve c'est de toi / Mon amour, mon ami / Quand je chante c'est pour toi".

It would have been more comfortable to create a playlist at YouTube, but I just don't believe in registering accounts and managing passwords any more (there's OpenID, you know).

Mar 21 2009 • by Marc Ermshaus • type=post music language=en • 0 comments

Untitled #1

m-          }püß´44444444444444444444444
4444444444444444444444444444444444444444
4444444444444444444444444444444444444444
4444444444444444444444444444444444444444
4444444444444444444444444444444444444444
4444444444444444444444444444444444444444
4444444444444444444444444444444444444444
4444444444444444444444444444444444444444
4444444444444444444444444444444444444444
4444njjjjjjjjjjjjjjjjjjjjjj8888888888888
888888888889hhhhh<7<<<<<<<<<<<<<<<<<<8  
 6hcm897

"Untitled #1" by Karlchen, 40x12 characters, cat on keyboard, 2009

Jan 28 2009 • by Marc Ermshaus • type=post language=en art • 0 comments

Backups

Back up your data. Do not create digital content of any value if you are not going to do a backup within a month or a week or a day. Hardware failure is just a one-meter drop away.

It won't happen to you because you are not stupid enough to knock your USB drive off your table? And you are going to buy this new computer in a few days anyway, so that you will be able to do a backup then? – Yeah, you are like me from an hour ago. Now go back up your data.

Dec 26 2008 • by Marc Ermshaus • type=post language=en • 0 comments

Re: Million's of Linux Users, they can't all be experts.

I wrote a comment on David Thomas' blog a while ago. (If you happen to be my father, please do not be offended.)

I think that it's very difficult for "normal" (not technically minded) users to switch to Linux. Even if they were able to evaluate whether they could use Linux in a productive way, migrating personal data and installing an operating system are likely to be tasks that are out of reach for many users and that they are afraid to do. This fear is not necessarily a bad thing. It might shield them from a lot of frustration caused by missing hardware support (mobile phones, printers[1]) or in the worst case data loss. Many people I know wouldn't dream of reinstalling an OS.

Take my father as an example. He spends a lot of time at the computer and he knows his way around. He would reinstall his OS. But he still sends me links to Windows software though I told him a dozen times that I'm running Linux and that my OS is (generally) unable to execute .exe files. A while ago, I also had a difficult time trying to convince him to stop using the AOL software though it is unnecessary and doesn't allow exporting e-mail. Software lock-in is a concept he doesn't want to think about. The same goes for most other abstract issues that made us happy Linux users. You don't need to care if you just want to create a birthday invitation in MS Word or something like that.

Of course, you can do that in OO Writer under Ubuntu, too. This is true for most tasks. So I guess we could install a Linux distribution on my father's PC without leaving him out in the cold. But I honestly doubt that it will be beneficial to him. Besides, I would feel obliged to help him find software and to explain everything that he doesn't understand (in case I do). As long as he’s using Windows, I don't have that responsibility. That's why I partially think it's the best arrangement for both of us.

Sorry, I don't have a real conclusion for this comment. This is just something that bothered me for the last few days.

[1] FUD disclaimer: This is not Linux' fault. I know that. But it’s an issue nevertheless. Ask my Nokia.

A possible conclusion might be: Do not install an operating system onto other people's computers just because you think it would be a good idea.

Nov 23 2008 • by Marc Ermshaus • type=post language=en comments linux davidthomas • 0 comments

Re: I don't like tags

Some weeks ago, I wrote a comment on the blog post "I don't like tags" by Stephan Waba.

I don't like tags either.

1. Tags tend to be ambiguous. ("paris" – http://en.wikipedia.org/wiki/Paris_(disambiguation))

2. Tags depend on a specific implementation, taxonomy and/or naming scheme (which not all editors might be aware of). That's especially true if you try to avoid point 1. ("france/paris", "paris, france", "paris" + "france", "location:fr/paris", ...)

3. Tags are very unintuitive to use if you have to guess them out of thin air. ("jfk", "john f. kennedy", "kennedy, john f.", "kennedy")

4. Tags are hard to use in a consistent manner. (Did I add a "sports" tag to all occurrences of the "baseball" tag?)

5. Tags are lost in translation. ("paris", "pariz", "parys", "parigi", ...)

6. I have the feeling that I don't understand tags at all.

I like to think of tags as a way to add an object to multiple (sub-)categories of a huge hierarchical meta data taxonomy. For instance, I (try to) sort all of my photos into three different taxonomies which are like three different "views" onto the data: location, set and (pictured) person. That makes it pretty easy to find all images of Guillaume (person) that were taken in Paris, France (location) during a "weekend trip" (set) in April 2006 (image meta data). The mass of all photos becomes some kind of 4-dimensional cluster (date, location, set, person) in which I can find specific objects by filtering one or more dimensions using a condition (location=Paris, France). Every object for which all applied conditions are fulfilled (the intersection) is part of the subset I wanted to expose.

But I doubt that this is the "correct" way to think about tags.

Besides that, I have no idea what to use instead of them.

Nov 9 2008 • by Marc Ermshaus • type=post language=en tagging stephanwaba comments • 0 comments

Password vulnerabilities

While skimming through my old bookmarks, I found an article by Miguel de Icaza in which he talks about an incident of password theft at reddit and password vulnerabilities in general (both links taken from the article).

[M]any of my friends use combinations of 'the same password everywhere' (specially the non-technical), 'the password with the site name' (slightly more technical), 'three tiers of passwords: weak, normal and high-security'.

I belong to the group of users who try to remember a small number of different passwords for different levels of security. I don't like this approach, but everything more secure is a usability disaster if you have to access some of your accounts from different computers (which, on the other hand, is always a security disaster).

Evolving technologies like OpenID might be a solution. Using OpenID, account password data is stored on a central server and doesn't get exposed to every site on which the account is used. But there are risks, too: If, for some reason, the OpenID server doesn't respond, you will be unable to log into sites that depend on an OpenID account. And in case your OpenID account's password gets stolen and changed, the thief will be able to log into all sites linked with this account. I guess that's what security questions were invented for.

I think the most important point about password vulnerabilities is to be aware of them. Besides that, common sense is always a good thing: Do not log into your online banking account from a computer you do not control; always pick unique passwords for important accounts and keep them safe (= in your head or at least offline).

Trying to estimate the probability that someone is going to try to hijack one of your accounts would result in an expression like the Drake equation. – You just can't tell.

Oh, and do change your Google password every few weeks. Start now.

Nov 7 2008 • by Marc Ermshaus • type=post language=en migueldeicaza security reddit openid • 0 comments

When I Heard the Learn'd Astronomer

When I heard the learned astronomer,
When the proofs, the figures, were ranged in columns before me,
When I was shown the charts and diagrams, to add, divide, and measure them,
When I sitting heard the astronomer where he lectured with much applause in the lecture-room,
How soon unaccountable I became tired and sick,
Till rising and gliding out I wander'd off by myself,
In the mystical moist night-air, and from time to time,
Look'd up in perfect silence at the stars.

Walt Whitman

Oct 23 2008 • by Marc Ermshaus • type=post language=en waltwhitman poems • 0 comments

Klangbilder (1)

This might be the start of a loose series in which I point to freely downloadable music that I picked up somewhere and that, for more or less comprehensible reasons, seems worth listening to.

I don't know yet whether I will stick with this format. At first glance, it seems more practical to post the entries individually right away instead of waiting until enough material for a separate post has accumulated. This time, however, I was able to draw on a sufficiently large archive.

By the way, a thematic connection between the entries or a separation by genre is not part of the concept for now. I wouldn't know much about that anyway. Also, wherever possible, I will link to download pages, not directly to the files. That's the proper way to do it.

Shaun Inman: The Lovely Life EP and Lot EP. Shaun Inman's music is calm yet energetic, carried by steady vocals, guitar playing, and sparingly used electronic effects. It pleasantly pushes itself into the background.

Minilogue: Liveset Berlin. 83 minutes of elegant minimal techno from Sweden. Good electronic music plays skillfully with elements of delay and the expectations they create in the listener. The audience's reactions account for a large part of this recording's appeal.

Internet Archive: Live Music Archive. A collection of high-quality live recordings by various artists. Most of the performers are unknown (to me), but I can warmly recommend the "Tucson, Arizona" faction with Giant Sand and, of course, Calexico.

Jack Conte: VideoSongs. The point is that this is not simply a mix of two songs; Jack Conte has played or sung every single note himself at some point. More details can be found, for instance, in the info section of one of the accompanying videos.

Juan Diego Flórez: La Fille du Régiment: Ah! Mes Amis. The tenor Juan Diego Flórez flawlessly sings nine high Cs in a row, receives a standing ovation, sets a complicated signalling system in motion, and sings the first solo encore at the Metropolitan Opera in New York since 1994. A nice story and a recording that goes well with brushing your teeth.

Parts & Labor. Parts & Labor spread a hectic mood with raw electro rock. The drummer presumably has to be replaced every two songs. Unfortunately not offered as a download: "The Gold We're Digging".

Jul 24 2008 • by Marc Ermshaus • type=post language=de music minilogue shauninman calexico giantsand jackconte juandiegoflorez partsandlabor • 0 comments