Summary:
I ranked 2nd place at the regional competition Young Scientists with the
programming of a chess computer that used advanced data structures like
bitboards. This project made me meet my later boss at GIP. At GIP I had my first
experience with data mining. I developed a tool to visualize click streams of
web pages. Plotting the resulting trees with Direct X on a hyperbolic geometry
was my second project for Young Scientists which made me receive the 1st prize in the regional contest and the 2nd prize in the state contest. At this time I
also participated in the National Competition Computer Science where I ranked
among the top 50 nationwide. In the several programming challenges I was able to
get a full score for my submission of the m-Travelling Salesmen Problem which
was my hardest project before I entered university to study mathematics and
physics.
In university, the computer science professor Dr. Schömer quickly discovered my
potential. This lead to me teaching the class "Algorithms II" to my classmates
in my second year at university, instead of taking the course with them. It was
also Prof. Dr. Schömer who gave me the possibility to participate in the NWERC
of ACM's famous ICPC, being the youngest member of our university's team. In
2005, my team scored 22 out of 46 and in 2006 we scored 29 out of 42. Having
progressed in my maths studies, I nevertheless searched for further programming
challenges and therefore joined the research group of Prof. Dr. Schömer. We
worked on packing algorithms to pack circles of different radii. After I had
left the group in order to concentrate on metalcon and my diploma thesis, the
group published a paper stating that they have beaten all known world records
for packing problems containing 1 to 50 circles. This resulted in a lot of media
coverage e.g. in Times Magazine.
As stated above, I decided to deal with my own project metalcon in my diploma
thesis. Along with two friends, I developed metalcon in a time span of 1 ½
years. By now it is the second biggest metal community in Germany. In my diploma
thesis I used data mining techniques and the interests of our users of
metalcon.de in order to calculate which metal bands are similar to each other.
With these results I was able to build a recommendation engine like last.fm
Chess Computer (2002):
At the age of 17 I began to program an artificial intelligence device that is
able to play chess. My program followed the standard techniques of alpha beta
pruning. The smart parts of the project were the data structures that I used to
store the chessboard, in order to be able to calculate possible variations of
the chessboard which you need for the recursive alpha beta pruning.
I introduced 64-bit variables called bitboards, which mapped each bit to a field
of the chessboard. With the 8 bitboards called WHITE, BLACK, KINGS, QUEENS,
ROOKS, BISHOPS, KNIGHTS and PAWNS and boolean operations I could retrieve the
information which figures are placed on the chessboard.
The interesting thing is that I used lookup tables of bitboards to store
information about what kind of moves are possible for a figure from a certain
position. These lookup tables were 2,01 MB in size but allowed me to calculate
about 150'000 chessboards per second in the alpha beta pruning, making my chess
computer almost as fast as commercial products.
m-Travelling Salesmen Problem (2003):
At the age of 18 I participated in the National Competition Computer Science.
One of the exercises was to solve the Travelling Salesman Problem for m
different salesmen being m independent, not crossing journeys. The participants
of the contest were asked to introduce a good value to optimize such a problem
and to provide a solution.
I decided that the best solution for the m-TSP should include journeys which
have about the same length and which are by themselves as short as possible. I
used the Kruskal algorithm to cluster the cities in m clusters. Afterwards I
solved the TSP of each cluster by backtracking if there were less than 10
cities, and otherwise by use of the Christophides heuristics.
Here you can find my solution for 80 cities and 7 journeys:

Clickstream Visualization (2003):
Click stream analysis is used as a data mining technique to get information
about how users surf on a website. In the best case you can find usability bugs
and enhance the user experience of your site by doing a click stream analysis.
When I was 18 years old I wrote an application that was able to visualize a
click stream analysis. The main task was to retrieve the user data from the
database (about 500 GB) and to store them in a tree data structure (assuming
every user would start exploring the site from the homepage). Afterwards my task
was to visualize the interactive tree using MS - DirectX and marking edges
bolder if more users went on that path. Interactive meens, that you could zoom
into any vertex to have a closer look by not destroying the global topolgy. As
you can see from the pictures, I introduced some hyperbolic geometry to do the
visualization in order to use the space of a computer screen in an optimal way.
Teaching (2005):
After Prof. Dr. Schömer discovered my skills in university, I got to teach the
class "Algorithms II" in my second year of university, instead of taking it
myself (while being 20 years old at that time). Apart from teaching, my tasks
included the correction of homework for all students.
Topics included data structures such as stacks, (priority) queues, heaps, trees
(binary, AVL), (doubly) linked lists, hash tables, and algorithms such as quick
sort, merge sort, and graph algorithms (Kruskal, Prim, Bellman Ford, Floyed
Warshall, Dijkstra).
One year later I also taught a class called "Introduction to Programming" where
I provided the students with an introduction to Java.
International Collegiate Programming Contest (2005 & 2006):
I participated twice at NWERC, which is one of the European pre-selection
contests for the international collegiate programming contest run by ACM. In
2005 we scored 22 out of 46 and in 2006 we scored 29 out of 42. In my team, it
was my job to solve the problems on a paper by finding out which algorithms and
data structures could best be used, while someone else was responsible for the
speed coding of an implementation of my solution.
Pack IT (2006):
Pack IT is one of the scientific projects of Prof. Dr. Elmar Schömer. He is
interested in doing research in packing algorithms. His group participated in
the programming contest of AI Zimmermann. The task was to pack n circles with
the Radii 1,2,...,n into one bigger circle, so that the outer circle has minimal
Radius. While most teams tried to solve this problem by using some Monte Carlo
method and spending a lot of calculation time on big clusters, we tried to solve
it algorithmically by contact simulation which could run pretty fast on a normal
notebook computer. Here you can see our solution for n=50:

After I left the Group the results from the contact simulation have been
integrated with a Monte Carlo method and the Group has beaten all known world
records for this problem.
Metalcon (scince 2007):
Together with the speed coder from my ACM team and another friend we programmed
the social network metalcon.de. It is a metal community for metal fans and metal
bands. Interesting tasks that we completed on our way included:
- Implementing our own AJAX Framework
- >
Implementing our own Object-Relational Mapper to map MySQL Tables ( 120) to
PHP-Classes
- Integrating Memcached to our OR-Mapper
- Integrating Smarty to strictly divide programming logic from HTML design
-
Integration of the open source openGeoDB to offer regional information to the
user
-
Introducing data mining (see diploma thesis) to make music recommendations to
our user and to improve the user experience
Diploma Thesis (2009):
For my diploma thesis (grade: 1.3) with the title "Using Data Mining to Improve
User Experience in Web2.0 Networks" I used some version of a k-nearest neighbor
algorithm to calculate which heavy metal bands are similar to each other. After
all, the algorithm is straight-forward, and the tricky part is to choose a good
metric on the vector space generated by the base of all known metal bands.
I introduced a heuristic that could enhance any given metric. The heuristic
would use global aspects of the metric to construct a new one. In order to test
whether the heuristic worked well or not I went over to text mining as a test
case. I used 2000 texts by 30 different editors. All of the text belonged to the
same topic. I assumed that a good metric would bring out a closest neighbor for
any text that is a text written by the same editor. Using an Euclidean metric
(which is known not to be a good metric for these kind of applications) I got
the result that 40-50 % of the closest neighbors had the same author. Using the
standard Tanimoto coefficient or Jaccard Index, the results were about 60-70%.
As you can see from the attached graph with my heuristic (called "1. Iterierte")
I improved the Jaccard metric to get between 70 and 80 % hits.

