Homework Projects

Master's thesis

For my Master's thesis I worked in Prof. Dr. Julia Vogt’s Medical Data Science Group on administrative claims patient data. The goal of the project was to find better representations for diseases coded according to the ICD standard. We captured co-occurrence information with Word2Vec and hierarchical information with HyperE. Our model then combined Euclidean and hyperbolic machine learning using PyTorch with graph neural networks using PyTorch Geometric.
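To illustrate why hyperbolic embeddings suit hierarchies such as the ICD tree, here is a minimal sketch of the distance function in the Poincaré ball model, next to the ordinary Euclidean distance. It assumes the Poincaré ball is the hyperbolic model in use; the function names are illustrative and this is not the thesis code.

```python
import math


def euclidean_distance(u, v):
    """Ordinary straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    Uses d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Hypothetical 2-D embeddings: a root-like point near the origin and
# two leaf-like points near the boundary of the ball.
root = [0.0, 0.0]
leaf_a = [0.9, 0.0]
leaf_b = [0.0, 0.9]

d_root_leaf = poincare_distance(root, leaf_a)
d_leaf_leaf = poincare_distance(leaf_a, leaf_b)
```

Distances grow rapidly towards the boundary of the ball, so points near the edge (leaves) end up far from each other while staying comparatively close to points near the origin (ancestors). This mirrors the structure of a tree.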

Homework projects

During the Master's degree, many projects had to be completed for the different courses. Here is a selection:

Advanced machine learning

The Advanced Machine Learning course includes four tasks that have to be completed:

Advanced systems lab project

The objective of the advanced systems lab is to design, write and evaluate fast C code. As a test of our abilities we had to complete a project. Our group decided on the Baum-Welch algorithm, which is a special case of the expectation-maximization (EM) algorithm for hidden Markov models. As part of the project, our group reordered the steps of the EM algorithm, unrolled loops, inserted SIMD instructions and checked the performance with Valgrind.
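For readers unfamiliar with the algorithm, the following is a minimal, unoptimized sketch of one Baum-Welch iteration for a discrete hidden Markov model. It is written in Python for readability and is not the group's C implementation. All names are illustrative, and the unscaled probabilities are only numerically safe for short observation sequences.

```python
def forward(pi, A, B, obs):
    """alpha[t][i] = P(obs[0..t], state_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([B[j][o] * sum(prev[i] * A[i][j] for i in range(n))
                      for j in range(n)])
    return alpha


def backward(A, B, obs):
    """beta[t][i] = P(obs[t+1..T-1] | state_t = i)."""
    n = len(A)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def baum_welch_step(pi, A, B, obs):
    """One EM iteration; returns updated (pi, A, B) and the old likelihood."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha = forward(pi, A, B, obs)
    beta = backward(A, B, obs)
    likelihood = sum(alpha[-1])

    # E-step: posterior state and transition probabilities.
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
             for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(n)] for i in range(n)] for t in range(T - 1)]

    # M-step: re-estimate the parameters from expected counts.
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1))
              / sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k)
              / sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_pi, new_A, new_B, likelihood


# Hypothetical two-state, two-symbol model and a short observation sequence.
pi0 = [0.6, 0.4]
A0 = [[0.7, 0.3], [0.4, 0.6]]
B0 = [[0.5, 0.5], [0.1, 0.9]]
observations = [0, 1, 1, 0, 1]

pi1, A1, B1, likelihood0 = baum_welch_step(pi0, A0, B0, observations)
```

The nested loops over time steps and states in `forward`, `backward` and the M-step are the kind of hot spots that loop reordering, unrolling and SIMD instructions target in an optimized implementation.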

Computational intelligence lab project

One main focus of the computational intelligence lab is how to model data (images, text, etc.). Part of the course is a project, for which we chose sentiment analysis of tweets. We tried BERT and ALBERT models of different sizes, as well as lexical normalization with MoNoise.

Machine learning for healthcare

The course machine learning for healthcare reviews the most relevant methods and applications of machine learning in biomedicine. To get hands-on experience we completed several projects.

Partisan responses

For the course Sequencing Legal DNA we had to do a course project. Our group decided on the big project of generating partisan responses based on U.S. congressional speeches.

The subtasks for this project were:

Since that pipeline did not work out as hoped, and to establish a baseline, we also tested the text generation capabilities of GPT-2.

Notes

In my second semester it dawned on me that I could keep the notes for the mandatory lectures on the computer instead of in my bad handwriting.

Big data

The big data course has many mandatory readings.

Here is a selection of readings:

Learning group

The natural language lab of ETH has a weekly learning group that discusses interesting papers.

Here are some that I have read:

Disease representations

For the Master's thesis I had to read a number of research papers. Notes on some of them are in this folder.

Some of the papers I have studied: