Education

10/2022 to 02/2024

M.Sc. Computer Science
Hochschule Furtwangen University
Furtwangen, Germany

Grade: 1.3

Thesis: An empirical evaluation of data lakehouse table formats, advised by Prof. Dr. Lothar Piepmeyer and Dr. Andreas Weininger (IBM)

Research Project: Stream Processing for ROS-based Application Development

03/2024 to 06/2024

Exchange Semester
School of Computing, Korea Advanced Institute of Science and Technology (KAIST)
Daejeon, South Korea

GPA: 3.66

10/2007 to 08/2011

B.Sc. Computer Science (Online Media)
Hochschule Furtwangen University
Furtwangen, Germany

Grade: 1.3

Thesis: The use of Apache Mahout and the MapReduce paradigm to perform cluster analysis of big data on Amazon Web Services, advised by Prof. Dr.-Ing. Wolfgang Maass and Jonas Almeida, PhD (University of Alabama at Birmingham)

Work

05/2015 to 07/2022

Scientific Programmer at Michigan State University, Department of Epidemiology and Biostatistics, QuantGen Lab (Gustavo de los Campos, PhD)
East Lansing, MI

  • Developed scalable, modular, yet easy-to-understand big data analysis pipelines for high-performance computing (HPC) clusters (Slurm, TORQUE/Moab)

    • Maintained and preprocessed datasets

    • Maintained software installations

  • Provided technical support to lab members

  • Developed and maintained several R packages

    • BGData, a suite of packages for analysis of big genomic data

    • LinkedMatrix, a matrix-like class that links other matrix-like objects by rows or by columns

    • BEDMatrix, a matrix-like class that allows for efficient subsetting of PLINK .bed files

    • symDMatrix, a matrix-like class to represent symmetric matrices partitioned into file-backed blocks

    • crochet, an implementation helper for subsetting and replacement operators of custom matrix-like types

  • Held seminars on high-performance computing (HPC), general-purpose computing on graphics processing units (GPGPU), parallel programming in R (parallel package) and C (OpenMP), R package development, R itself, and other topics

  • Prepared publications and publication-ready plots

03/2011 to 05/2015

Scientific Programmer at University of Alabama at Birmingham, Department of Pathology, Division of Informatics (Jonas Almeida, PhD)
Birmingham, AL

  • Designed, developed, and maintained

    • Middleware for Node.js

      • Corser, a CORS middleware, available as an npm package and averaging millions of downloads per week

      • Bounce, a RESTful governance layer for MongoDB

    • Applications for CouchDB, a NoSQL database

    • Single Page Applications (SPA) and Chrome extensions

      • TCGA Toolbox, a modular, browser-based software ecosystem for analysis of health data (AngularJS), including various modules, e.g., for cluster analysis of Reverse Phase Protein Array (RPPA) data from The Cancer Genome Atlas (TCGA)

      • Sparqling, an RDF store in your browser

    • Support libraries and experimental software

03/2009 to 02/2011

Student Research Assistant at Hochschule Furtwangen University, Research Center for Intelligent Media (Prof. Dr.-Ing. Wolfgang Maass)
Furtwangen, Germany

  • Worked on an "Intelligent Bathroom", a use case for the EU-funded Interactive Knowledge Stack (IKS) project

    • Developed a service that identifies users by their height using the Microsoft Kinect motion-sensing input device (C#)

    • Developed a smart dresser app that suggests matching clothes from a wardrobe after scanning the QR code of an initial clothing item of interest (Android, Groovy, Grails)

  • Worked on a smart web-based service that assists salespeople during lead qualification (HTML, CSS, JavaScript, S3DB)

  • Developed a smart drug app that tracks a user’s medication list and checks for drug interactions when adding a new drug after scanning its QR code (Android, Groovy, Grails)

  • Set up an Apache Hadoop cluster for testing purposes

09/2008 to 02/2009

Intern at CAS Software AG
Karlsruhe, Germany

  • Optimized internal processes to capture and analyze time and attendance data (Excel, VBA)

  • Worked on CAS Campus, web-based campus management software

    • Implemented advanced search functionality (ASP, XML, XSLT, HTML, CSS, JavaScript)

    • Developed static code analysis tools (Groovy)

06/2004 to 09/2008

Web Developer at trenovis OHG
Villingen-Schwenningen, Germany

  • Converted Photoshop layouts to HTML, CSS, and JavaScript templates

  • Worked on infoinclude2, an in-house content management system (ColdFusion, MySQL)

  • Provided customer support

Volunteer Work

2013 to 2019

Open Source Developer at AnkiDroid, a spaced repetition flashcard app for Android with over a million users

  • Maintained the frontend and backend that capture bug reports (ACRA, CouchDB)

  • Managed the initial port to Chromebooks using the Android Runtime for Chrome (ARC)

  • Developed a pipeline that automatically publishes changes to the documentation (Travis CI)

  • Added new features

  • Fixed bugs

2014 to 2015

Data Wrangler at Code for Birmingham, a Code for America brigade

  • Maintained public data repositories

  • Worked on data analysis projects (e.g., analysis of traffic accident reports)

Publications

  1. A. Grueneberg, A. Mattes, L. Mendel, and J. Sobott, “Stream Processing for ROS-based Application Development,” informatikJournal, vol. 14.2023, pp. 47–53, 2023.

  2. B. D. Valente, G. de los Campos, A. Grueneberg, C.-Y. Chen, R. Ros-Freixedes, and W. O. Herring, “Using Residual Regressions to Quantify and Map Signal Leakage in Genomic Prediction,” Genetics Selection Evolution, vol. 55, no. 1, p. 57, Aug. 2023, doi: 10.1186/s12711-023-00830-1.

  3. G. de los Campos, A. Grueneberg, S. Funkhouser, P. Pérez-Rodríguez, and A. Samaddar, “Fine Mapping and Accurate Prediction of Complex Traits Using Bayesian Variable Selection Models Applied to Biobank-Size Data,” European Journal of Human Genetics, Jul. 2022, doi: 10.1038/s41431-022-01135-5.

  4. A. Gonzalez-Reymundez, A. Grueneberg, G. Lu, F. C. Alves, G. Rincon, and A. I. Vazquez, “MOSS: Multi-Omic Integration with Sparse Value Decomposition,” Bioinformatics, vol. 38, no. 10, pp. 2956–2958, May 2022, doi: 10.1093/bioinformatics/btac179.

  5. A. Grueneberg and G. de los Campos, “BGData - A Suite of R Packages for Genomic Analysis with Big Data,” G3: Genes, Genomes, Genetics, vol. 9, no. 5, pp. 1377–1383, May 2019, doi: 10.1534/g3.119.400018.

  6. M. Behring et al., “Integrated Landscape of Copy Number Variation and RNA Expression Associated with Nodal Metastasis in Invasive Ductal Breast Carcinoma,” Oncotarget, vol. 9, no. 96, pp. 36836–36848, Dec. 2018, doi: 10.18632/oncotarget.26386.

  7. H. Kim, A. Grueneberg, A. I. Vazquez, S. Hsu, and G. de los Campos, “Will Big Data Close the Missing Heritability Gap?,” Genetics, vol. 207, no. 3, pp. 1135–1145, Nov. 2017, doi: 10.1534/genetics.117.300271.

  8. H. Koo, J. A. Hakim, P. R. E. Fisher, A. Grueneberg, D. T. Andersen, and A. K. Bej, “Distribution of Cold Adaptation Proteins in Microbial Mats in Lake Joyce, Antarctica: Analysis of Metagenomic Data by Using Two Bioinformatics Tools,” Journal of Microbiological Methods, vol. 120, pp. 23–28, 2016.

  9. M. D. Cain, J. R. Siebert, E. Iriabho, A. Gruneberg, J. S. Almeida, and O. M. Faye-Petersen, “Development of Novel Software to Generate Anthropometric Norms at Perinatal Autopsy,” Pediatric and Developmental Pathology, vol. 18, no. 3, pp. 203–209, 2015.

  10. D. E. Robbins, A. Grüneberg, H. F. Deus, M. M. Tanik, and J. Almeida, “TCGA Toolbox: An Open Web App Framework for Distributing Big Data Analysis Pipelines for Cancer Genomics,” in Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics, 2013, p. 62.

  11. D. E. Robbins, A. Grüneberg, H. F. Deus, M. M. Tanik, and J. S. Almeida, “A Self-Updating Road Map of The Cancer Genome Atlas,” Bioinformatics, vol. 29, no. 10, pp. 1333–1340, 2013.

  12. J. S. Almeida et al., “ImageJS: Personalized, Participated, Pervasive, and Reproducible Image Bioinformatics in the Web Browser,” Journal of Pathology Informatics, vol. 3, 2012.

  13. J. S. Almeida, A. Grüneberg, W. Maass, and S. Vinga, “Fractal MapReduce Decomposition of Sequence Alignment,” Algorithms for Molecular Biology, vol. 7, no. 1, p. 12, 2012.

Poster Presentations

  1. A. Grueneberg and J. Almeida, “A Minimal Governance Layer for the Web of Linked Data,” at CSHALS, Boston, MA, Feb. 2014.

Teaching

09/2016 to 10/2016

R Programming: Essentials and Basics
5-week course parallel to EPI 855 ("Biostatistical Modeling in Genomic Data Analysis") led by Ana I. Vazquez, PhD, in the Department of Epidemiology and Biostatistics, Michigan State University

Certifications

06/2010

Basics of Municipal Broadband Networks
Ministry of Rural Affairs, Food, and Consumer Protection of Baden-Württemberg, Germany

Selected MOOCs

10/2020

Fundamentals of Accelerated Computing with CUDA C/C++
NVIDIA Deep Learning Institute

05/2015

Data Analysis and Statistical Inference
Duke University, Coursera

11/2013

Computing for Data Analysis
The Johns Hopkins University, Coursera

03/2013

Data Analysis
The Johns Hopkins University, Coursera

Fall 2013

Linguistics 102 - Speech Science
The Virtual Linguistics Campus, Philipps-Universität Marburg

Fall 2013

Linguistics 101 - Fundamentals
The Virtual Linguistics Campus, Philipps-Universität Marburg

12/2011

Machine Learning
ml-class.org, now Coursera

Other

Citizenship

German

Languages

German (native), English (fluent), Korean (basic), Japanese (basic)