Projects

Studying Makefiles

Motivation

Makefiles are used in a huge number of open source projects. However, writing a Makefile by hand, or generating one with another tool, is very error-prone. Building an application from source is generally regarded as embarrassingly parallel, but errors in a Makefile, such as missing dependencies, can make a parallel build fail or produce incorrect results. This project entails analyzing and fixing Makefiles, and surveying their use in open source projects.

Goals

  1. Create a library in a real language (OCaml, Haskell, or Python; using the Makefile::Parser::GmakeDB Perl library for reference is an option) that understands the Makefile database format, i.e. the dump that GNU make prints with the -p switch.
  2. Using the library, create a tool that analyzes Makefiles to see whether they will work with the -j switch. In particular, check that the listed dependencies of each target are complete by copying only those dependencies into a separate directory tree and running the recipe there. Generate user-friendly error messages that explain which dependencies are missing.
  3. Catalog common errors that prevent -j from working correctly, e.g. missing dependencies, building archives piecemeal, etc.
  4. The tool should attempt to fix errors that are easy to fix, only bothering the user with tricky or severe errors.
  5. Do a study of all Makefile-based projects in a Linux distro to determine what the empirically best -j argument is for a few different hardware configurations. (How many Makefiles are there in total? How many break with -j out of the box? etc.)
  6. Can the best -j argument be inferred from some structure in the Makefile and the available hardware?
  7. On a machine with unlimited cores, does some property of the Makefile give a hint as to where increasing the -j argument will see diminishing returns? (Build in a RAM disk to avoid disk I/O.) Possibly use distcc to run compile jobs on a cluster to simulate this.
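
As a rough illustration of goals 1 and 6, the sketch below reads a rule database and estimates how many targets could be built at the same time. It is only a starting point under several assumptions: SAMPLE_DB is a hand-written, simplified excerpt in the style of a GNU make -p dump rather than real output, the names parse_rules and level_widths are hypothetical, pattern rules and other special entries are not filtered out, and the dependency graph is assumed to be acyclic.

```python
from collections import defaultdict

# Hypothetical, simplified excerpt in the style of a `make -p` database dump.
SAMPLE_DB = """\
# Files

# Not a target:
main.c:
#  Implicit rule search has not been done.

app: main.o util.o
#  recipe to execute (from 'Makefile', line 2):
\tcc -o app main.o util.o

main.o: main.c util.h
#  recipe to execute (from 'Makefile', line 5):
\tcc -c main.c

util.o: util.c util.h
#  recipe to execute (from 'Makefile', line 8):
\tcc -c util.c
"""


def parse_rules(db_text):
    """Map each target in the dump to its list of prerequisites."""
    rules = {}
    not_a_target = False
    for line in db_text.splitlines():
        if line.startswith("# Not a target"):
            not_a_target = True  # the next file entry is a plain source file
            continue
        # Skip blank lines, comments, and tab-indented recipe lines.
        if not line or line[0] in "#\t":
            continue
        head, sep, tail = line.partition(":")
        tail = tail.lstrip(":")  # tolerate double-colon rules
        if not sep or "=" in head or tail.startswith("="):
            continue  # variable assignment or other non-rule line
        if not_a_target:
            not_a_target = False
            continue
        # Order-only prerequisites (after "|") are kept as ordinary ones here.
        rules[head.strip()] = [p for p in tail.split() if p != "|"]
    return rules


def level_widths(rules):
    """Count targets per dependency level (level 1 depends only on sources)."""
    depth = {}

    def visit(target):
        if target not in rules:
            return 0  # a source file: nothing to build
        if target not in depth:
            depth[target] = 1 + max((visit(p) for p in rules[target]), default=0)
        return depth[target]

    for target in rules:
        visit(target)
    widths = defaultdict(int)
    for d in depth.values():
        widths[d] += 1
    return [widths[d] for d in sorted(widths)]


if __name__ == "__main__":
    rules = parse_rules(SAMPLE_DB)
    print(rules)
    print(level_widths(rules))
```

For the sample, the two object files sit on the first level and the link step on the second, so the widest level has two targets. The widest level is only a heuristic for where a larger -j argument stops helping; it ignores how long each recipe takes and is not necessarily the true maximum number of independent targets.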

Background

OCaml, Haskell, or Python; Makefiles
