# How to Write Fast Numerical Code - Spring 2016

# Basic Information

- Course description, goals, integrity, knowledge base
- Course number: 263-2300, 6 credits
- Spring 2016, lectures: M 10:15-12:00, HG D3.2; Th 9:15-10:00 CAB G51; occasional substitute lectures: W 13:15-15:00 HG D3.2
- **Instructor**: Markus Püschel
- **TAs**:
  - Alen Stojanov
  - Gagandeep Singh
  - Only for project supervision: Daniele Spampinato
  - Only for project supervision: Georg Ofenbeck

# Grading

- **40% research project**
- **25% midterm**
- **35% homework**
- **There is no final exam**

# Research Project

No. | Title | Supervisor/s |
---|---|---|
1 | Optimal binary search trees | GO |
2 | Short-range molecular dynamics | MP |
3 | Edmonds-Karp algorithm | MP |
4 | K-means clustering | AS, GS |
5 | Classification using Discriminative Restricted Boltzmann Machines | DS |
6 | Elliptic curve cryptography | AS, GS |
7 | Transitional Markov Chain Monte Carlo | AS, GS |
8 | Coresets for Soft Clustering with Bregman Divergences | MP |
9 | Restarted GMRES on Frank matrices | DP |
10 | KAZE feature detection on ARM | AS, GS |
11 | Viterbi for spelling correction | GO |

# Midterm

April 20th, 13:15-15:00, HG E5 (without solution, with solution).

# Homework

Homework | Deadline | Solution |
---|---|---|
Homework 0 | as soon as possible | |
Homework 1 | March 11th, 5pm | Solution |
Homework 2 | March 18th, 5pm | Solution |
Homework = work on project | April 4th, 5pm | |
Homework 3 | April 11th, 5pm | Solution |
Homework = work on project | April 27th, 5pm | |
Homework 4 | May 12th, 5pm | Solution |

# Lectures (including pdfs)

Date | Content | Notes | Other |
---|---|---|---|
22.02 | Course motivation, overview, organization | | |
25.02 | Recap asymptotic analysis, cost and performance analysis | | |
29.02 | Intel Core Microarchitecture, compute/memory bound | Intel Optimization Manual | |
03.03 | Superscalar processors, instruction-level parallelism | | |
07.03 | Cancelled, moved to W 09.03 | | |
09.03 | Benchmarking, compiler limitations | | |
10.03 | Compiler limitations, caches | | |
14.03 | Cancelled, moved to W 16.03 | | |
16.03 | Caches, blocking MMM | notes | |
17.03 | Cancelled | | |
21.03 | Roofline model | notes | paper |
24.03 | Dense linear algebra software, ATLAS | | |
04.04 | Optimizing MMM, model-based ATLAS | notes | |
07.04 | Optimizations related to register renaming and virtual memory | notes | |
11.04 | Memory bound computations, sparse MVM | | |
14.04 | Sparse MVM continued, SIMD vectorization | | |
18.04 | Cancelled (Sechseläuten & exam W 20.04) | | |
20.04 | Midterm exam (HG E5) | | |
21.04 | SIMD vector extensions | Intel Intrinsics Guide | |
25.04 | SIMD vector extensions | | |
28.04 | Linear transforms and algorithms | notes | |
02.05 | Fast Fourier transform | notes | |
09.05 | Optimizing FFT, FFTW | notes | |
12.05 | Cancelled | | |
19.05 | Autotuning and machine learning | | |
23.05 | Program generation for performance (Spiral) | | |
26.05 | Cancelled | | |
30.05 | Cancelled, moved to W | | |
01.06 | Project presentations | | |
02.06 | Project presentations | | |