ASL FAQs
Course Organization and Learning
- How do I study the material beyond just reading the slides?
Many slides contain links to background material for further reading. Explanations of specific technical terms and answers to specific questions can often also be found on the web.
- How do I train for the exam?
You can train by solving prior exams and homeworks, all of which are available on the website. Note, however, that the material has slowly evolved over time and is occasionally updated for new processor lines.
- How is the project graded?
Note that the project should be essentially finished by the time of the final presentations. The main basis for the grade is the final report, augmented with the information we got from the presentation, the one-on-one meetings, and the code in your repository. Since the report carries the most weight, very good meetings do not help if the report is bad.
Grading projects is never a precise science, but we do it carefully in a meeting where all projects are discussed and compared. Besides performing various runtime optimizations, it is also important to be analytical, i.e., to reason about and understand the measured effects on runtime performance and the limits of what is possible. The project system provides a step-by-step guide. We understand that every project is different and that some techniques or analyses may not apply.
We may reduce the grade for team members who did not contribute properly. If this happens, we make sure that the other team members are not disadvantaged. The project grade (as points out of 100) is communicated after the grade conference upon request.
- Can I use C++ in my project?
Yes, of course. C++ can be a good choice for the infrastructure. The performance-critical parts, however, you will need to write in C (possibly with SIMD intrinsics) to apply what was learned in class.
- Can I use Euler for the project?
You can use any machine for the project. However, cluster machines may be running other load, which makes measuring difficult. Thus, we recommend using your own machines for the project. If the members of a team have different processors/architectures, you can pick one or two. Most optimizations will be similar. If something becomes processor-specific, you can show this in the results.
- If I repeat the course, can I take part of the grade (e.g., the project) with me?
No (ETH regulations).
- Is there a second opportunity in case I cannot attend the midterm?
No. If you miss the midterm, you lose the points. However, we always announce the date at the beginning of the semester so you can plan.
- What if I need my grade earlier?
The final grade is communicated, as usual, after the grade conference. If you need to know earlier whether you passed (e.g., to start a Master thesis), please contact Ms. Spicher (study administration) for help. If appropriate, she will then ask us. The points for homeworks and the midterm are communicated during the semester; the points for the project are communicated after the grade conference upon request.
Course Content
- How do I disable Turbo Boost on OS X? (no guarantees for the below)
- Turbo Boost Switcher
- How do I disable AMD Precision Boost? (no guarantees for the below)
- AMD Ryzen Master Utility
- Clocktuner for Ryzen
These tools do not turn Precision Boost off, but they let you control the frequency of the cores.
- Why is it necessary to disable Intel Turbo Boost to get precise measurements, when using RDTSC?
The TSC may run on a different core than your code. If these cores run at different frequencies, you get inaccurate timings.
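To make the measurement setup concrete, here is a minimal sketch of timing a kernel with RDTSC. It assumes an x86 processor and GCC or Clang (which provide `__rdtsc()` in `<x86intrin.h>`); the kernel `sum_array` and the helper `measure_ticks` are hypothetical names chosen for illustration, not part of the course infrastructure. The returned tick count can only be interpreted as a cycle count if the core frequency is fixed during the measurement, which is why Turbo Boost should be disabled.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc(); assumes x86 with GCC or Clang */

/* Hypothetical kernel to be measured: sums n doubles. */
static double sum_array(const double *x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++) {
        s += x[i];
    }
    return s;
}

/* Runs the kernel `reps` times and returns the average number of TSC
   ticks per run. The accumulated result is written to *result so the
   compiler cannot discard the computation. */
static double measure_ticks(const double *x, int n, int reps, double *result) {
    double sink = 0.0;
    uint64_t start = __rdtsc();
    for (int r = 0; r < reps; r++) {
        sink += sum_array(x, n);
    }
    uint64_t end = __rdtsc();
    *result = sink;
    return (double)(end - start) / reps;
}
```

Averaging over several repetitions reduces the influence of the timer overhead and of one-off effects such as cold caches; how many repetitions are appropriate depends on the kernel.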
- How do exp(x), sin(x), etc. count in the flopcount of an algorithm?
First, these functions are implemented differently in different libraries. Second, the actual flop count and runtime can depend on the input. In most computations, these functions are used far less often than adds and mults and can thus be ignored. If not, a value of about 30 flops per call can be reasonable.
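As an illustration of this counting convention, the following sketch computes a flop count from per-operation counts. The struct `op_counts` and the function `flop_count` are hypothetical helpers, and the cost per transcendental call is a parameter: passing 30 follows the rough value suggested above, passing 0 corresponds to ignoring these functions.

```c
/* Operation counts of some kernel (hypothetical helper type). */
typedef struct {
    long adds;             /* floating point additions */
    long mults;            /* floating point multiplications */
    long transcendentals;  /* calls to exp, sin, etc. */
} op_counts;

/* Total flop count, charging each transcendental call a fixed cost.
   cost_per_call = 30 is the rough estimate; 0 ignores these calls. */
long flop_count(op_counts c, long cost_per_call) {
    return c.adds + c.mults + cost_per_call * c.transcendentals;
}
```

For example, a loop computing y[i] = a * exp(x[i]) + b over 100 elements has 100 adds, 100 mults, and 100 exp calls, which under the assumption of 30 flops per call gives 100 + 100 + 3000 = 3200 flops.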
- Is the FMA (fused multiply–add) instruction considered as one floating point operation, or two floating point operations?
In the flop count (used to compute performance) we always count the mathematical ops performed, so 2 for FMA. In the processor it is done as one instruction (if supported).
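A small sketch of how this counting rule plays out for a dot product. The function names `dot`, `dot_flops`, and `performance` are hypothetical and chosen for illustration. Each loop iteration performs one multiplication and one addition; depending on the compiler, flags, and processor, these may or may not be fused into one FMA instruction, but the flop count is 2 per iteration either way.

```c
/* Dot product: one multiply and one add per iteration. The compiler
   may fuse them into a single FMA instruction if the target supports it. */
double dot(const double *x, const double *y, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++) {
        s += x[i] * y[i];
    }
    return s;
}

/* Flop count of dot: 2 mathematical operations per iteration,
   independent of whether FMA instructions are used. */
long dot_flops(int n) {
    return 2L * n;
}

/* Performance in flops/cycle from a flop count and measured cycles. */
double performance(long flops, double cycles) {
    return (double)flops / cycles;
}
```

With this convention, a processor that can issue two FMA instructions per cycle on scalars would have a peak of 4 flops/cycle for this kind of code, even though it executes only two instructions per cycle.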