Energy efficiency has become a first-class citizen in the design of large computing systems. While GPUs and custom processors show merit in this regard, reconfigurable architectures such as FPGAs promise another major improvement in energy efficiency, constituting a middle ground between fixed hardware architectures and custom-built ASICs. This tutorial shows how high-level synthesis (HLS) can be harnessed to productively achieve scalable pipeline parallelism on FPGAs. Attendees will learn how to target FPGA resources from high-level C++ or OpenCL code and how to guide the mapping from imperative code to hardware, enabling them to develop massively parallel designs. We treat well-known examples from the software world, relating traditional code optimizations to hardware and building on existing knowledge when introducing new topics. By bridging the gap between software and hardware optimization, our tutorial aims to enable developers from a wide range of backgrounds to start tapping into the potential of FPGAs for high-performance codes.