A complete walkthrough from data production to model refinement

Image derived from “Alaskan pipeline” by Carol M. Highsmith, Library of Congress, Public Domain

Building an efficient input pipeline is an important performance optimization for training deep neural networks. Beyond that, data provisioning needs to be well structured and transparent so it can be ruled out as a source of errors in your training. While many current developments run on PyTorch, TensorFlow is still the way to go if you plan to deploy to edge devices or to run on giant training clusters with terabytes of data. This is where the tf.data API with tf.data.Dataset comes in: an efficient pipeline that provides you with training data which…
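To make the idea concrete, here is a minimal sketch of a tf.data input pipeline. The toy arrays and the batch size are illustrative assumptions, not part of the original article; the pattern of shuffle, parallel map, batch, and prefetch is the standard tf.data recipe for keeping the GPU fed.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data standing in for real images and labels.
images = np.random.rand(100, 32, 32, 3).astype("float32")
labels = np.random.randint(0, 10, size=100)

ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .shuffle(buffer_size=100)            # randomize sample order each epoch
    .map(lambda x, y: (x / 255.0, y),    # per-element preprocessing
         num_parallel_calls=tf.data.AUTOTUNE)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)          # overlap data prep with training steps
)

for batch_x, batch_y in ds.take(1):
    print(batch_x.shape)
```

The `prefetch` call at the end is the key performance step: it lets the pipeline prepare the next batch on the CPU while the current one is being consumed on the GPU.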


A hardware/software guide based on our setup.

‘All-In-One’, own work

Getting a GPU machine running with recent versions of CUDA, TensorFlow, and PyTorch takes quite a few steps. There are many guides out there, but it is still a gamble to get the right combination of hardware together with the right combination of software. This guide shows one combination that currently works at our office and does the job. If you want to reproduce this setup, or want clues about how we got things running, read on.
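Before wrestling with version matrices, a quick sanity check of which frameworks are actually importable on the machine can save time. This is a small sketch, not from the original guide; the package names are the usual import names for the two frameworks.

```python
import importlib.util

def installed(pkg: str) -> bool:
    # True if the package can be located by Python's import machinery.
    return importlib.util.find_spec(pkg) is not None

# Hypothetical check list; adjust to the stack you are installing.
for pkg in ("tensorflow", "torch"):
    print(f"{pkg}: {'found' if installed(pkg) else 'missing'}")
```

Once both imports resolve, the remaining gamble is the CUDA/driver pairing, which each framework documents in its own compatibility table.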

We start with the hardware we got and our decisions behind choosing these components…

Florian Hermisch

Computer Scientist, Consultant, Founder of Riitail.
