Training resources for Accelerating Data Engineering Pipelines on Baskerville.

baskerville-hpc/data-engineering

Data Engineering

About The Project

Training resources for Accelerating Data Engineering Pipelines on Baskerville HPC. This course covers:

  1. Setting up your environment on Baskerville
  2. Data on the Hardware Level with Pandas, cuDF and Dask
  3. Data Visualisation with Plotly
  4. Final Challenge

Getting Started

To take this course, you will need a registered account on Baskerville. Details for requesting access can be found here.

Prerequisites

This course is aimed at beginners; however, some familiarity with the following may be beneficial:

  • Python
  • Jupyter notebooks
  • Pandas

License

This work is licensed under the GNU General Public License v3.0. See LICENSE.md for more information.

Contact

Email us: [email protected]

Project Link: https://github.com/baskerville-hpc/data-engineering

Acknowledgments

Baskerville is funded by the EPSRC and UKRI through the World Class Labs scheme (EP/T022221/1) and the Digital Research Infrastructure programme (EP/W032244/1).
