PyPaDS: Documentation!

Building on the MLflow toolset, PyPaDS aims to extend the existing tracking functionality and make logging as easy as possible for the user. Producing structured results is an additional goal of the extension.

Note

A larger update to the mapping files was released recently. Mapping files are now based on YAML and are more hook-centric. See the mapping file documentation for details.

Install PyPads

Logging your experiments manually can be overwhelming and exhausting. PyPads helps automate logging as much information as possible by tracking the libraries of your choice.

Getting started

Learn more about how to use PyPads, how to configure tracking events and hooks, how to map custom logging functions, and some of the core features of PyPads.

  • Usage example
    Decision Tree Iris classification
  • Mapping file example for Scikit-learn
    A mapping file is where we define the classes and functions to be tracked from the library of our choice. It includes the defined hooks.
  • Hooks and events
    • Events are defined primarily by their listeners, which in our case are hooks. When an event is triggered, the corresponding loggers are called. Logging functions are linked to these events via a mapping dictionary passed to the base class.
    • Hooks let the user define what triggers those events (e.g. which functions or classes should trigger a specific event).
  • Loggers
    Logging functions are called whenever a tracked method or class triggers their corresponding event. Events are mapped to logging functions by passing a dictionary as a parameter to the PyPads class.
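
The relationship between hooks, events and loggers described above can be sketched in plain Python. This is a minimal illustration only and does not use or reproduce the PyPads API: the `track` decorator, the `HOOK_EVENTS` and `EVENT_LOGGERS` dictionaries and the `records` list are hypothetical names introduced for this example. The hook and event names are borrowed from the default logger tables in this document.

```python
import functools

# Hypothetical table: which events fire when a given hook is triggered.
HOOK_EVENTS = {
    "pypads_fit": ["parameters", "output"],
    "pypads_predict": ["output"],
}

records = []  # stand-in for a real tracking backend


def log_parameters(name, args, kwargs, result):
    """Record the arguments of the tracked call."""
    records.append(("parameters", name, {"args": args, "kwargs": kwargs}))


def log_output(name, args, kwargs, result):
    """Record the return value of the tracked call."""
    records.append(("output", name, result))


# Hypothetical mapping dictionary: event name -> logging functions.
EVENT_LOGGERS = {
    "parameters": [log_parameters],
    "output": [log_output],
}


def track(hook):
    """Wrap a function so that calling it triggers the events of `hook`."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            for event in HOOK_EVENTS.get(hook, []):
                for logger in EVENT_LOGGERS.get(event, []):
                    logger(fn.__name__, args, kwargs, result)
            return result
        return wrapper
    return decorator


@track("pypads_fit")
def fit(data, depth=3):
    return f"model(depth={depth}, n={len(data)})"


@track("pypads_predict")
def predict(model, sample):
    return 0


fit([1, 2, 3], depth=2)
predict("model", [5.1])
for record in records:
    print(record)
```

In this sketch the hook decides when something happens, and the mapping dictionary decides what gets logged, so adding a logger does not require touching the tracked functions.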

The following tables show the default loggers of PyPads.

  • Event Based loggers

    Logger          | Event      | Hook                                                                | Description
    ----------------|------------|---------------------------------------------------------------------|------------
    LogInit         | init       | 'pypads_init'                                                       | Debugging purposes
    Log             | log        | 'pypads_log'                                                        | Debugging purposes
    Parameters      | parameters | 'pypads_fit'                                                        | tracks parameters of the tracked function call
    Cpu, Ram, Disk  | hardware   | 'pypads_fit'                                                        | track usage information, properties and other info on CPU, Memory and Disk.
    Input           | input      | 'pypads_fit'                                                        | tracks the input parameters of the current tracked function call.
    Output          | output     | 'pypads_predict', 'pypads_fit'                                      | Logs the output of the current tracked function call.
    Metric          | metric     | 'pypads_metric'                                                     | tracks the output of the tracked metric function.
    PipelineTracker | pipeline   | 'pypads_fit', 'pypads_predict', 'pypads_transform', 'pypads_metrics' | tracks the workflow of execution of the different pipeline elements of the experiment.

  • Pre/Post run loggers

    Logger      | Pre/Post | Description
    ------------|----------|------------
    IGit        | Pre      | Source code management and tracking
    ISystem     | Pre      | System information (os, version, machine…)
    ICpu        | Pre      | Cpu information (Nbr of cores, max/min frequency)
    IRam        | Pre      | Memory information (Total RAM, SWAP)
    IDisk       | Pre      | Disk information (disk total space)
    IPid        | Pre      | Process information (ID, command, cpu usage, memory usage)
    ISocketInfo | Pre      | Network information (hostname, ip address)
    IMacAddress | Pre      | Mac address
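
To give a feel for the kind of information the pre-run loggers listed above collect, here is a minimal sketch that uses only the Python standard library. It is not how PyPads implements these loggers; `collect_pre_run_info` and the keys of the returned dictionary are hypothetical names chosen for this example. Memory details are left out because the standard library has no portable way to query them.

```python
import os
import platform
import shutil
import socket
import uuid


def collect_pre_run_info(path="."):
    """Gather a snapshot of system information before a run starts."""
    disk = shutil.disk_usage(path)
    # uuid.getnode() returns the hardware address as a 48-bit integer.
    # If no address can be found it falls back to a random number.
    node = uuid.getnode()
    mac = ":".join(f"{(node >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))
    return {
        "system": {
            "os": platform.system(),
            "version": platform.version(),
            "machine": platform.machine(),
        },
        "cpu": {"cores": os.cpu_count()},
        "disk": {"total_bytes": disk.total},
        "process": {"pid": os.getpid()},
        "network": {"hostname": socket.gethostname()},
        "mac_address": mac,
    }


info = collect_pre_run_info()
for key, value in info.items():
    print(key, value)
```

Collecting this snapshot once before the experiment starts keeps the environment description separate from the per-call information gathered by the event based loggers.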

PyPads

Extensions

  • PaDRe-Pads is a tool that builds on PyPads and adds semantics to the tracked data of machine learning experiments. See the padre-pads documentation.

Changelog

0.2.3 (2020-06-23)

  • Bump version: 0.2.2 → 0.2.3. [mehdi]

0.2.2 (2020-06-23)

  • Bump version: 0.2.1 → 0.2.2. [mehdi]

0.2.1 (2020-06-22)

New

  • Added changelog to documentation. [Thomas Weißgerber]

  • Plugin system support, YAML format for mapping files, and importlib performance rebuild. [Thomas Weißgerber]

  • Added mapping file yaml support. [Thomas Weißgerber]


Changes

  • Updated Readme’s. [Thomas Weißgerber]

Fix

  • Managing git repository for Ipython notebooks. [mehdi]
  • Removed comment. [Thomas Weißgerber]
  • Updated the doc to include references to other projects. [Thomas Weißgerber]

Other

  • Bump version: 0.2.0 → 0.2.1. [Thomas Weißgerber]

0.2.0 (2020-06-22)

  • Bump version: 0.1.20 → 0.2.0. [Thomas Weißgerber]

0.1.20 (2020-05-19)

  • Bump version: 0.1.19 → 0.1.20. [Thomas Weißgerber]

0.1.19 (2020-05-19)

  • Bump version: 0.1.18 → 0.1.19. [Thomas Weißgerber]

0.1.18 (2020-05-19)

About Us

This work has been developed within the Data Science Chair of the University of Passau. It has been partially funded by the Bavarian Ministry of Economic Affairs, Regional Development and Energy through the funding program “Internetkompetenzzentrum Ostbayern”, as well as by the German Federal Ministry of Education and Research in the project “Provenance Analytics” with grant agreement number 03PSIPT5C.