data-pipeline contains notebooks and functions to streamline data processing from raw IFCB format to CNN-labeled and validated datasets. These tools are compatible with Azure blob storage and PIVOT but can also be used on locally stored data.
utopia-pipeline-tools contains dictionaries, functions, and classes implemented in the data-pipeline notebooks. This code is packaged and registered on PyPI so it can be installed with pip.
PIVOT is a Python-based web application developed by Data Science MS students at the University of Washington. PIVOT provides a way to quickly validate the classification results of our CNNs.
ifcb-tools contains code to prepare IFCB data for EcoTaxa. This repository has been forked from the OceanOptics repository developed by researchers at UMaine. ifcbUTOPIA’s pipeline from raw to classified data involves running programs stored in ifcb-tools.
ml-workflow is dedicated to CNN development and storage. It holds the most current versions of the CNNs we use to classify IFCB images into 10 broad taxonomic groups.
plankton-CNN-DEMO provides a small-scale example of machine learning-based phytoplankton image classification.
model-development-archive contains work completed by Emmett Culhane in 2019-2020. This code was a precursor to our current development.