Review of plot_column_values() in draw.py
-
Don't hardcode paths; make them a variable (e.g. 1). We would like to be able to run this code on any file in any folder.
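As a rough sketch of what this could look like: the helper below takes the input file and the output folder as arguments instead of baking them into the code. The function name `output_path_for`, the `_xy.png` suffix and the example folder names are hypothetical and not taken from draw.py.

```python
from pathlib import Path


def output_path_for(data_file: Path, output_dir: Path, suffix: str = "_xy.png") -> Path:
    """Build the path where the plot for `data_file` should be saved.

    Args:
        data_file (Path): Input data file, located in any folder.
        output_dir (Path): Folder that will receive the plot.
        suffix (str): Text appended to the input file's stem (hypothetical default).

    Returns:
        Path: `output_dir` joined with the input stem plus `suffix`.
    """
    # Nothing here depends on a fixed location: both paths come from the caller.
    return output_dir / f"{data_file.stem}{suffix}"


# Hypothetical usage: the same code works for any file in any folder.
plot_path = output_path_for(Path("recordings/session_01.csv"), Path("figures"))
```

The caller (a script, a notebook, or a command-line entry point) then decides which file to process, and the function itself never needs editing.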
-
Include docstrings and specify types. Explain the purpose of each parameter. Here it's not clear to me what
keep_test_mouse is for.
Example:
def add(a: int, b: int) -> int:
    """Add two integers.

    Args:
        a (int): First integer.
        b (int): Second integer.

    Returns:
        int: The sum of `a` and `b`.
    """
    return a + b
- Here 'nothing' should become a module-level constant; you can define it after the import section, before all the functions:
FILLNA_BEHAVIOR = 'nothing'
..
phase_data['manual_annot'] = phase_data['manual_annot'].fillna(FILLNA_BEHAVIOR)
-
If I got this correctly, this function does 2 things: 1) fills the NaNs in the manual annotation column, and 2) plots x and y position over time. Can we split it into 2 functions? And can we use 'draw.py' only for the plotting functions and create another file for data cleaning / data preprocessing? It would be cool to start thinking of a "pipeline".
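A possible shape for the split, as a sketch only: one function for the cleaning step and one for the plotting step. The function names, the `time`, `x` and `y` column names, and the idea of passing in an existing matplotlib Axes are assumptions for illustration; only `manual_annot` and the 'nothing' fill value come from the code under review.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

import pandas as pd

if TYPE_CHECKING:  # matplotlib is only needed for the type hint here
    from matplotlib.axes import Axes

FILLNA_BEHAVIOR = "nothing"


# Would live in a separate preprocessing module (hypothetical file).
def fill_missing_annotations(
    data: pd.DataFrame,
    column: str = "manual_annot",
    fill_value: str = FILLNA_BEHAVIOR,
) -> pd.DataFrame:
    """Return a copy of `data` with missing values in `column` replaced."""
    cleaned = data.copy()  # leave the caller's DataFrame untouched
    cleaned[column] = cleaned[column].fillna(fill_value)
    return cleaned


# Would stay in draw.py, which then only contains plotting code.
def plot_xy_over_time(
    data: pd.DataFrame,
    ax: Axes,
    time_col: str = "time",  # hypothetical column names
    x_col: str = "x",
    y_col: str = "y",
) -> Axes:
    """Draw x and y position against time on an existing Axes."""
    ax.plot(data[time_col], data[x_col], label=x_col)
    ax.plot(data[time_col], data[y_col], label=y_col)
    ax.set_xlabel(time_col)
    ax.set_ylabel("position")
    ax.legend()
    return ax
```

A pipeline step would then read roughly as `plot_xy_over_time(fill_missing_annotations(data), ax)`, with each stage testable on its own.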
-
Again on this: it's not clear to me why we need to create subsets of the whole dataset (e.g. data = data[cols] and phase_data = data[data['phase'] == phase]). Can you avoid it and just select the columns you need?
-
I would review the other functions below in terms of:
- docstrings, types, and descriptions of variables and functions
- keeping each function short and limited to a single step, when possible
- avoiding hardcoded parameters
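On the earlier question about subsetting the dataset, one possible reading is sketched below: select the rows for a phase and only the needed columns in a single `.loc` call, without overwriting `data` or keeping intermediate copies around. The function name and the example column names are hypothetical; only the `phase` column comes from the code under review.

```python
import pandas as pd


def select_phase_columns(data: pd.DataFrame, phase: str, columns: list[str]) -> pd.DataFrame:
    """Return only `columns` for the rows whose 'phase' equals `phase`.

    Args:
        data (pd.DataFrame): Full dataset; it is not modified.
        phase (str): Value of the 'phase' column to keep.
        columns (list[str]): Names of the columns needed downstream.

    Returns:
        pd.DataFrame: The matching rows, restricted to `columns`.
    """
    # Row filter and column selection happen in one step.
    return data.loc[data["phase"] == phase, columns]
```

Because `phase` and `columns` are parameters, the same helper also covers the last point in the list above: nothing about the selection is hardcoded.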