Conversation
saraheisenach
left a comment
There was a problem hiding this comment.
Nice work :) ! Main things that I think need fixing are:
- Formatting seems to be messed up - a .flake8 file was added and I think maybe that's messing up the pre-commit formatting that should be in place. Try resetting up pre-commit and deleting .flake8 and running all the files through it
- You accidentally (I think) deleted some files, so those need to be added back in
- A few code changes and moving some stuff around, but aside from that it looks great!
This comment was marked as resolved.
This comment was marked as resolved.
saraheisenach
left a comment
There was a problem hiding this comment.
Thanks for making those changes! A few things:
- There are still some comments from last week that you may have missed. I also added a couple new ones
- Did you set up the pre-commit with the config in the repo so that it runs before you commit and is that working correctly? I noticed a couple instances where the code style had changed but you didn't add anything, so just wanted to see if that was working properly for you
|
Hi Sarah, looks like I glossed over a couple of things--sorry about that.
|
saraheisenach
left a comment
There was a problem hiding this comment.
Great job! I only see a few small bug fixes and then a couple places where we can remove some unnecessary parameters
|
A note on GetY dimensionality bug for documentation purposes
|
matdeeplearn/common/data.py
Outdated
| def get_dataset( | ||
| data_path, target_index: int = 0, transform_type="GetY", large_dataset=False | ||
| data_path, | ||
| transform_list: list = [], |
There was a problem hiding this comment.
Do we want this to be
| transform_list: list = [], | |
| transform_list: list = ["GetY"], |
This goes back to the comment where we were discussing the GetY behavior. I am under the impression that if GetY isn't included as a transform, somewhere within our code, there will be a failure because y is used. If this is the case, do we want to throw an error here if GetY isn't included? I think in the future it would make sense to change how we approach y vs other targets, but for now, we are dependent on using y, so I think it would fail, but double check, because maybe I'm wrong
There was a problem hiding this comment.
I think it should remain transform_list: List[dict] = []. Since we instantiate GetY with an argument, and possibly more in the future, the assertion makes more sense for now as the user should specify GetY.
| r=cutoff_radius, | ||
| n_neighbors=n_neighbors, | ||
| edge_steps=edge_steps, | ||
| transforms=dataset_config.get("transforms", []), |
There was a problem hiding this comment.
See my other comment above about maybe making the default list to be this
| transforms=dataset_config.get("transforms", []), | |
| transforms=dataset_config.get("transforms", ["GetY"]), |
| dataset = get_dataset(dataset_path, target_index) | ||
| dataset = get_dataset( | ||
| dataset_path, | ||
| transform_list=dataset_config.get("transforms", []), |
There was a problem hiding this comment.
If you decide that GetY is needed in order to avoid errors, then I think this should instead be
| transform_list=dataset_config.get("transforms", []), | |
| transform_list=dataset_config.get("transforms", ["GetY"]), |
There was a problem hiding this comment.
Same as I commented above, the issue is that the transforms list holds dictionaries with parameters. So it would be necessary to specify this in the config.
|
In both |
b1f3b86 to
97ee227
Compare
common/transforms.pyvia decorator and specify an arbitrary list of them to compose via the config file. These are applied during preprocessing.