Filename collisions on osx hfs+ filesystem #52

Open
jchodera opened this issue Oct 26, 2022 · 3 comments
Labels
bug Something isn't working

Comments

@jchodera
Member

The default OS X filesystem (HFS+) is case-insensitive. The files in des370k/SDFS/ are written out individually and named by SMILES string, rather than stored in a single multi-molecule SDF, and SMILES uses letter case to distinguish aromatic from aliphatic atoms. As a result, filenames that differ only in case collide and the repository cannot be properly checked out:

warning: the following paths have collided (e.g. case-sensitive paths
on a case-insensitive filesystem) and only one from the same
colliding group is in the working tree:

  'des370k/SDFS/C1CCCCC1.sdf'
  'des370k/SDFS/c1ccccc1.sdf'
  'des370k/SDFS/C1CCCNC1.sdf'
  'des370k/SDFS/c1cccnc1.sdf'
  'des370k/SDFS/CC1CCCCC1.sdf'
  'des370k/SDFS/Cc1ccccc1.sdf'
  'des370k/SDFS/OC1CCCCC1.sdf'
  'des370k/SDFS/Oc1ccccc1.sdf'
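
A minimal sketch of how such collisions could be detected ahead of time, for example on a Linux machine where the checkout succeeds. The function name `find_case_collisions` and the `CCO.sdf` entry are hypothetical, and `str.casefold()` is only an approximation of how HFS+ compares names (it should agree for plain ASCII names like these):

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Return groups of paths that differ only by letter case."""
    groups = defaultdict(list)
    for path in paths:
        # Paths with the same case-folded form would map to the same
        # file on a case-insensitive filesystem.
        groups[path.casefold()].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# A few paths from the warning above, plus one hypothetical
# non-colliding entry (CCO.sdf) for contrast.
paths = [
    "des370k/SDFS/C1CCCCC1.sdf",
    "des370k/SDFS/c1ccccc1.sdf",
    "des370k/SDFS/CC1CCCCC1.sdf",
    "des370k/SDFS/Cc1ccccc1.sdf",
    "des370k/SDFS/CCO.sdf",
]

for group in find_case_collisions(paths):
    print(group)
```

The list of paths could be taken from the output of `git ls-files`, which reads the index rather than the working tree and so should still list both members of each colliding pair.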

To resolve this, I repeat my previous suggestion: use a single multi-molecule SDF file in which all the molecules are collated, with each record titled appropriately within the file.
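
One way the collation could look, as a rough sketch rather than the repository's actual script. It assumes each file holds exactly one molecule, that the filename stem is the SMILES string to use as the title, and that it runs on a case-sensitive filesystem where every file is present. The function name `collate_sdf` is hypothetical. It relies only on the SDF conventions that the first line of a record is its title and that records end with a `$$$$` line:

```python
from pathlib import Path


def collate_sdf(sdf_dir, output_path):
    """Concatenate single-molecule SDF files into one multi-molecule SDF.

    Each record's title line is replaced with the source filename stem
    (assumed here to be the SMILES string). Returns the record count.
    """
    count = 0
    with open(output_path, "w") as out:
        for sdf_file in sorted(Path(sdf_dir).glob("*.sdf")):
            lines = sdf_file.read_text().splitlines()
            if not any(line.strip() for line in lines):
                continue  # skip empty files

            # Drop an existing "$$$$" terminator (and any blank lines
            # after it) so exactly one is written per record.
            end = len(lines)
            while end > 0 and not lines[end - 1].strip():
                end -= 1
            if lines[end - 1].strip() == "$$$$":
                lines = lines[: end - 1]

            lines[0] = sdf_file.stem  # title line of the record
            out.write("\n".join(lines) + "\n$$$$\n")
            count += 1
    return count
```

A call such as `collate_sdf("des370k/SDFS", "des370k/des370k.sdf")` would then produce the combined file. The output name is only an example.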

@jchodera added the bug label Oct 26, 2022
@peastman
Member

If you want to convert them to a single file, that would be fine.

@peastman
Member

One point to keep in mind, of course: a major purpose of this repository is to memorialize exactly how we created the dataset. If we replace the files and change the script accordingly, they will no longer match how we created the dataset. Granted, the existing script only works on Linux, but that's the script it was created with.

@jchodera
Member Author

Of course. We've memorialized that in the release that was cut. That's the record of what we used to create the dataset.

Can we document this as a known bug in the release notes and avoid this practice in future? If we intend to keep adding to this repo, we should also fix the bug, or else we will keep hitting this error.

2 participants