Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add projects/* exception for gitignore filter #1093

Merged
merged 1 commit into from
Mar 31, 2024

Conversation

similato87
Copy link
Collaborator

This hotfix addresses issue #1090, where files expected to be visible in the file selection process were missing. The issue stemmed from a recent addition to our Git filtering logic, designed to streamline file management but inadvertently excluding all files within the "projects/*" folder—a common location for users to store temporary project files.

Recognizing the need to maintain both the convenience for users storing projects in the projects/ folder and the requirement for GPTE developers to avoid unnecessary file uploads, we introduced a specific exception within our filtering logic. This adjustment ensures the projects/ folder remains unaffected by the filter, thus preserving file visibility.

With this change, the problem highlighted in file_selector.toml, where no files were displayed, is effectively resolved. Users can now see all pertinent files under the running directory, ensuring smooth and uninterrupted access to their project files.

@similato87 similato87 self-assigned this Mar 31, 2024
Copy link

codecov bot commented Mar 31, 2024

Codecov Report

Attention: Patch coverage is 0% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 84.29%. Comparing base (28caf0e) to head (025d5f3).

Files Patch % Lines
gpt_engineer/applications/cli/file_selector.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1093   +/-   ##
=======================================
  Coverage   84.29%   84.29%           
=======================================
  Files          27       27           
  Lines        1566     1566           
=======================================
  Hits         1320     1320           
  Misses        246      246           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ATheorell ATheorell merged commit 67e8a3f into main Mar 31, 2024
7 checks passed
@ATheorell
Copy link
Collaborator

I realized I should have thought and asked more before merging. My question is this:

I thought that the filtering for the FileSelector would be using the .gitignore in the working directory of the user, not the .gitignore of gpt-engineer, which ignores the project folder. In that case, I don't see how this is causing #1090 What am I misunderstanding @Styren @similato87

@similato87
Copy link
Collaborator Author

Hi Axel,

I located the area by investigating the filter_by_gitignore function (see link). I thought the cause was that files were being excluded due to our .gitignore settings, specifically line 59 (see link). After removing this line and conducting a test, the previously missing files were successfully included. I thought it might be better to add exception cases before entering the filter function rather than making adjustments in the filtering process so I made this change.

@Styren
Copy link
Collaborator

Styren commented Apr 3, 2024

If I understand correctly it will now ignore gitignore if any dir in the project path contains the part projects/, is that right? This seems like potentially a quite inconvenient footgun should anyone work with a directory structure that isn't this specific projects/-dir.

Wouldn't it be better to configure this somewhere in .gpteng or even placing some empty token file such as .gpt-nofilter in the dirs that should be exempt?

@ATheorell
Copy link
Collaborator

ATheorell commented Apr 4, 2024

Hi Axel,

I located the area by investigating the filter_by_gitignore function (see link). I thought the cause was that files were being excluded due to our .gitignore settings, specifically line 59 (see link). After removing this line and conducting a test, the previously missing files were successfully included. I thought it might be better to add exception cases before entering the filter function rather than making adjustments in the filtering process so I made this change.

@similato87 Thanks to your debugging and with the info that the fix solves the issue, it seems we are now quite certain that the problem indeed relates to line 59 in our gitignore. What I'm raising here is that, if that is the case, the "filter_by_gitignore" function is reading the wrong gitignore file. It should not read the .gitignore of gpt-engineer, it should read the .gitignore of whatever project the user is trying to improve.

@Styren
Copy link
Collaborator

Styren commented Apr 4, 2024

Hi Axel,
I located the area by investigating the filter_by_gitignore function (see link). I thought the cause was that files were being excluded due to our .gitignore settings, specifically line 59 (see link). After removing this line and conducting a test, the previously missing files were successfully included. I thought it might be better to add exception cases before entering the filter function rather than making adjustments in the filtering process so I made this change.

@similato87 Thanks to your debugging and with the info that the fix solves the issue, it seems we are now quite certain that the problem indeed relates to line 59 in our gitignore. What I'm raising here is that, if that is the case, the "filter_by_gitignore" function is reading the wrong gitignore file. It should not read the .gitignore of gpt-engineer, it should read the .gitignore of whatever project the user is trying to improve.

Actually it will aggregate the .gitignore files from the current dir up to the git project root, so the same behavior that you would get if you used git normally. Whether that is what we want is up for debate, but I aimed at keeping the behavior as close to that of git as possible so there's minimal confusion as to how it would impact the file selection list.

One quick/easy solution could be to initialize the projects as an empty git repo?

@ATheorell
Copy link
Collaborator

Ah, now I think I understand the problem. If we imagine this structure:

gpt-engineer
|.gitignore
|
projects
| |_ user_project_root (path on the command line)

2 things can happen:

  1. user_project_root is a git root repo with a gitignore. In this case, the correct gitignore file is read.
  2. user_project_root is not a git repo, in which case gpt-engineer will search for the git root in parent directories, finding the gpt-engineer .gitignore.

Is this what is happening @Styren ?

I think we want to change behavior to never search for gitignore files in folders that are above user_project_root in the file hierarchy. Am I missing something here?

@Styren
Copy link
Collaborator

Styren commented Apr 7, 2024

Ah, now I think I understand the problem. If we imagine this structure:

gpt-engineer |.gitignore | projects | |_ user_project_root (path on the command line)

2 things can happen:

  1. user_project_root is a git root repo with a gitignore. In this case, the correct gitignore file is read.
  2. user_project_root is not a git repo, in which case gpt-engineer will search for the git root in parent directories, finding the gpt-engineer .gitignore.

Is this what is happening @Styren ?

I think we want to change behavior to never search for gitignore files in folders that are above user_project_root in the file hierarchy. Am I missing something here?

Yes that's the behavior, but that's also the standard and expected behavior of git. Propose for example that we are in a js/ts monorepo, we might have a global .gitignore with node_modules, dist, and other standard stuff together with a local for each project, for example to ignore assets or specific build artifacts. If we change this to only use the "top-most" .gitignore then we might suddenly reintroduce a lot of unwanted files into the edit list purely based on the fact that our implementation diverges from the way git normally does it.

We could do the change as you propose it, but I'm not sure I see a good reason for it. Especially if it's for the purpose of fixing the file list for the projects dir.

@ATheorell
Copy link
Collaborator

@captivus @TheoMcCabe do you have some smart input here?
I acknowledge the problems that @Styren sees in only looking at the top level .gitignore. At the same time, also taking into account that many gpt-engineer users are not experienced software engineers, the current behavior implies that .gitignore files, unrelated to the project that the user is working in may be deployed for the file selection, something that is reportedly already causing problems.
Last but not least, making the repo a git repo by default has been proposed and rejected in discussions before, though, maybe we should consider it again.

@similato87 similato87 deleted the hot-fix-for-missing-current-files branch April 25, 2024 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants