
I am using PyGithub to go through the files in a GitHub repository. The problem is that, starting from `my_code.get_contents("")`, my code walks through every file in every folder and subfolder of the repo. Is there a way to make this more efficient? I am only interested in parsing the .csproj files and the packages.config files, wherever they are found, but these files are scattered across multiple places.

from github import Github
import pathlib
import xml.etree.ElementTree as ET

def processFilesInGitRepo():
    # Walk the repository iteratively; each directory is expanded when it is reached.
    while len(contents) > 0:
        file_content = contents.pop(0)
        if file_content.type == 'dir':
            # One API request per directory.
            contents.extend(my_code.get_contents(file_content.path))
        else:
            path = pathlib.Path(file_content.path)
            file_name = path.name
            extension = path.suffix
            if file_name == 'packages.config':
                parseXMLInPackagesConfig(file_content.decoded_content.decode())

            if extension == '.csproj':
                parseXMLInCsProj(file_content.decoded_content.decode())

            print(file_content)


my_git = Github("MyToken")

my_code = my_git.get_repo("BeclsAutomation/Echo65XPlus")
# An empty path ("") returns the items at the repository root.
# Can I pass some kind of search term here so that I only get the .csproj and packages.config files?
contents = my_code.get_contents("")

processFilesInGitRepo()
nikhil
  • It seems that manually iterating over the files and checking for a specific pattern is the only way. The API that [`get_contents`](https://docs.github.com/en/rest/repos/contents?apiVersion=2022-11-28#get-repository-content) uses internally does not allow you to filter files by pattern. – Abdul Niyas P M Sep 01 '23 at 07:18
  • I understand. Thank you. Do you happen to know whether any other libraries, for instance pythonGit, have these capabilities? – nikhil Sep 01 '23 at 08:00
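
Following up on the comments: filtering still has to happen on the client side, but the number of requests can probably be cut down. A minimal sketch, assuming PyGithub's `Repository.get_git_tree(sha, recursive=True)` is used to list every path in one call instead of one `get_contents` call per directory. The helper names `is_project_file` and `find_project_files` are made up for illustration, and `repo` is assumed to be the object returned by `get_repo`. Note that GitHub documents that very large trees can come back truncated, so this may not cover huge repositories.

```python
from pathlib import PurePosixPath


def is_project_file(path):
    """Return True for packages.config files and *.csproj files."""
    p = PurePosixPath(path)
    return p.name == "packages.config" or p.suffix == ".csproj"


def find_project_files(repo, ref):
    """List matching file paths using a single recursive tree listing.

    `repo` is assumed to behave like a PyGithub Repository;
    `ref` is a branch name, tag name, or commit SHA.
    """
    tree = repo.get_git_tree(ref, recursive=True)
    # Tree entries of type "blob" are files; "tree" entries are directories.
    return [
        element.path
        for element in tree.tree
        if element.type == "blob" and is_project_file(element.path)
    ]
```

Only the matching files then need their content downloaded, for example (not run here, since it needs a real token):

```python
# repo = Github("MyToken").get_repo("BeclsAutomation/Echo65XPlus")
# for path in find_project_files(repo, repo.default_branch):
#     xml_text = repo.get_contents(path).decoded_content.decode()
```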

0 Answers