3

I'm wondering how to make PyCharm's VCS integration (i.e. Git) work well with Jupyter Notebook files. Changing even one line of code results in three modifications being detected at commit time (see screenshot). Sorry if this is a duplicate, but I haven't found anything similar.

Anyky Beniky

2 Answers

3

Well, I wouldn't say that versioning of Jupyter Notebook files doesn't work at all. You can see in your own screenshot that your changes are detected. What we don't do is parse the changes so that only the source code changes are shown. And even if we did, many people actually want to track the output: in data science, for example, results are not always reproducible, so you may want to keep track of the output as well as the source.

That said, this could be improved by implementing https://youtrack.jetbrains.com/issue/PY-20132, which would allow committing all of the changes while showing only the source code changes in the diff, so feel free to upvote it and leave comments.
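To illustrate why running a cell produces changes beyond the edited source, here is a minimal sketch, assuming the usual nbformat 4 layout in which each cell stores `source`, `outputs`, and `execution_count` in one JSON file. It is not how PyCharm works internally; `source_only` and the two sample notebooks are hypothetical names made up for this example.

```python
import json


def source_only(notebook_json: str) -> list:
    """Return (cell_type, source) pairs, ignoring outputs and run counters."""
    nb = json.loads(notebook_json)
    cells = []
    for cell in nb.get("cells", []):
        src = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines
        if isinstance(src, list):
            src = "".join(src)
        cells.append((cell.get("cell_type", ""), src))
    return cells


# Two hypothetical versions of the same notebook: identical source,
# but the second one has been re-run, so its run state differs.
before = json.dumps({
    "cells": [{
        "cell_type": "code",
        "execution_count": 1,
        "metadata": {},
        "outputs": [],
        "source": ["print(1)\n"],
    }],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
})
after = json.dumps({
    "cells": [{
        "cell_type": "code",
        "execution_count": 2,
        "metadata": {},
        "outputs": [{"output_type": "stream", "name": "stdout",
                     "text": ["1\n"]}],
        "source": ["print(1)\n"],
    }],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
})

print(before == after)                            # file contents differ
print(source_only(before) == source_only(after))  # source is unchanged
```

A text diff of the two files reports changes even though no code was edited, which is the kind of noise a source-only view would hide.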

Sergey K.
2

I use PyCharm Community Edition. I love how PyCharm integrates with Git and how its VCS view shows diffs visually. For Jupyter notebook files, however, the diff is hard to follow visually, because running a cell introduces various changes.

Notebook files normally diff like plain text files. I use a simple method to make the diff easier to read: I created a new file type under Settings>Editor>File Types for *.ipynb files, enabled matching for all types of brackets, and added a few keywords:

Keyword 1:

"outputs"
"source"

Keyword 2:

"code"
"markdown"

This highlighting also shows up in the PyCharm VCS diff and makes it easy to locate changes in code cells, markdown cells, and outputs. An example of the effect is shown in this screenshot. Now we don't need to worry about changes in the execution count or metadata, since they are easy to skip over visually.
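The keywords above match key names and values in the notebook's JSON. As a rough sketch of what the highlighting helps you spot, assuming nbformat 4 cell dictionaries, the snippet below lists which cell-level keys differ between two versions of a cell. `changed_keys` and the sample cells are hypothetical and are not part of PyCharm or of the file type configuration.

```python
def changed_keys(old_cell: dict, new_cell: dict) -> list:
    """Return the sorted names of cell-level keys whose values differ."""
    keys = set(old_cell) | set(new_cell)
    return sorted(k for k in keys if old_cell.get(k) != new_cell.get(k))


# Hypothetical cell before and after editing one line and re-running it
old_cell = {
    "cell_type": "code",
    "execution_count": 3,
    "metadata": {},
    "outputs": [],
    "source": ["x = 1\n"],
}
new_cell = {
    "cell_type": "code",
    "execution_count": 4,
    "metadata": {},
    "outputs": [{"output_type": "execute_result", "execution_count": 4,
                 "data": {"text/plain": ["2"]}, "metadata": {}}],
    "source": ["x = 2\n"],
}

print(changed_keys(old_cell, new_cell))
# → ['execution_count', 'outputs', 'source']
```

Only the `source` entry reflects the edit itself; the other two come from re-running the cell, which is why highlighting `"source"` and `"outputs"` makes the relevant hunks easier to find.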

  • Welcome to SO! FYI https://intellij-support.jetbrains.com/hc/en-us/community/posts/360004262159-Pycharm-version-control-of-Jupyter-notebooks-?page=1#community_comment_360000626420 – tafaust Apr 20 '20 at 09:58
  • Could you explain how exactly you configured this custom File Type to ignore things such as execution_time or the output of columns? I have added the keywords as you mentioned, but it doesn't change the behaviour of the diff in PyCharm. – Mateusz Dorobek Apr 22 '22 at 12:45