I have the following buildspec.yml file:

version: 0.2

env:
  parameter-store:
    s3DestFileName: "/CodeBuild/s3DestFileName"
    s3SourceFileName: "/CodeBuild/s3SourceFileName"
    imgFileName: "/CodeBuild/imgFileName"
    imgPickleFileName: "/CodeBuild/imgPickleFileName"
        
phases:
  install:
    on-failure: ABORT
    runtime-versions:
      python: 3.7
    commands:
      - echo Entered the install phase. Downloading new assets to /tmp
      - aws s3 cp s3://xxxx/yyy/test.csv /tmp/test.csv
      - aws s3 cp s3://xxxx/yyy/test2.csv /tmp/test2.csv
      - ls -la /tmp
      - curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python3 -
      - export PATH=$PATH:$HOME/.poetry/bin
      - poetry --version
      - cd ./create-model/ && poetry install
  build:
    on-failure: ABORT
    commands:
      - echo Entered the build phase...
      - echo Build started on `date`
      - ls -la
      - poetry run python3 knn.py

I'm using Poetry to manage all my packages, and the build does not produce any artifacts.

This is part of the knn.py file:

import pandas as pd

print("started...")

df = pd.read_csv('/tmp/n1.csv', index_col=False)

print("df read...")
print(df.head())

I don't see any errors in the logs. The install phase runs fine, and the build phase does invoke knn.py, but after waiting almost 30 minutes the only output in the log is "started..."

I don't see any of the later print statements in the log, so the script is probably not progressing any further. I've tried different AWS managed images, but the result is the same.

This code runs perfectly fine if I run it locally on my machine.

Edit: I used the advanced build override and connected to the container using SSM. I installed pandas locally, ran read_csv(), and it worked. However, the command poetry run python3 knn.py from buildspec.yml still hangs.
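
One way to tell a script that is genuinely stuck from one whose output is only buffered is to flush every message and arm a delayed stack dump. The following is a minimal sketch, not part of the original knn.py: the helper names `log`, `describe_file`, and `load_with_diagnostics`, and the 120-second timeout, are assumptions made for illustration.

```python
import faulthandler
import os
import sys
import time


def log(message, stream=None):
    """Write a timestamped message and flush it right away."""
    if stream is None:
        stream = sys.stdout
    stream.write(f"[{time.strftime('%H:%M:%S')}] {message}\n")
    stream.flush()


def describe_file(path):
    """Return the file size, or a note that the file is missing."""
    if not os.path.exists(path):
        return f"{path}: missing"
    return f"{path}: {os.path.getsize(path)} bytes"


def load_with_diagnostics(path, timeout_seconds=120):
    """Read a CSV with pandas, logging each step along the way.

    If the function is still running after timeout_seconds (an arbitrary
    value), faulthandler writes every thread's stack to stderr without
    killing the process.
    """
    faulthandler.dump_traceback_later(timeout_seconds, exit=False)
    try:
        log("started...")
        log(describe_file(path))

        # Imported here so that a slow import is visible between log lines.
        import pandas as pd

        log("pandas imported")
        df = pd.read_csv(path, index_col=False)
        log("df read...")
        log(df.head())
        return df
    finally:
        # Disarm the pending stack dump once the read has finished or failed.
        faulthandler.cancel_dump_traceback_later()
```

In knn.py this would stand in for the bare read, for example `df = load_with_diagnostics('/tmp/n1.csv')`. If the log stops after a particular message, the stack dump on stderr should show which call is blocking. Running the interpreter with `-u` has a similar unbuffering effect on stdout.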

sagar1025
  • I would start simplifying `knn.py` to find the issue. It must be hanging on something in the file. Maybe you could start commenting out sections of it, from most "heavy" to the least, starting with `df = pd.read_csv('/tmp/n1.csv', index_col=False)`. – Marcin May 25 '21 at 04:26
  • Is it worth reading the file in chunks? Do you think it would fix the issue? – sagar1025 May 25 '21 at 12:26
  • I tried reading in chunks. It works fine locally, but in CodeBuild it seems like it's not even running the file. All I see in the build log is `[Container] 2021/05/25 20:05:57 Running command python knn.py` and that's it. No other print statements are displayed. – sagar1025 May 25 '21 at 20:10
  • If you are using CodePipeline, you should be able to click on details in the console and tail the build logs. – foobar8675 Aug 23 '21 at 22:13
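
The chunked read discussed in the comments could look roughly like the sketch below. It is only an illustration: `read_csv_in_chunks` and the `chunk_rows` default are hypothetical names and values, while `chunksize` is the pandas `read_csv` parameter that makes it yield DataFrames piece by piece. Printing after each chunk with `flush=True` shows whether the read is making any progress at all.

```python
def read_csv_in_chunks(path, chunk_rows=50_000):
    """Read a CSV in chunks, report progress per chunk, return one DataFrame."""
    # Imported lazily so a slow import does not hide behind the first message.
    import pandas as pd

    chunks = []
    total_rows = 0
    # With chunksize set, read_csv yields DataFrames of at most chunk_rows rows.
    reader = pd.read_csv(path, index_col=False, chunksize=chunk_rows)
    for number, chunk in enumerate(reader, start=1):
        total_rows += len(chunk)
        print(f"chunk {number}: {total_rows} rows read so far", flush=True)
        chunks.append(chunk)

    if not chunks:
        # Nothing was yielded, so return an empty frame instead of failing.
        return pd.DataFrame()
    return pd.concat(chunks, ignore_index=True)
```

If not even the first chunk message appears in the build log, the problem is more likely in starting the script or opening the file than in the size of the data.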

0 Answers