
I wrote a script with fast-csv that reads a CSV file from Amazon S3 and stores the data in MySQL. I now have an EC2 instance set up, created a folder titled "upload", and placed the CSV file in there. My question is: how do I read a file on the EC2 instance instead of from the S3 bucket? Below is my current script:

  const csv = require('fast-csv');

  // Stream the CSV object from S3
  const s3Stream = s3.getObject(params).createReadStream();

  // First pass: parse with headers so each row is a keyed object
  stream = csv.parseStream(s3Stream, {
      headers: true, skip_blanks: true
  })
      .on("data", data => {
          dataArr.push(data);
      });

  // Second pass: parse the same stream as raw arrays for the INSERT values
  stream = csv.parseStream(s3Stream)
      .on("data", data => {
          dataArr2.push(data);
      })
      .on("end", () => {
          let csvStream = csv
              .parse({ ignoreEmpty: true })
              .on('data', function (row) {
                  myData.push(row);
              })
              .on('end', function () {
                  dataArr2.shift(); // drop the header row

                  console.log('dataArr2 ' + myData);

                  if (dataArr.length > 0) {
                      // Collect the column names from the first keyed row
                      let columnsIn = dataArr[0];
                      for (let key in columnsIn) {
                          headerDatas.push(key);
                      }
                      for (let key in columnsIn) {
                          orginalHeaderDatas.push(key);
                      }

                      // Replace spaces with underscores so headers are valid SQL identifiers
                      for (let i = 0; i < headerDatas.length; i++) {
                          correctHeaderFormat.push(headerDatas[i].split(' ').join('_'));
                      }

                      // Assigns appropriate SQL property to headers
                      let databaseId = headerDatas[0].split(' ').join('_');
                      let leaseDiscription = headerDatas[1].split(' ').join('_');

                      // Removes headers that are not DEC properties
                      headerDatas.shift();
                      headerDatas.shift();

                      let newdatabaseId = databaseId + ' int(25) NOT NULL';
                      let newleaseDiscription = leaseDiscription + ' varchar(255) NULL';

                      // Adds the DEC property to the end of each remaining header
                      for (let i = 0; i < headerDatas.length; i++) {
                          updatedData.push(headerDatas[i].split(' ').join('_') + ' dec(25,2) NULL');
                      }

                      // Adds the removed headers and the primary key back to the updated array
                      let key = 'PRIMARY KEY (Database_ID)';
                      headersWithProperties.push(updatedData);
                      headersWithProperties.unshift(newleaseDiscription);
                      headersWithProperties.unshift(newdatabaseId);
                      headersWithProperties.push(key);
                  } else {
                      console.log('No columns');
                  }

                  // Open the connection and run the CREATE TABLE / INSERT
                  connection.connect((error) => {
                      if (error) {
                          console.error(error);
                      } else {
                          let createTable = 'CREATE TABLE `CD 1` (' + headersWithProperties + ')';
                          let insertData = 'INSERT INTO `CD 1` (' + correctHeaderFormat + ') VALUES ?';

                          // Create the table
                          connection.query(createTable, (error, response) => {
                              console.log(error || response);
                          });

                          // Insert the parsed rows
                          connection.query(insertData, [dataArr2], (error, response) => {
                              console.log(error || response);
                          });
                      }
                  });
              });

          stream.pipe(csvStream);
      });
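For reference, the header-to-column-definition step in the script above can be sketched in isolation. This is a minimal standalone version of the same logic (the header names below are hypothetical examples, not from the actual file):

```javascript
// Turn CSV headers into MySQL column definitions, mirroring the script's logic:
// first column becomes an int ID, second a varchar description, the rest decimals,
// and the ID column is used as the primary key.
function buildColumnDefs(headers) {
    const cols = headers.map(h => h.split(' ').join('_')); // spaces -> underscores
    return [
        cols[0] + ' int(25) NOT NULL',
        cols[1] + ' varchar(255) NULL',
        ...cols.slice(2).map(c => c + ' dec(25,2) NULL'),
        'PRIMARY KEY (' + cols[0] + ')'
    ];
}

const defs = buildColumnDefs(['Database ID', 'Lease Description', 'Rent Amount']);
console.log('CREATE TABLE `CD 1` (' + defs.join(', ') + ')');
// CREATE TABLE `CD 1` (Database_ID int(25) NOT NULL, Lease_Description varchar(255) NULL, Rent_Amount dec(25,2) NULL, PRIMARY KEY (Database_ID))
```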
mruanova
ddobson001

1 Answer


If I understand your question correctly, you are trying to read CSV files that are local (on the same machine as your Node.js and MySQL) instead of from an S3 bucket. Don't use the s3 variable to get the CSV file; read it locally instead:

const fs = require('fs');
const localStream = fs.createReadStream('/path/to/upload/data.csv');

and then you can parse it into the MySQL database using the same method as before. It would look something like this:

Yehuda Clinton
  • Ok, so if my understanding is correct, the EC2 creates a virtual copy of my desktop space, so really I just need to have the path read locally? Sorry, I know I sound like an idiot — just new to this. – ddobson001 Nov 01 '19 at 17:04
  • It's saying no such file or directory, open 'C:\path\to\' — it keeps referencing the C drive, not the instance in which the folder was created, so the path in EC2 is /home/ec2-user/upload. – ddobson001 Nov 01 '19 at 17:49
  • I think you might be confusing Linux and Windows. On Linux it is, like you said, /home/ec2-user/upload. – Yehuda Clinton Nov 03 '19 at 00:15