
I find that there's very little documentation on how to extract SAP tables into R. I'm not talking about SAP HANA.

Currently I have to extract the SAP tables manually through the SAP GUI and export them to a tabular format. Only then can I import them with my R script, which is very cumbersome.

The solution I'm currently exploring is to have my SAP colleagues export those SAP tables into a SQL database, which I can then query from R.

Ideally, I want to cut out this seemingly unnecessary step of exporting the SAP tables into a database.

Sandra Rossi
Afiq Johari
  • Have you tried to look for similar questions: [0](https://stackoverflow.com/questions/59868494/importing-sap-tables-to-r), [1](https://stackoverflow.com/questions/53538073/effective-way-to-write-table-to-sap-hana-from-r/53708395#53708395), [2](https://stackoverflow.com/questions/56181897/how-to-import-a-table-from-hana-to-r-using-odbc), [3](https://stackoverflow.com/questions/42122719/read-row-store-sap-hana-data-to-r-using-odbc-connection) or [this](https://blogs.sap.com/2019/04/09/machine-learning-with-sap-hana-from-r/)? RODBC does not require HANA – Suncatcher Feb 05 '20 at 08:23
  • @Suncatcher the required tables are not from HANA, they're from SAP ECC. – Afiq Johari Feb 07 '20 at 04:57

2 Answers

1

For SAP R/3 systems (or what you call ECC), your best bet would be executing remote function calls (i.e. RFC).

Normally these would be supported by open source interfaces for at least the more recent versions (e.g. 4.6 or above).

However, such interfaces are fairly scarce, and I know of only one implementation in R: RSAP. You'd also need to download the NW RFC SDK, and there may be further requirements depending on your OS (e.g. which Visual C++ version you'd need on Windows).

There's also a slightly more widely recognised equivalent in Python, PyRFC.
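To make the RFC route more concrete, here is a minimal sketch of reading a table through PyRFC with the standard `RFC_READ_TABLE` function module. The helper names, the connection parameters, and the example table and fields (`MARA`, `MATNR`, `MTART`) are illustrative assumptions, not something from your system; `pyrfc` is imported lazily because it only works once the NW RFC SDK is installed.

```python
from typing import Any, Dict, List, Optional


def build_read_table_params(table: str, fields: List[str],
                            where: Optional[str] = None,
                            max_rows: int = 0) -> Dict[str, Any]:
    """Build the keyword arguments for an RFC_READ_TABLE call."""
    params: Dict[str, Any] = {
        "QUERY_TABLE": table,
        "DELIMITER": "|",
        "FIELDS": [{"FIELDNAME": name} for name in fields],
        "ROWCOUNT": max_rows,  # 0 means no row limit
    }
    if where:
        # Each OPTIONS row is 72 characters wide; a longer WHERE clause
        # would have to be split over several rows (not handled here).
        params["OPTIONS"] = [{"TEXT": where}]
    return params


def parse_read_table_result(result: Dict[str, Any],
                            delimiter: str = "|") -> List[Dict[str, str]]:
    """Turn the FIELDS/DATA tables returned by RFC_READ_TABLE into dicts.

    Every row arrives as one delimited string in the WA column, so a value
    that itself contains the delimiter would be split incorrectly.
    """
    names = [field["FIELDNAME"] for field in result["FIELDS"]]
    rows = []
    for line in result["DATA"]:
        values = [value.strip() for value in line["WA"].split(delimiter)]
        rows.append(dict(zip(names, values)))
    return rows


def read_table(conn_params: Dict[str, str], table: str, fields: List[str],
               where: Optional[str] = None,
               max_rows: int = 0) -> List[Dict[str, str]]:
    """Open an RFC connection, call RFC_READ_TABLE and parse the result."""
    from pyrfc import Connection  # lazy import: requires the NW RFC SDK

    conn = Connection(**conn_params)
    try:
        result = conn.call(
            "RFC_READ_TABLE",
            **build_read_table_params(table, fields, where, max_rows),
        )
    finally:
        conn.close()
    return parse_read_table_result(result)
```

A call would look roughly like `read_table({"ashost": "...", "sysnr": "00", "client": "100", "user": "...", "passwd": "..."}, "MARA", ["MATNR", "MTART"], max_rows=10)`. All values come back as strings, so numeric and date columns still need converting before they are handed to R, for example through a CSV file.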

Alternatively, you could try Robotic Process Automation (RPA) to drive the GUI in an automated way. UiPath is one option, but there are others. That way you could automate the table extraction and also call your R scripts directly from the RPA tool.

Overall, to be honest, extracting the tables into a separate database does seem to be the best alternative (compared to what I've described above).

Note: The above presumes that, for whatever reason (usually security), you cannot access the database underlying ECC directly through ODBC calls. If you can, the instructions for connecting and running SQL from R are the same as for HANA or similar.

arg0naut91
  • Thanks for sharing about RSAP and PyRFC. My concern with having a separate database is data duplication. It's also possible that end users will refer to the latest data in SAP while my models are based on lagged data from the database, unless I require the database to always be an exact reflection of the data in SAP. – Afiq Johari Feb 05 '20 at 02:26
  • Yeah, it depends on how up to date your reports need to be. If it is daily, you could have a master dump scheduled every night, if it is hourly I imagine your SAP colleagues may not be so happy about it. Note that the above presumes you cannot access the database underlying ECC directly through ODBC - if this is not the case, then you can connect to it through R similarly as for HANA. – arg0naut91 Feb 05 '20 at 10:19
  • What do you mean by 'database underlying ECC'? Do you mean the 'actual' underlying database could be i.e MySQL, Postgres,etc? My understanding is SAP ECC IS the database. – Afiq Johari Feb 07 '20 at 04:59
  • @AfiqJohari `My understanding is SAP ECC IS the database` your understanding is wrong. The underlying database can be HANA, MySQL, MaxDB, MSSQL, Oracle; Postgres is rare but does exist too. – Suncatcher Feb 07 '20 at 06:29
  • Also beware of the table readers' (RFC_READ_TABLE/BBP_RFC_READ_TABLE) [limitations](https://rfcconnector.com/documentation/kb/0007/): you cannot retrieve all columns, you cannot retrieve rows wider than 512 chars, you cannot correctly retrieve tables with FLOAT columns, etc. – Suncatcher Feb 07 '20 at 06:34
  • And PyRFC also has some [bugs](https://github.com/SAP/PyRFC/issues/97) and limitations, inherent to RFC – Suncatcher Feb 07 '20 at 06:38
  • Indeed, SAP ECC is not the database, listen to @Suncatcher he's giving good advice. It is common that companies don't give direct access to the DB though and this is what guided my reply. – arg0naut91 Feb 07 '20 at 08:10
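One common way to live with the 512-character row limit mentioned in the comments is to read a wide table in several passes, each requesting the key fields plus a subset of the other columns, and to join the passes on the key afterwards. The sketch below only plans those passes. The function name is hypothetical, the field lengths are assumed to be known already (for instance from the data dictionary), and counting one delimiter character between neighbouring fields toward the limit is an assumption.

```python
from typing import Dict, List


def batch_fields(field_lengths: Dict[str, int], key_fields: List[str],
                 max_width: int = 512,
                 delimiter_width: int = 1) -> List[List[str]]:
    """Split a table's fields into batches that each fit into max_width.

    Every batch starts with the key fields so the separately read
    batches can be joined on the key afterwards.
    """
    def width(names: List[str]) -> int:
        # Field lengths plus one delimiter between neighbouring fields.
        gaps = max(len(names) - 1, 0)
        return sum(field_lengths[n] for n in names) + delimiter_width * gaps

    batches: List[List[str]] = []
    current = list(key_fields)
    for name in field_lengths:
        if name in key_fields:
            continue
        if width(list(key_fields) + [name]) > max_width:
            # This field can never be read together with the key fields.
            raise ValueError(f"field {name!r} does not fit next to the key fields")
        if width(current + [name]) > max_width:
            # Current batch is full: close it and start a new one.
            batches.append(current)
            current = list(key_fields)
        current.append(name)
    batches.append(current)
    return batches
```

Each batch would then be sent as the `FIELDS` list of its own `RFC_READ_TABLE` call. Because the passes run at different times, rows can change between them, so this suits slowly changing tables better than busy transactional ones.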
1

Consider using RODBC. This package lets you add different ODBC sources and use them in RStudio.

Follow this article and don't be put off by the word "HANA": this approach works with any database, not only HANA.
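In R the RODBC calls involved would be `odbcConnect()` and `sqlQuery()`. For readers scripting the extraction outside R, the same pattern can be sketched in Python against any DB-API connection, which is what `pyodbc.connect("DSN=...")` returns. The function name, the DSN, and the table and column names are hypothetical, and this assumes you are actually granted direct access to the database underlying SAP.

```python
import re
from typing import Any, Dict, List, Optional

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def fetch_table(conn: Any, table: str, columns: List[str],
                limit: Optional[int] = None) -> List[Dict[str, Any]]:
    """Read selected columns of one table through a DB-API connection."""
    # Identifiers cannot be bound as query parameters, so validate them
    # before placing them into the SQL text.
    for identifier in [table, *columns]:
        if not _IDENTIFIER.fullmatch(identifier):
            raise ValueError(f"unsafe identifier: {identifier!r}")

    sql = f"SELECT {', '.join(columns)} FROM {table}"
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        names = [column[0] for column in cursor.description]
        rows = cursor.fetchmany(limit) if limit else cursor.fetchall()
        return [dict(zip(names, row)) for row in rows]
    finally:
        cursor.close()
```

With pyodbc this might be used as `fetch_table(pyodbc.connect("DSN=my_sap_db"), "MARA", ["MATNR", "MTART"], limit=100)`, where `my_sap_db` stands for whatever ODBC source you configured.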

Suncatcher