
Can someone please elaborate on when to use PolyBase versus bulk insert in Azure Data Factory? What are the differences between these two copy methods?

Indra

1 Answer


The two options labeled “PolyBase” and “COPY command” are only applicable to Azure Synapse Analytics (formerly Azure SQL Data Warehouse). Both are fast load methods that stage the data in Azure Storage (if it is not already there) and then load it from storage into each compute node in a highly parallel fashion. These options are preferred, especially for large tables, because they scale well, but they do come with some restrictions, which are covered in the documentation.

In contrast, on Azure Synapse Analytics a bulk insert is a slower load method that goes through the control node and is not as highly parallel or performant. It is an order of magnitude slower on large files, but it can be more forgiving in terms of data types and file formatting.

On other Azure SQL databases, bulk insert is the preferred and fast method.
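
To make the choice concrete, here is a minimal sketch of how the sink part of a Copy activity might differ by target. The helper `build_sink` and its `target` values are hypothetical names made up for illustration. The property names (`SqlDWSink`, `allowPolyBase`, `allowCopyCommand`, `AzureSqlSink`, `writeBatchSize`) are the ones I believe the Data Factory connectors use, and the batch size is an arbitrary value, so check both against the connector documentation before relying on them.

```python
import json


def build_sink(target: str, use_copy_command: bool = False) -> dict:
    """Return a Copy activity sink fragment (hypothetical helper)."""
    if target == "synapse":
        # Synapse sink: opt in to one of the two parallel load paths.
        sink = {"type": "SqlDWSink"}
        if use_copy_command:
            sink["allowCopyCommand"] = True
        else:
            sink["allowPolyBase"] = True
        return sink
    if target == "azure_sql":
        # Azure SQL Database sink: no PolyBase/COPY switch, so the
        # regular bulk insert path is used. Batch size is an assumed value.
        return {"type": "AzureSqlSink", "writeBatchSize": 10000}
    raise ValueError(f"unknown target: {target}")


if __name__ == "__main__":
    for name in ("synapse", "azure_sql"):
        print(name, json.dumps(build_sink(name)))
```

If neither flag is set on a Synapse sink, the copy falls back to bulk insert, which is the slower path described above.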

Community
GregGalloway