2

I have an azure synapse workspace that contains a number of pipelines & external tables in the serverless sql pool. all associated with one particular project.

There are another 2-3 completely separate projects on the way that will require a synapse toolset.

Should i create a new workspace, or allow them all to share this one? What is the best criteria to use to decide?

wilson_smyth
  • 1,202
  • 1
  • 14
  • 39
  • 1
    It's totally a business decision, so I doubt there is any real guidance on this. The best delineator is probably security. If the separate projects have different users, access controls, storage accounts, etc., that's probably a good way to decide. – Joel Cochran Jul 20 '21 at 13:23

1 Answers1

5

This is probably a bit of an opinion question which don't tend to do that well on StackOverflow, but that said, I tend to think of Synapse Workspaces as similar to an instance of SQL Server, so historically, why would you have used the same SQL instance?

Generally this was where projects have things have in common, eg same data, similar permissions (AAD) groups, similar HADR requirements etc, so ask yourself those questions.

Bear in mind you can have multiple databases (dedicated and serverless) within a workspace but cross database queries for tables in a dedicated sql pool are only possible via Spark Pools1. This could work in your favour if you require separation. Also bear in mind you can connect multiple storage accounts to the workspace. There is no cost overhead to having multiple workspaces, but there is an admin overhead and there would be a cost implication to duplicating any of your data across multiple lakes, storage accounts and databases.

One example - we're using workspaces for environments for example where there aren't separate dev, test, uat Azure subscriptions.

So a few things to consider.


1 import the two tables as dataframes then join them in a Synapse notebook as per this example

wBob
  • 13,710
  • 3
  • 20
  • 37
  • As always, solid answer @wBob. I would add that cross-database queries are (basically) possible via Notebooks, but that's the only environment that can really reach across those boundaries. External tables and Spark tables can extend some of those functions into T-SQL to an extent. – Joel Cochran Jul 20 '21 at 13:32
  • lol @JoelCochran I know, we've been through the same pain, I mean I could add as a note eg extend [this one](https://stackoverflow.com/a/66546617/1527504), but it's kind of by-the-by for this answer I think. I know a lot of people frown up upon these opinion ones and it might end up getting closed, but someone is asking for help, some input, so I'll throw in my two-penneth. Maybe it's worth extending that list of criteria which might genuinely be a useful thing - we're using workspaces for environments for example where there aren't separate dev, test, uat Azure subscriptions. – wBob Jul 20 '21 at 13:50
  • great answer @wBob, thank you for the measured reasoned reply. – wilson_smyth Jul 21 '21 at 14:25