4

I am working on a project where I am using a shape file to make a choropleth map of the United States. To do this, I downloaded the standard shape file here from the US Census Bureau. After a little bit of cleaning up (there were some extraneous island territories which I removed by changing the plot's axis limits), I was able to get the contiguous states to fit neatly within the bounds of the matplotlib figure. For reference, please see Edit 4 below.

Edit 1: I am using the cb_2018_us_state_500k.zip [3.2 MB] shape file.

The only problem now is that by setting axis limits I now am no longer able to view Alaska and Hawaii (as these are obviously cut out by restricting the axis limits). I would now like to add both of these polygons back in my map but now towards the lower part of the plot figure (the treatment that is given by most other maps of this type) despite its geographical inaccuracy.

To put this more concretely, I am interested in selecting the polygon shapes representing Alaska and Hawaii and moving them to the lower left hand side of my figure. Is this something that would be possible?

I can create a Boolean mask using:

mask = df['STUSPS'] == 'AK'

to get this polygon for this state by itself; however, I am a little bit stuck now on how to move it/reposition it once selected.

Edit 2: Since each state is represented by a geometry dtype, could I just apply a transformation to each point in the polygon? For Alaska the geometry column shows:

27    MULTIPOLYGON (((179.48246 51.98283, 179.48656 ...
Name: geometry, dtype: geometry

Would say multiplying each number in this list by the same constant accomplish this?

I would like to place Alaska down in the lower left somewhere around the (-125, 27) region and Hawaii next to it at around (-112, 27).

Edit 3:

My code:

import geopandas as gpd
import matplotlib.pyplot as plt

# import the United States shape file 
df = gpd.read_file('Desktop/cb_2018_us_state_500k/cb_2018_us_state_500k.shp')

# exclude the values that we would not like to display
exclude = df[~df['STUSPS'].isin(['PR', 'AS', 'VI', 'MP', 'GU','AK'])]

# create a plot figure
fig, ax = plt.subplots(1, figsize=(20, 10))

exclude.plot(column="NAME", ax=ax)
_ = ax.set_xlim([-130, -64])
_ = ax.set_ylim([22, 53])

Example figure that I have right now:

enter image description here

Any insight or links to resources, explanations, or examples would be greatly appreciated.

Note: Technically, I could obviate this by simply using a shape file that already has Alaska and Hawaii in this location, such as the one provided by the Department of Interior; however, this would not work if I wanted to say add Guam or Puerto Rico.

Desired Outcome:

Edit 4: What I am looking to do is something similar to this question, but in Python and not R.

enter image description here

Image source: Murphy

Ethan
  • 876
  • 8
  • 18
  • 34

2 Answers2

1

You could do something like this. You will have to find the right offsets to position Alaska where you want it to be exactly.

Now, you have the following dataframe:

 STATEFP   STATENS     AFFGEOID GEOID STUSPS            NAME LSAD  \
0      28  01779790  0400000US28    28     MS     Mississippi   00   
1      37  01027616  0400000US37    37     NC  North Carolina   00   
2      40  01102857  0400000US40    40     OK        Oklahoma   00   

          ALAND       AWATER  \
0  121533519481   3926919758   
1  125923656064  13466071395   
2  177662925723   3374587997   

                                            geometry  
0  MULTIPOLYGON (((-88.50297 30.21523, -88.49176 ...  
1  MULTIPOLYGON (((-75.72681 35.93584, -75.71827 ...  
2  POLYGON ((-103.00257 36.52659, -103.00219 36.6...  

You can extract Alaska the way you did, or something similar:

USA_ALS = USA[USA.STUSPS=='AK']

From there you do the following:

USA_ALS['geometry'] = USA_ALS.geometry.apply(lambda x: shapely.affinity.translate(x, xoff=0, yoff=-100))

which translates the geometry along the x and y axis.

Remove the old Alska from the original df and concatenate with the new one:

USA = USA[USA.STUSPS!='AK']
New_USA = pd.concat([USA,USA_ALS])
New_USA.geometry.plot()

This gives:

enter image description here

``

  • Thank you for your response. I think I follow what you are doing here. How would we scale the polygon down too? – Ethan Sep 22 '21 at 23:20
  • For scaling, use GeoSeries.scale(xfact=1.0, yfact=1.0, zfact=1.0, origin='center'). Example ```geo_df.geometry = df.geometry.scale(xfact=1/10000, yfact=1/10000, zfact=1.0, origin=(0, 0))``` . Do this before moving the polygon. – Serge de Gosson de Varennes Sep 23 '21 at 02:59
  • Thank you, this worked. Quick question though - its seems like most maps like this have Alaska looking more "squished." Is the longer Alaska look here an artifact of this particular shape file? – Ethan Sep 23 '21 at 05:04
  • Yes it is. But using scaling you may restore the shape. – Serge de Gosson de Varennes Sep 23 '21 at 05:34
  • Would you recommend scaling only on the x axis? – Ethan Sep 23 '21 at 16:09
1

You can simply achieve the end goal with ax.inset_axes(), and then specify different longitude / latitude bounds on the inset plots.

Here’s a simple way of doing that:

# import the United States shape file
df = gpd.read_file('cb_2018_us_state_500k/cb_2018_us_state_500k.shp')

# set state code as index, exclude states that we will never display
df = df.set_index('STUSPS').drop(index=['PR', 'VI', 'MP', 'GU', 'AS'])

# create an axis with 2 insets − this defines the inset sizes
fig, continental_ax = plt.subplots(figsize=(20, 10))
alaska_ax = continental_ax.inset_axes([.08, .01, .20, .28])
hawaii_ax = continental_ax.inset_axes([.28, .01, .15, .19])

# Set bounds to fit desired areas in each plot
continental_ax.set_xlim(-130, -64)
continental_ax.set_ylim(22, 53)

alaska_ax.set_ylim(51, 72)
alaska_ax.set_xlim(-180, -127)

hawaii_ax.set_ylim(18.8, 22.5)
hawaii_ax.set_xlim(-160, -154.6)

# Plot the data per area - requires passing the same choropleth parameters to each call
# because different data is used in each call, so automatically setting bounds won’t work
vmin, vmax = df['ALAND'].agg(['min', 'max'])
df.drop(index=['HI', 'AK']).plot(column="ALAND", ax=continental_ax, vmin=vmin, vmax=vmax)
df.loc[['AK']].plot(column="ALAND", ax=alaska_ax, vmin=vmin, vmax=vmax)
df.loc[['HI']].plot(column="ALAND", ax=hawaii_ax, vmin=vmin, vmax=vmax)

# remove ticks
for ax in [continental_ax, alaska_ax, hawaii_ax]:
    ax.set_yticks([])
    ax.set_xticks([])

Here’s the result, with as you can see the colour proportional to each state’s land mass:

enter image description here

Cimbali
  • 11,012
  • 1
  • 39
  • 68
  • Incredible solution, especially the idea of using inset_axes() is perfect. Thank you for taking the time to provide this detailed solution. Just one quick clarification, are inset plots related to the original geodataframe or are they separate? For example, if we append a column to the geodataframe representing population, would plotting this data map the values for Alaska and Hawaii to their respective inset plots? – Ethan Oct 03 '21 at 16:28
  • 1
    You’re plotting the same geodataframe on the inserts and on the main plot @Ethan. It’s a 3 separate `.plot()` calls, so if you add a population column, just make sure to use `name='population'` in all 3 plot commands (and in the min/max calculation) and you’re good. – Cimbali Oct 03 '21 at 16:47