0

Over the past 20 years a bunch of wildlife counts have been conducted in a whole bunch of places across a country. When a species is detected, the amount of individuals is written down along with the location, year etc. If a species is not detected, it is not included in the count (not even as 0 of this species detected).

I have two datasheets. One with all the count data (i.e. how many of each species were seen at which location each year) and a sheet with where and when each count was conducted. I think to run any meaningful stats on this dataset I would need to add a 0 count to every year and location when a species wasn't detected when it had been detected at this location in different years.

For example if I had a dataset that ran from 2003-2008 and Species A was detected at Location B in 2004, 2005, 2006, and 2008, I would like to add a count of 0 for species A at Location B in 2003 and 2007.

Is there an automatic way to do this? My data set is about 34000 rows so doing this manually would probably take weeks and be prone to error.

Essentially I'm wondering if there is a way to take all the counts for each year/species/location combo and compare them to my datasheet detailing each year and location a count had taken place then for each location/year not found in the year/species/location combo add a row with the year/species/location and a count of 0. Ideally this would only be done for places where the species had been detected at some point over the 20 years of data. Some species will never be found in certain areas so it's unnecessary to include a count for these places for those species.

Any help would be appreciated.

user3403745
  • 83
  • 1
  • 2
  • 7

2 Answers2

2

This can be done without adding fake data. Set up a table for year, another table for location and one for species. Add these three dimension tables and the count table to the Excel data model.

Create relationships between the count table and the dimension tables.

enter image description here

Then build a pivot table with the year, species and location from the dimension tables instead of the count table. In the pivot table options, on the Display tab, select to show items with no data on rows and columns. Now all the years, species and locations will show in the pivot table. If a combination does not have a count, it will show empty.

enter image description here

teylyn
  • 34,374
  • 4
  • 53
  • 73
  • nice answer there + – QHarr Dec 16 '18 at 19:43
  • Beg your pardon, but what's the point of creating separate tables for year, location and species when you use *same* values in `tblCount` as in those tables? – JohnyL Dec 16 '18 at 19:50
  • Hey thanks so much for helping out. I was able to replicate that with all my data and have a nice table and can see where and when species are missing. What I'm hoping to end up with is a table back in the original form (with location/year/species etc. as headers) where each species has a row for each location and year it wasn't detected with a count value of 0. Is there a way to take this pivot table, identify where empty cells are (ideally only when other cells for this location/species have a value during another year) and input it back into my original table? – user3403745 Dec 16 '18 at 20:38
  • @JohnyL Using the dimension tables in the pivot table enables you to show rows with no data. If you used the fact table for these fields, it would only show you rows with data. Also, this is a very simplified example. IRL the dimension tables would hold much more information and the fact tables would have just a key. For example, you could have the locations grouped into regions. Then you can create a pivot by region, even if you don't have the region value in the fact table. – teylyn Dec 16 '18 at 23:41
  • In case anyone is wondering how to get the rows into a normal-looking table I found the solution. So follow teylyn's advice but put species, location, and year into rows with numbers in values. Put the pivot table into tabular form. Follow teyln's picture about adding items with no data in rows/columns. Now you have a chart showing all missing values. If you only want to take empty rows from locations where the species was seen at one point then you have create an IF statement to identify these locations (I had mine show up as 1 if it was a new place with a sum of values > 0) then use a macro. – user3403745 Dec 17 '18 at 20:15
1

There might be a better solution to this but you can identify & replace the blank cells in a selected area using 'Go to' function (shortcut: CTRL+G).

  1. You'll need to take a cell containing zero (0) and copy it.
  2. select the entire data block where you wish to replace blank cells with zeroes.
  3. Press CTRL+G > select 'Blanks' > OK. This will highlight all Blank cells in the selected area.
  4. Then RIGHT CLICK on any one highlighted cell > Paste Special > Values (V).

This would replace all the blanks with zeroes in your original sheet. You can then import this data wherever you want.

ParvBanks
  • 1,316
  • 1
  • 9
  • 15