
I have frequency counts that correspond to a fixed number of states of the world.

Data=

S <- c("a","b","c","d","e")
n <- c(1,2,3,4,5)
df <- data.frame(S, n)

I want to create values that correspond to the n value for each state, named with the relevant subscript:

Pa = n for state a, Pb = n for state b, etc.

Even though I can go:

Pa <- 1
Pb <- 2

I will be using a lot of different data frames that use the same states but that will yield different values of n each time.

I fear that this is a horribly basic question but what can I do to create a Pa value for every possible n that lines up with state a?

Gilrob
  • Can you clarify more what you're looking for? `df$states = paste0("P",df$S)`? – Jon Spring Jun 15 '22 at 23:43
  • I'm wary about any approach in R that relies on a bunch of free-floating variables; it's probably going to be more robust to put the values you want to reference into a list or a data frame. But it's unclear in the question how you want to use these or how they're supposed to be generated. What tells us what value of `n` goes with what states? – Jon Spring Jun 16 '22 at 00:06
  • Sorry @JonSpring, I am not sure how to respond to the first question. I am working from a data set of about 6000 observations at the moment that has a column S for states of the world. From that I have generated a standalone frequency table for the S column, so n is just the frequency of each state in the original df. The issue I have is that the df will change and I want to have a sort of generalized code! – Gilrob Jun 16 '22 at 00:29
  • Please have another go at editing the question to clarify what information you have, what result you want, and what logic gets you there. – Jon Spring Jun 16 '22 at 01:57

1 Answer


Let's try a simple example to see if it is what you are trying to do:

set.seed(42)
states <- sample(letters[1:5], 25, replace=TRUE)
tbl <- table(states)
tbl
# states
# a b c d e 
# 7 6 2 5 5 
states.df <- data.frame(tbl)
states.df
#   states Freq
# 1      a    7
# 2      b    6
# 3      c    2
# 4      d    5
# 5      e    5

The result is a data frame showing the frequency for each state. You can easily reference these values in your code without creating variables for each row, but it is possible to do so:

ls()
# [1] "states"    "states.df" "tbl"    
for (i in seq_len(nrow(states.df))) assign(paste0("P", states.df[i, 1]), states.df[i, 2])
ls()
# [1] "i"         "Pa"        "Pb"        "Pc"        "Pd"        "Pe"        "states"    "states.df"
# [9] "tbl"
Pa
# [1] 7
Pe
# [1] 5
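
For the first option mentioned above (referencing the values without creating a variable for each row), a minimal sketch is to look the frequencies up by state name. The name `freq` is introduced here purely for illustration and is not part of the original code.

```r
# Rebuild the example table from the answer
set.seed(42)
states <- sample(letters[1:5], 25, replace = TRUE)
tbl <- table(states)

# A one-dimensional table can be indexed by state name directly
tbl["a"]

# Or convert it to a plain named vector ('freq' is a hypothetical name)
freq <- setNames(as.vector(tbl), names(tbl))
freq[["a"]]        # frequency of state a
freq[c("a", "e")]  # several states at once

# The same lookup also works on the data frame version
states.df <- data.frame(tbl)
states.df$Freq[states.df$states == "a"]
```

Because the lookup goes through the state name, the same line of code keeps working when `tbl` is rebuilt from a different data set.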
dcarlson
  • Hi @dcarlson, thanks for the reply! I have no trouble getting this far, what I want to do is to create values Pa, Pb, [...] Pe that refer to the frequency of each state A, B [...] E but that can be applied to a variety of different dataframes. I am trying to build a general function which calls on Pa, Pb, etc., as part of a pipe that first generates the frequency table, then applies the function. What I am hoping for is that I will be able to simply change the df that the pipe (and therefore function) are working on and generate the different results without having to rewrite each time – Gilrob Jun 16 '22 at 04:24
  • So call the frequency of state A Pa: for df1 Pa = 7, for df2 Pa = 8, for df3 Pa = 4. Assuming that the bones of the frequency table object are the same for all dfs and only the actual frequencies change, I want to create a Pa that always refers to the same cell for every df. – Gilrob Jun 16 '22 at 04:30
  • I agree with @JonSpring. If you would describe what you are trying to accomplish instead of how you want to do it, I'm sure there is a simpler, more efficient way of getting there. – dcarlson Jun 16 '22 at 14:07
  • I have added a way to create the separate variables, Pa, Pb, etc as you requested. – dcarlson Jun 16 '22 at 15:54
  • "I want to create a Pa that always refers to the same cell for every df"? Still unclear to me whether the state frequencies are global assumptions (in which case they should live in a table) or are specific to each source df and derived from the df (in which case that could be a table scoped within a function). – Jon Spring Jun 16 '22 at 17:18
  • "create a Pa that always refers to the same cell for every df" --- is this a lookup from an excel file (where "cell" is more common terminology)? – Jon Spring Jun 16 '22 at 17:21
  • I do not know what the OP is trying to accomplish. The code I provided could be wrapped into a function that takes the `df` as an argument and then creates the P variables from that particular `df`. This might make sense if code already exists referencing those variables and revising the code to make it more flexible would be difficult. – dcarlson Jun 16 '22 at 21:12
  • @dcarlson "You can easily reference these values in your code without creating variables for each row": can you please explain this to me? This might just be the easiest solution. – Gilrob Jun 17 '22 at 00:36
  • I am sorry, I really don't know how to explain what I am after differently from how I have already put it! Thanks both for your time, I may just have to look for a different way to go forward. – Gilrob Jun 17 '22 at 00:39
  • As I said above "If you would describe what you are trying to accomplish instead of how you want to do it, I'm sure there is a simpler, more efficient way of getting there." – dcarlson Jun 17 '22 at 02:57
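
The suggestion in the comments to wrap the code in a function that takes the `df` as an argument could be sketched roughly as follows. This is only an illustration under assumptions: the function name `state_freqs`, the column name `S`, and the fixed set of states `a` to `e` are hypothetical, and the function returns a named vector instead of creating free-floating variables.

```r
# Hypothetical helper: count how often each state occurs in column 'S' of a
# data frame and return the counts as a named vector (Pa, Pb, ...).
# Fixing the levels keeps the result the same shape for every data frame,
# with 0 for states that do not occur.
state_freqs <- function(df, levels = c("a", "b", "c", "d", "e")) {
  counts <- table(factor(df$S, levels = levels))
  setNames(as.vector(counts), paste0("P", levels))
}

# Two made-up data frames with the same states but different frequencies
df1 <- data.frame(S = c("a", "a", "b", "c", "e"))
df2 <- data.frame(S = c("b", "b", "b", "d"))

P1 <- state_freqs(df1)
P2 <- state_freqs(df2)

P1[["Pa"]]  # 2
P2[["Pa"]]  # 0

# with() lets downstream code refer to Pa, Pb, ... by name without
# assigning them in the global environment
with(as.list(P1), Pa + Pb)  # 3
```

Changing the data frame passed to `state_freqs()` is then the only thing that needs to change between runs; any code written against `Pa`, `Pb`, etc. inside `with()` stays the same.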