I want to run two t-tests on crab and lobster landed weight in North and South Wales, one separate test per species for now. I log-transformed both weight columns because both had many very low values. I run the following code for the two species:
t.test(data = crabs, logweight~Region)
t.test(data = lobsters, logweight~Region)
For crabs the t-test works fine and I get an output in the console, however for the lobster data I get the following error message:
Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") :
missing value where TRUE/FALSE needed
This seems to be an error message that appears when you try to use non-numeric data. The weight data are definitely numeric, and I have even tried converting Region into numeric values of 1 and 2 instead of North and South, but I still get the same error. The t-test runs fine on the untransformed data, so the issue appears to lie with the log-transformed lobster weights. What is the problem here, and how can I fix it?
Some example data (the raw data, before the logweight column is added):
structure(list(Weight = c(130, 10, 25, 45, 21, 75, 100, 9.6, 12.9, 17.1, 11, 11, 28, 8, 50, 30, 9.5, 28.5, 91, 16), Region = c("NORTH", "NORTH", "NORTH", "NORTH", "NORTH", "NORTH", "NORTH", "NORTH", "NORTH", "NORTH", "SOUTH", "SOUTH", "SOUTH", "SOUTH", "SOUTH", "SOUTH", "SOUTH", "SOUTH", "SOUTH", "SOUTH")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L))
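For anyone wanting to reproduce this, here is a minimal sketch of how the logweight column could be rebuilt from the example data and checked for non-finite values. It assumes the transform was the natural log via log(); the names example_data, n_bad and finite_rows are made up for illustration. The check is only a guess at where to look: log(0) is -Inf in R, so any zero weights would give non-finite logweight values, although the sample shown here contains no zeros.

```r
# Example data from the question, rebuilt as a plain data.frame
# (the name example_data is hypothetical).
example_data <- data.frame(
  Weight = c(130, 10, 25, 45, 21, 75, 100, 9.6, 12.9, 17.1,
             11, 11, 28, 8, 50, 30, 9.5, 28.5, 91, 16),
  Region = rep(c("NORTH", "SOUTH"), each = 10)
)

# Assumption: logweight was created with the natural log.
example_data$logweight <- log(example_data$Weight)

# Count values that are -Inf, Inf, NaN or NA after the transform.
# log(0) gives -Inf, and the log of a negative number gives NaN.
n_bad <- sum(!is.finite(example_data$logweight))

# Same count, broken down by region.
bad_by_region <- tapply(example_data$logweight, example_data$Region,
                        function(x) sum(!is.finite(x)))

# Run the t-test on rows with finite logweight only.
finite_rows <- example_data[is.finite(example_data$logweight), ]
res <- t.test(logweight ~ Region, data = finite_rows)
print(n_bad)
print(bad_by_region)
print(res)
```

Running the same check on the full lobster data would show whether any rows have a non-finite logweight. Dropping those rows, as done above, is only one option and changes which observations are being compared.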