
I am trying to use Julia to fit a set of different linear mixed models to the same dataset using pmap (macOS 12.6.3, Julia 1.9). I can successfully fit a single linear mixed model, and pmap works with a built-in function, as in pmap(sqrt, 1:100). I wrote a function called fit_mm that takes three inputs: a string naming a column of my dataset, a base formula to update, and the dataset, called df. Running it with map works. However, when I try to use pmap, I get the following error.

julia> # Run map
       pmap(x -> fit_mm(x, baseFormula, df), covs[1:10])
ERROR: On worker 3:
KeyError: key StatsModels [3eaba693-59b7-5ba5-a881-562e759f1c8d] not found

Here is the summarized code I am using:


using Distributed 
@everywhere using MixedModels
@everywhere using DataFrames
@everywhere using DataFramesMeta
@everywhere using CategoricalArrays
@everywhere using CSV
@everywhere using Term
@everywhere using StatsModels
@everywhere using DelimitedFiles

addprocs(2)

# Function to update formula 
@everywhere begin
    function fit_mm(testVar, f, data)
       ....
    end        
end

# Load data of interest...
# Run map
pmap(x -> fit_mm(x, baseFormula, df), covs) # covs is the list of variables I want to use to modify baseFormula and df is the data
  • Please post a minimum reproducible example (https://stackoverflow.com/help/minimal-reproducible-example). In your case the problem seems to be in including the `StatsModels` package. So, do not truncate the `using` statement either and show all packages so that others may reproduce your error. – loonatick Jun 25 '23 at 08:30
  • Thank you @loonatick. I edited the question as suggested – Claudio Esteban Pérez Leighton Jun 25 '23 at 17:16
  • That's better, but still not reproducible since you left out the definition of `fit_mm` – loonatick Jun 26 '23 at 12:56
  • Instead of `@everywhere begin function` you can just write `@everywhere function`. As mentioned by others it is not possible to provide the answer without a minimal piece of code that replicates the error. Maybe make some small simple dataframe that shows this behaviour? – Przemyslaw Szufel Jun 26 '23 at 16:15

1 Answer


I apologize for not providing a reproducible example this time; I will make sure to do so the next time I ask a question. I solved the issue and am posting the fix in the hope that it helps someone.

As I closed and restarted Julia, I noticed that the error message kept changing. Both the worker that reported the problem (I had four workers, and sometimes the error came from worker 2, sometimes from worker 3, and so on) and the package that was not found varied from run to run.

I finally figured out that, because the addprocs(2) command is what creates the workers, the workers have to exist before any @everywhere statement that loads packages or defines functions. Otherwise those packages and functions are not loaded on the workers. So I moved the addprocs statement to right after loading the Distributed package. The final code was:

using Distributed

addprocs(2)
 
@everywhere using MixedModels
@everywhere using DataFrames
@everywhere using DataFramesMeta
@everywhere using CategoricalArrays
@everywhere using CSV
@everywhere using Term
@everywhere using StatsModels
@everywhere using DelimitedFiles

... and then the rest of the code.

This led to the code running without a problem.
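
For anyone who wants to check the ordering without my data, here is a minimal sketch that uses only standard-library packages. The function col_mean is a hypothetical stand-in for fit_mm, and Statistics stands in for the packages listed above; neither is part of my actual code. The point it illustrates is that @everywhere only reaches the processes that exist at the moment it runs, so addprocs has to come first.

```julia
using Distributed

# Create the workers first. Any @everywhere that ran before this line
# would only have affected the master process.
addprocs(2)

# Load packages on the master and on every worker that now exists.
# (Statistics is a stand-in for MixedModels, StatsModels, etc.)
@everywhere using Statistics

# Hypothetical stand-in for fit_mm: a function every worker must know about.
@everywhere function col_mean(x)
    return mean(x)
end

# Each vector is sent to a worker, which calls col_mean on it.
results = pmap(col_mean, [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
println(results)
```

If the addprocs(2) line is moved below the @everywhere lines, the workers start without Statistics or col_mean loaded, which is the same situation that produced the KeyError in my original code.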