I would like to create a synchronized file-writing mechanism for a Spring application. I have about 10,000,000 JSON objects, each of which should be saved in a separate file, e.g.:
- text "abc" from json: { "id": "1", "text": "abc" } should be saved into the "1.json" file
- text "pofd" from json: { "id": "2", "text": "pofd" } should be saved into the "2.json" file
Other requirements:
- it must be possible to write into multiple files at the same time
- one file can be updated by multiple threads at the same time (many JSON objects with the same id but different text)
I've created FileWriterProxy (a singleton bean), which is the main component for saving files. It lazily loads a FileWriter component (a prototype bean) that is responsible for writing into a file (it has a synchronized write method). Each FileWriter object represents a separate file. I suspect that my solution is not thread-safe. Consider the following scenario:
- three threads (Thread1, Thread2, Thread3) want to write into the same file (1.json); all of them call the write method of the FileWriterProxy component
- Thread1 gets the correct FileWriter
- Thread1 locks the FileWriter for the 1.json file
- Thread1 writes into the 1.json file
- Thread1 finishes writing into the file and is about to remove the FileWriter from the ConcurrentHashMap
- in the meantime, Thread2 gets the FileWriter for the 1.json file and waits for Thread1 to release the lock
- Thread1 releases the lock and removes the FileWriter from the ConcurrentHashMap
- now Thread2 can write into the 1.json file (it holds a FileWriter that has already been removed from the ConcurrentHashMap)
- Thread3 gets a FileWriter for 1.json (a new one, because the old FileWriter has been removed by Thread1)
- Thread2 and Thread3 write into the same file at the same time, because they hold locks on different FileWriter objects
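The interleaving above can be replayed deterministically in plain Java. This is a minimal sketch, not the Spring code itself: `StaleLockDemo` and `Writer` are hypothetical stand-ins for the map in FileWriterProxy and for the FileWriter bean, and the three threads are replayed as sequential steps so that the outcome does not depend on timing.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StaleLockDemo {
    // Hypothetical stand-in for the per-file FileWriter; only its identity as a lock matters here.
    static final class Writer {}

    // Same lookup pattern as FileWriterProxy.getFileWriter.
    static Writer getWriter(Map<String, Writer> writers, String path) {
        return writers.computeIfAbsent(path, p -> new Writer());
    }

    /** Replays the scenario step by step and reports whether Thread2 and Thread3 end up with the same lock object. */
    static boolean sameLockAfterRemoval() {
        Map<String, Writer> writers = new ConcurrentHashMap<>();
        String path = "1.json";

        Writer thread1 = getWriter(writers, path); // Thread1 gets the writer and locks it
        Writer thread2 = getWriter(writers, path); // Thread2 gets the same instance and waits
        if (thread1 != thread2) {
            throw new IllegalStateException("Thread1 and Thread2 should share one writer");
        }
        writers.remove(path);                      // Thread1 finishes and removes the mapping
        Writer thread3 = getWriter(writers, path); // Thread3 triggers creation of a new instance

        return thread2 == thread3;
    }

    public static void main(String[] args) {
        System.out.println("Thread2 and Thread3 share a lock: " + sameLockAfterRemoval());
    }
}
```

Here `sameLockAfterRemoval()` returns false: Thread2 and Thread3 hold different objects, so a `synchronized` method on those objects does not exclude them from each other.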
Please correct me if I'm wrong. How can I fix my implementation?
FileWriterProxy:
@Component
public class FileWriterProxy {

    private final BeanFactory beanFactory;
    private final Map<String, FileWriter> filePathsMappedToFileWriters = new ConcurrentHashMap<>();

    public FileWriterProxy(BeanFactory beanFactory) {
        this.beanFactory = beanFactory;
    }

    public void write(Path path, String data) {
        FileWriter fileWriter = getFileWriter(path);
        fileWriter.write(data);
        removeFileWrite(path);
    }

    private FileWriter getFileWriter(Path path) {
        return filePathsMappedToFileWriters.computeIfAbsent(path.toString(), e -> beanFactory.getBean(FileWriter.class, path));
    }

    private void removeFileWrite(Path path) {
        filePathsMappedToFileWriters.remove(path.toString());
    }
}
FileWriterProxyTest:
@RunWith(SpringRunner.class)
@SpringBootTest
public class FileWriterProxyTest {

    @Rule
    public TemporaryFolder temporaryFolder = new TemporaryFolder();

    private static final String FILE_NAME = "filename.txt";

    private File baseDirectory;
    private Path path;

    @Autowired
    private FileWriterProxy fileWriterProxy;

    @Before
    public void setUp() {
        baseDirectory = temporaryFolder.getRoot();
        path = Paths.get(baseDirectory.getAbsolutePath(), FILE_NAME);
    }

    @Test
    public void writeToFile() throws IOException {
        String data = "test";
        fileWriterProxy.write(path, data);
        String fileContent = new String(Files.readAllBytes(path));
        assertEquals(data, fileContent);
    }

    @Test
    public void concurrentWritesToFile() throws InterruptedException {
        Path path = Paths.get(baseDirectory.getAbsolutePath(), FILE_NAME);
        List<Task> tasks = Arrays.asList(
                new Task(path, "test1"),
                new Task(path, "test2"),
                new Task(path, "test3"),
                new Task(path, "test4"),
                new Task(path, "test5"));
        ExecutorService executorService = Executors.newFixedThreadPool(5);
        List<Future<Boolean>> futures = executorService.invokeAll(tasks);
        wait(futures);
    }

    @Test
    public void manyRandomWritesToFiles() throws InterruptedException {
        List<Task> tasks = createTasks(1000);
        ExecutorService executorService = Executors.newFixedThreadPool(5);
        List<Future<Boolean>> futures = executorService.invokeAll(tasks);
        wait(futures);
    }

    private void wait(List<Future<Boolean>> tasksFutures) {
        tasksFutures.forEach(e -> {
            try {
                e.get(10, TimeUnit.SECONDS);
            } catch (Exception e1) {
                e1.printStackTrace();
            }
        });
    }

    private List<Task> createTasks(int number) {
        List<Task> tasks = new ArrayList<>();
        IntStream.range(0, number).forEach(e -> {
            String fileName = generateFileName();
            Path path = Paths.get(baseDirectory.getAbsolutePath(), fileName);
            tasks.add(new Task(path, "test"));
        });
        return tasks;
    }

    private String generateFileName() {
        int length = 10;
        boolean useLetters = true;
        boolean useNumbers = false;
        return RandomStringUtils.random(length, useLetters, useNumbers) + ".txt";
    }

    private class Task implements Callable<Boolean> {

        private final Path path;
        private final String data;

        Task(Path path, String data) {
            this.path = path;
            this.data = data;
        }

        @Override
        public Boolean call() {
            fileWriterProxy.write(path, data);
            return true;
        }
    }
}
Config:
@Configuration
public class Config {

    @Bean
    @Lazy
    @Scope("prototype")
    public FileWriter fileWriter(Path path) {
        return new FileWriter(path);
    }
}
FileWriter:
public class FileWriter {

    private static final Logger logger = LoggerFactory.getLogger(FileWriter.class);

    private final Path path;

    public FileWriter(Path path) {
        this.path = path;
    }

    // Only one thread at a time can write through this particular FileWriter instance.
    public synchronized void write(String data) {
        String filePath = path.toString();
        try {
            Files.write(path, data.getBytes());
            logger.info("File has been saved: {}", filePath);
        } catch (IOException e) {
            // Pass the exception as the last argument so the stack trace is logged too.
            logger.error("Error occurred while writing to file: {}", filePath, e);
        }
    }
}
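One possible direction, as a sketch under assumptions rather than a definitive fix: keep the per-path entry in the map for as long as any thread is still using it, by counting users inside `ConcurrentHashMap.compute` (which runs atomically per key) and removing the entry only when the count drops to zero. `KeyedLocks`, `withLock` and `hammer` are hypothetical names that do not appear in the code above; the sketch uses no Spring and does no file I/O.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class KeyedLocks {
    // One lock per key plus the number of threads currently holding a reference to it.
    private static final class Entry {
        final ReentrantLock lock = new ReentrantLock();
        int users; // only read or written inside compute(), which is atomic per key
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();

    /** Runs the action while holding the lock for the given key. */
    public void withLock(String key, Runnable action) {
        // Register as a user before locking, so nobody can remove the entry underneath us.
        Entry entry = entries.compute(key, (k, e) -> {
            if (e == null) {
                e = new Entry();
            }
            e.users++;
            return e;
        });
        entry.lock.lock();
        try {
            action.run();
        } finally {
            entry.lock.unlock();
            // Deregister; returning null removes the mapping once the last user is gone.
            entries.compute(key, (k, e) -> --e.users == 0 ? null : e);
        }
    }

    int size() {
        return entries.size();
    }

    /** Starts several threads that each bump an unsynchronized counter under the same key. */
    static int hammer(KeyedLocks locks, int threads, int perThread) {
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    locks.withLock("1.json", () -> counter[0]++);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        KeyedLocks locks = new KeyedLocks();
        System.out.println("counter = " + hammer(locks, 8, 1000));
        System.out.println("entries left = " + locks.size());
    }
}
```

With this shape, a thread that is still waiting for the lock keeps the entry alive, so a later thread asking for the same key gets the same lock instead of a fresh one. In FileWriterProxy the write could then be wrapped as something like `locks.withLock(path.toString(), () -> fileWriter.write(data))`; how this behaves with 10,000,000 ids has not been measured here.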