I would like to create a synchronized file-writing mechanism for a Spring application. I have about 10,000,000 JSON documents, each of which should be saved to a separate file, e.g.:

  • text "abc" from json: { "id": "1", "text": "abc" } should be saved into the "1.json" file
  • text "pofd" from json: { "id": "2", "text": "pofd" } should be saved into the "2.json" file

Other requirements:

  • it must be possible to write to multiple files at the same time
  • one file can be updated by multiple threads at the same time (many JSON documents with the same id but different text)

I've created FileWriterProxy (a singleton bean), which is the main component for saving files. It lazily loads a FileWriter component (a prototype bean) that is responsible for writing to a file (it has a synchronized write method). Each FileWriter object represents a separate file. I suspect that my solution is not thread-safe. Consider the following scenario:

  1. Three threads (Thread1, Thread2, Thread3) want to write to the same file (1.json), and all of them call the write method of FileWriterProxy.
  2. Thread1 gets the FileWriter for 1.json.
  3. Thread1 locks that FileWriter.
  4. Thread1 writes to 1.json.
  5. Thread1 finishes writing and is about to remove the FileWriter from the ConcurrentHashMap.
  6. Meanwhile, Thread2 gets the same FileWriter for 1.json and waits for Thread1 to release the lock.
  7. Thread1 releases the lock and removes the FileWriter from the ConcurrentHashMap.
  8. Thread2 can now write to 1.json (it holds a FileWriter that is no longer in the ConcurrentHashMap).
  9. Thread3 gets a FileWriter for 1.json (a new one, because the old FileWriter was removed by Thread1).
  10. Thread2 and Thread3 write to the same file at the same time, because they hold locks on different FileWriter objects.

Please correct me if I'm wrong. How can I fix my implementation?

FileWriterProxy:

@Component
public class FileWriterProxy {
    private final BeanFactory beanFactory;
    private final Map<String, FileWriter> filePathsMappedToFileWriters = new ConcurrentHashMap<>();

    public FileWriterProxy(BeanFactory beanFactory) {
        this.beanFactory = beanFactory;
    }

    public void write(Path path, String data) {
        FileWriter fileWriter = getFileWriter(path);
        fileWriter.write(data);
        removeFileWrite(path);
    }

    private FileWriter getFileWriter(Path path) {
        return filePathsMappedToFileWriters.computeIfAbsent(path.toString(), e -> beanFactory.getBean(FileWriter.class, path));
    }

    private void removeFileWrite(Path path) {
        filePathsMappedToFileWriters.remove(path.toString());
    }

}

FileWriterProxyTest:

@RunWith(SpringRunner.class)
@SpringBootTest
public class FileWriterProxyTest {

    @Rule
    public TemporaryFolder temporaryFolder = new TemporaryFolder();
    private static final String FILE_NAME = "filename.txt";
    private File baseDirectory;
    private Path path;

    @Autowired
    private FileWriterProxy fileWriterProxy;

    @Before
    public void setUp() {
        baseDirectory = temporaryFolder.getRoot();
        path = Paths.get(baseDirectory.getAbsolutePath(), FILE_NAME);
    }

    @Test
    public void writeToFile() throws IOException {
        String data = "test";
        fileWriterProxy.write(path, data);
        String fileContent = new String(Files.readAllBytes(path));
        assertEquals(data, fileContent);
    }

    @Test
    public void concurrentWritesToFile() throws InterruptedException {
        Path path = Paths.get(baseDirectory.getAbsolutePath(), FILE_NAME);
        List<Task> tasks = Arrays.asList(
                new Task(path, "test1"),
                new Task(path, "test2"),
                new Task(path, "test3"),
                new Task(path, "test4"),
                new Task(path, "test5"));
        ExecutorService executorService = Executors.newFixedThreadPool(5);
        List<Future<Boolean>> futures = executorService.invokeAll(tasks);

        wait(futures);
    }

    @Test
    public void manyRandomWritesToFiles() throws InterruptedException {
        List<Task> tasks = createTasks(1000);
        ExecutorService executorService = Executors.newFixedThreadPool(5);
        List<Future<Boolean>> futures = executorService.invokeAll(tasks);
        wait(futures);
    }

    private void wait(List<Future<Boolean>> tasksFutures) {
        tasksFutures.forEach(e -> {
            try {
                e.get(10, TimeUnit.SECONDS);
            } catch (Exception e1) {
                e1.printStackTrace();
            }
        });
    }

    private List<Task> createTasks(int number) {
        List<Task> tasks = new ArrayList<>();

        IntStream.range(0, number).forEach(e -> {
            String fileName = generateFileName();
            Path path = Paths.get(baseDirectory.getAbsolutePath(), fileName);
            tasks.add(new Task(path, "test"));
        });

        return tasks;
    }

    private String generateFileName() {
        int length = 10;
        boolean useLetters = true;
        boolean useNumbers = false;
        return RandomStringUtils.random(length, useLetters, useNumbers) + ".txt";
    }

    private class Task implements Callable<Boolean> {
        private final Path path;
        private final String data;

        Task(Path path, String data) {
            this.path = path;
            this.data = data;
        }

        @Override
        public Boolean call() {
            fileWriterProxy.write(path, data);
            return true;
        }
    }
}

Config:

@Configuration
public class Config {

    @Bean
    @Lazy
    @Scope("prototype")
    public FileWriter fileWriter(Path path) {
        return new FileWriter(path);
    }

}

FileWriter:

public class FileWriter {
    private static final Logger logger = LoggerFactory.getLogger(FileWriter.class);

    private final Path path;

    public FileWriter(Path path) {
        this.path = path;
    }

    public synchronized void write(String data) {
        String filePath = path.toString();
        try {
            Files.write(path, data.getBytes());
            logger.info("File has been saved: {}", filePath);
        } catch (IOException e) {
            // Pass the exception as the last argument so the stack trace is logged too
            logger.error("Error occurred while writing to file: {}", filePath, e);
        }
    }

}
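
The race in steps 5 to 10 of the scenario comes from removing a lock object while another thread may still hold a reference to it. One possible way around it, shown here only as a minimal sketch in plain Java without Spring wiring, is to never remove anything: a fixed array of lock objects ("stripes") is created once, and each path is mapped to one of them by its hash. The class name StripedFileWriter and the stripe count are hypothetical and not part of the original code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class StripedFileWriter {
    // Fixed pool of lock objects; it never grows, so nothing has to be removed.
    private final Object[] stripes;

    public StripedFileWriter(int stripeCount) {
        stripes = new Object[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new Object();
        }
    }

    // The same path always maps to the same stripe, hence to the same lock.
    private Object lockFor(Path path) {
        int index = Math.floorMod(path.toString().hashCode(), stripes.length);
        return stripes[index];
    }

    public void write(Path path, String data) {
        synchronized (lockFor(path)) {
            try {
                Files.write(path, data.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

Memory use stays constant regardless of how many files are written. The trade-off is that two unrelated files whose paths hash to the same stripe will wait for each other, so the stripe count should probably be well above the number of writer threads.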
pfalek
  • Can the same string get mapped to different FileWriters at any time? – samvel1024 Nov 26 '18 at 20:30
  • Yes, it is possible for the same path string to be mapped to different FileWriters. The easiest solution would be to never delete a FileWriter from the map, but keeping so many objects can consume too much memory and decrease performance. – pfalek Nov 26 '18 at 20:45
  • I don't get `one file can be updated by multiple threads`. What do you mean? Concurrent writing to a file seems a little pointless to me (unless I'm missing something important). Please refer to this post to get my point better: https://stackoverflow.com/questions/8602466/can-multiple-threads-write-data-into-a-file-at-the-same-time – samvel1024 Nov 26 '18 at 21:01
  • Maintain a count of how many write requests are scheduled for a given file along with the writer; decrease it when writing is finished. Only remove the writer from the map if the count hits zero. This way, it will always be the same writer for the same file. – daniu Nov 26 '18 at 21:07
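
The counting idea from the last comment could look roughly like the sketch below. It is an illustration under assumptions, not a tested drop-in replacement: the class RefCountedFileWriter, its nested Entry type, and the trackedFiles method are hypothetical names, and Spring wiring is left out. It relies on ConcurrentHashMap.compute running atomically per key, so incrementing the count and deciding whether to remove the entry cannot interleave with another thread's lookup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ConcurrentHashMap;

public class RefCountedFileWriter {
    // Immutable pair: the lock for one file and how many threads use or await it.
    private static final class Entry {
        final Object lock;
        final int users;

        Entry(Object lock, int users) {
            this.lock = lock;
            this.users = users;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();

    public void write(Path path, String data) {
        String key = path.toString();
        // Register interest atomically: create the entry or bump its count.
        Entry entry = entries.compute(key, (k, e) ->
                e == null ? new Entry(new Object(), 1) : new Entry(e.lock, e.users + 1));
        try {
            synchronized (entry.lock) {
                Files.write(path, data.getBytes(StandardCharsets.UTF_8));
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } finally {
            // Remove the entry only when this was the last interested thread.
            entries.compute(key, (k, e) ->
                    e.users == 1 ? null : new Entry(e.lock, e.users - 1));
        }
    }

    // Number of files currently tracked; drops back to 0 when all writes finish.
    public int trackedFiles() {
        return entries.size();
    }
}
```

Because a waiting thread has already incremented the count before it blocks on the lock, the entry cannot be removed underneath it, and a later thread asking for the same path gets the same lock object. The map only holds entries for files that are being written at that moment, which addresses the memory concern raised in the comments.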
