
In a local repository, I have several JSON files. When I run the command

from datasets import load_dataset
dataset = load_dataset('json', data_files=['./100009.json'])

I got the following error:

OSError: [Errno 36] File name too long: '/home/infinity/.cache/huggingface/datasets/_home_infinity_.cache_huggingface_datasets_json_default-80a93068b3a4a494_0.0.0_83d5b3a2f62630efc6b5315f00f20209b4ad91a00ac586597caee3a4da0bef02.lock'

Maybe it is obvious, but I am not sure how to solve it. Can you help?

EDIT

Here is the content of the JSON file:

{
    "id": "68af48116a252820a1e103727003d1087cb21a32",
    "article": [
        "by mark duell .",
        "published : .",
        "05:58 est , 10 september 2012 .",
        "| .",
        "updated : .",
        "07:38 est , 10 september 2012 .",
        "a pet owner starved her two dogs so badly that one was forced to eat part of his mother 's dead body in a desperate attempt to survive .",
        "the mother died a ` horrendous ' death and both were in a terrible state when found after two weeks of starvation earlier this year at the home of katrina plumridge , 31 , in grimsby , lincolnshire .",
        "the barely-alive dog was ` shockingly thin ' and the house had a ` nauseating and overpowering ' stench , grimsby magistrates court heard .",
        "warning : graphic content .",
        "horrendous : the male dog , scrappy -lrb- right -rrb- , was so badly emaciated that he ate the body of his mother ronnie -lrb- centre -rrb- to try to survive at the home of katrina plumridge in grimsby , lincolnshire .",
        "the suffering was so serious that the female staffordshire bull terrier , named ronnie , died of starvation , nigel burn , prosecuting , told the court last friday .",
        "suspended jail term : the dogs were in a terrible state when found after two weeks of starvation at the home of katrina plumridge , 31 -lrb- pictured -rrb- .",
        "the male dog , her son scrappy , was so badly emaciated that he ate her body to try to survive .",
        "` the degree of suffering caused to both dogs was extreme and prolonged , ' mr burn said . ` it was as severe and extreme as it can get . '",
        "the alarm was raised when a letting agent visited her home and saw dog mess on the steps , stairs , an upstairs floor and a bed .",
        "a painfully thin dog jumped past him . he said its ribs , spine and hip bones could all be seen and it was the thinnest dog he had ever witnessed .",
        "he tried to go into the kitchen but it was blocked from the inside by the dead body of the mother dog . the letting agent then called the royal society for the prevention of cruelty to animals .",
        "mr burn said : ` every single bone in its frame was visible and the stomach was curved in . the empty dog bowls were bone dry . '",
        "a decorator who went into the house said the stench made him feel physically sick , ronnie was like a skeleton and scrappy was ` shockingly thin ' .",
        "a veterinary surgeon estimated that the dogs would have been suffering from starvation for at least two weeks .",
        "plumridge moved out of the house on march 28 but the dogs were n't found until april 19 . she had claimed a friend was supposed to be finding new homes for the dogs and left them without going back to check on them ."
    ],
    "abstract": [
        "neglect by katrina plumridge saw staffordshire bull terrier ronnie die .",
        "dog 's son scrappy was forced to eat her to survive at grimsby house .",
        "alarm raised by letting agent shocked by ` thinnest dog he 'd ever seen '",
    ]
}
Michael
  • possible duplicate of [this question](https://stackoverflow.com/questions/4677234/python-ioerror-exception-when-creating-a-long-file). – Edo Akse May 26 '21 at 11:50
  • Which OS do you use? Windows has a limit of 260 characters for paths. – Andreas May 26 '21 at 11:59
  • @Andreas I am using Ubuntu. I just don't understand, as I provide just "./100009" as the path. – Michael May 26 '21 at 12:05
  • not quite sure : len("/home/infinity/.cache/huggingface/datasets/_home_infinity_.cache_huggingface_datasets_json_default-80a93068b3a4a494_0.0.0_83d5b3a2f62630efc6b5315f00f20209b4ad91a00ac586597caee3a4da0bef02.lock") 191 and unix paths are up to 255 – laenNoCode May 26 '21 at 12:10
  • what is the content of 10009.json ? does it contain the big path that is causing the crash ? – laenNoCode May 26 '21 at 12:12
  • @laenNoCode I will modify the code to show you its content. Give me two minutes. – Michael May 26 '21 at 12:26

2 Answers


When working on large datasets, it is appropriate to use a pandas DataFrame.

import pandas as pd

df = pd.read_json(r'Path where you saved the JSON file\File Name.json')
print(df)
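For the file shown in the question, a quick sketch of what `pd.read_json` would return (the record is truncated here to one element per list, purely to keep the example short):

```python
import json
import os
import tempfile

import pandas as pd

# A shortened version of the record from the question (lists truncated
# to a single element each, so they line up row-wise).
record = {
    "id": "68af48116a252820a1e103727003d1087cb21a32",
    "article": ["by mark duell ."],
    "abstract": ["neglect by katrina plumridge saw staffordshire bull terrier ronnie die ."],
}

# Write the record to a temporary file standing in for 100009.json.
path = os.path.join(tempfile.mkdtemp(), "100009.json")
with open(path, "w") as f:
    json.dump(record, f)

# Each top-level JSON key becomes a column; the scalar "id" is broadcast
# across the rows formed by the equal-length lists.
df = pd.read_json(path)
print(df.shape)
```

Note that this sidesteps the lock file entirely: pandas reads the file directly and never touches the huggingface cache directory where the error occurs.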

This looks to be a bug in the huggingface datasets library. It's attempting to read or write a filename that's too long for the underlying filesystem (likely ext4 in the case of Ubuntu). I opened an issue here.