I have a data frame in R where each value in one column, named Title, is a BibTeX entry that looks like this:
={Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems},\n
author={Goldreich, Oded and Micali, Silvio and Wigderson, Avi},\n
journal={Journal of the ACM (JACM)},\n
volume={38},\n
number={3},\n
pages={690--728},\n
year={1991},\n
publisher={ACM New York, NY, USA}\n}
I need to extract only the title of the BibTeX citation, which is the string after ={ and before the next }
In this example, the output should be:
Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems
I need to do this for all rows in the data frame. Not all rows have the same number of BibTeX fields, so the regex has to ignore everything after the first }
I'm currently trying sub(".*\\={\\}\\s*(.+?)\\s*\\|.*$", "\\1", data$Title)
but it fails with: TRE pattern compilation error 'Invalid contents of {}'
How should I do this?
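
One possible approach, as a sketch rather than a definitive answer: in R's default (TRE) regex engine an unescaped { starts an interval quantifier such as {2,3}, so {\\} is rejected as invalid quantifier contents. Escaping the brace and capturing everything up to the next } avoids the error. The sample rows below are shortened variants of the entry in the question, built only for illustration, and the column name BibTitle is hypothetical. perl = TRUE is used so that (?s) can make . match the newlines inside each entry.

```r
# Illustrative data: two shortened variants of the entry from the question,
# with different numbers of fields. "BibTitle" is a hypothetical column name.
title <- paste0(
  "Proofs that yield nothing but their validity or all languages in NP ",
  "have zero-knowledge proof systems"
)
data <- data.frame(
  Title = c(
    paste0("={", title, "},\n  author={Goldreich, Oded and Micali, Silvio ",
           "and Wigderson, Avi},\n  year={1991}\n}"),
    paste0("={", title, "},\n  year={1991}\n}")
  ),
  stringsAsFactors = FALSE
)

# (?s)     let "." match newlines (PCRE syntax, hence perl = TRUE)
# ^.*?     lazily skip anything before the first "={"
# =\\{     literal "={"; the brace is escaped so it is not read as a quantifier
# ([^}]*)  capture everything up to the next "}"
# \\}.*$   consume the closing brace and the rest of the entry
data$BibTitle <- sub("(?s)^.*?=\\{([^}]*)\\}.*$", "\\1", data$Title, perl = TRUE)

print(data$BibTitle)
```

Because the capture stops at the first }, a title that itself contains braces (for example {NP} to protect capitalization) would be cut short; handling nested braces would need a different pattern or a BibTeX parser.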