I have a really large file with data that I'm trying to import into a database. Some of the data contains newline characters where there should not be any, which breaks the import.
I'm using a Ruby script to do a lot of manipulation and cleaning to get this file, and a bunch of others, ready for import.
I was able to solve this problem with sed using this:
sed -i '.original' -e ':a' -e 'N' -e '$!ba' -e 's/Oversight Bd\n/Oversight Bd/g' -e 's/Sciences\n/Sciences/g' combined_old_individual.txt
However, I can't get that command to run from inside a Ruby script, because Ruby mangles the newline escapes before sed sees them. sed needs to receive the \n sequence unescaped, but a system command called from Ruby has to be passed as a string, where the newline escape itself needs escaping.
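One way the escaping problem might be sidestepped is to pass each sed argument to Kernel#system as a separate single-quoted string. Single-quoted Ruby strings leave \n as a literal backslash followed by "n", and the multi-argument form of system does not go through a shell. The sketch below only builds the argument list; sed_join_args is a hypothetical helper name, and the -i '.original' form assumes BSD/macOS sed, as in the command above.

```ruby
# Hypothetical helper: builds the argument list for the sed call shown above.
# Single-quoted strings keep \n as two characters (backslash + n),
# which is the form sed expects in its pattern.
def sed_join_args(path)
  [
    'sed', '-i', '.original',          # BSD/macOS in-place edit with a backup suffix (assumed)
    '-e', ':a', '-e', 'N', '-e', '$!ba',
    '-e', 's/Oversight Bd\n/Oversight Bd/g',
    '-e', 's/Sciences\n/Sciences/g',
    path
  ]
end

# With one argument per array element, Ruby runs sed directly without a shell,
# so there is no second layer of quoting to get wrong:
# system(*sed_join_args('combined_old_individual.txt'))
```

The system call is left commented out because it rewrites the file in place.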
I also tried doing this with Ruby's File class, but that isn't working either:
File.open("combined_old_individual.txt", "r") do |f|
  File.open("combined_old_individual_new.txt", "w") do |new_file|
    to_combine = nil
    f.each_line do |line|
      if /Oversight Bd$/ =~ line || /Sciences$/ =~ line
        to_combine = line
      else
        if to_combine.nil?
          new_file.puts line
        else
          combined_line = to_combine + line
          new_file.puts combined_line
          to_combine = nil
        end
      end
    end
  end
end
Any ideas on how to join lines where the first line ends with "Oversight Bd" or "Sciences", from within a Ruby script, would be very helpful.
Here's an example of what might go in a testfile.txt:
random line
Oversight Bd
should be on the same line as the above, but isn't
last line
and the result should be:
random line
Oversight Bdshould be on the same line as the above, but isn't
last line
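
For reference, a minimal pure-Ruby sketch that produces the result above. One likely reason the File attempt fails is that each_line yields lines with their trailing "\n" still attached, so to_combine + line keeps the unwanted newline; the sketch strips it with chomp before writing. The names join_broken_lines and JOIN_SUFFIXES are hypothetical, and the suffix list is assumed from the sed command.

```ruby
# Hypothetical names; the suffixes are taken from the sed command in the question.
JOIN_SUFFIXES = /(?:Oversight Bd|Sciences)\z/

# Streams input_path line by line and writes output_path, gluing any line
# that ends in one of the suffixes onto the line that follows it.
def join_broken_lines(input_path, output_path)
  File.open(output_path, "w") do |out|
    ended_on_join = false
    File.foreach(input_path) do |line|
      stripped = line.chomp              # drop the trailing newline (also handles \r\n)
      if stripped.match?(JOIN_SUFFIXES)
        out.print stripped               # no newline, so the next line continues this one
        ended_on_join = true
      else
        out.print line                   # unchanged, newline included
        ended_on_join = false
      end
    end
    # If the very last line matched, restore the newline that chomp removed.
    out.print "\n" if ended_on_join
  end
end
```

Because it never holds more than one line in memory, this should cope with a very large file, and consecutive matching lines are all joined rather than dropped. Calling join_broken_lines("testfile.txt", "testfile_new.txt") on the example input should give the three-line result shown above.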