I have inherited a Ruby app that connects to a MongoDB database. Unfortunately I know nothing about Mongo or Ruby, so I'm on a rapid googling and learning curve.
The app stores place names along with their lat/longs, alternative names, people's memories, and comments. It also counts how many times a place has been discussed.
The following rake task, when run, grabs all the locations from MongoDB and creates a CSV, writing one line per location with the user, the number of times mentioned, the memories, and so on.
task :data_dump => :environment do
  File.open("results.csv","w") do |file|
    Location.all.each_with_index do |l,index|
      puts "done #{index}"
      file.puts [l.id, l.classification_count, l.position, l.created_at, l.classifications.collect{|c| c.text}, l.classifications.collect{|c| c.alternative_names }.flatten.join(";"), l.classifications.collect{|c| c.comment }.flatten.join(";"), l.memories.collect{|m| m.text}.flatten.join(";") ].join(",")
    end
  end
end
It works well and generates a CSV I can then pull into other programs. The problem is that the content includes plain-text fields containing line breaks and similar characters, which make the CSV invalid. I want to make sure all plain-text fields are properly enclosed within the CSV.
So if I can understand the above code better, I can add the correct field enclosures to ensure the CSV is valid when loaded into GIS software.
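From my reading so far, Ruby's standard csv library looks like it can do the enclosing for me. Here is a minimal sketch of what I think is needed. The values in row are made up and only stand in for the fields of one real location, so treat this as an assumption about the approach rather than something I have run against the real data:

```ruby
require "csv"

# Hypothetical stand-in values for one location; in the rake task these
# would come from the Location, classification and memory records.
row = [
  "abc123",                                  # stands in for l.id
  2,                                         # stands in for l.classification_count
  "Old Mill;The Mill",                       # alternative names joined with ";"
  "Played here as a child.\nLovely, quiet.", # memory text with a line break and a comma
  'Known as "the mill"'                      # comment containing double quotes
]

# force_quotes: true wraps every field in double quotes; any double quote
# inside a field is written as two double quotes.
line = CSV.generate_line(row, force_quotes: true)

# Parsing the line back gives the same fields (all returned as strings),
# so the line break and comma no longer split the record.
parsed = CSV.parse_line(line)
```

If this is right, the task could presumably open the file with CSV.open("results.csv", "w", force_quotes: true) and append each location with csv << [...] instead of building the line with join(","). One thing I noticed while testing: Array#join also joins nested arrays, so the unjoined l.classifications.collect{|c| c.text} array in the current code is spread across several comma-separated columns rather than kept as one field.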
Also, the above takes about an hour and 45 minutes to run on my laptop, so I want to find out whether it is the most efficient way to do the query. To date we have around 300,000 place names listed, and this is going to rise to a few million, so it will only get slower.