This is probably the most obfuscated manner of storing content that I've ever encountered. After performing text searches against the database I've finally found where the content is stored but it's not a simple process to get it out.
Pages' master record appears to be sf_page_node, there are related tables:
- sf_object_data (page_id is related to sf_page_node.content_id)
- sf_draft_pages (page_id is related to sf_page_node.content_id)
- sf_page_data (content_id is related to sf_page_node.content_id)
- sf_control_properties (control_id is related to sf_object_data.id)
So you could get the info you need with a query like this:
select * from
[sf_page_node]
join sf_object_data on sf_page_node.content_id = sf_object_data.page_id
join sf_control_properties on sf_object_data.id = sf_control_properties.control_id
Other things to consider:
- the parent_id field is related to the sf_page_node table, so if you're writing a script, be sure to query this as well
- the page may have a banner image, you will pick up the "place_holder" value as 'BannerHolder' with a caption of "Image" The image may be stored as blobs in sf_media_content, you should handle this separately. The "nme" value of 'ImageId' will have a GUID in the "val" column. You can query sf_media_content with this value as "content_id" the actual binary data is stored in sf_chunks, they relate on "file_id"
My revised query taking into account what I'll need to migrate content is below:
select
original.content_id,
original.url_name_,
original.title_,
parent.id,
parent.url_name_,
parent.title_,
place_holder,
sf_object_data.caption_,
sf_control_properties.nme,
val
from [sf_page_node] original
join sf_object_data on original.content_id = sf_object_data.page_id
join sf_control_properties on sf_object_data.id = sf_control_properties.control_id
join sf_page_node parent on original.parent_id = parent.id
I hope this helps someone!