You can obtain a reasonable 3D point cloud from multiple images.
Take a look at this course:
http://3dvision.princeton.edu/courses/SFMedu/
It includes some MATLAB code that generates SfM (structure-from-motion) point clouds from multiple images, along with slides that should answer your second question.
That MATLAB code generates a sparse point cloud. Pix4D and Agisoft densify this sparse point cloud as an extra step.
Densification is computationally very expensive, and I would say it is impractical to try to implement it in MATLAB.
However, there are open-source alternatives that can perform this densification, for example COLMAP:
https://github.com/colmap/colmap
I'm not sure, but I think you could feed the output of your MATLAB SfM to COLMAP and then perform the densification there.
The COLMAP documentation should also help you understand how both SfM and multi-view stereo (MVS) work.
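
As a rough illustration of what "feeding the SfM output to COLMAP" could look like, here is a minimal Python sketch that writes camera poses in COLMAP's text model format (`cameras.txt`, `images.txt`, `points3D.txt`). It rests on several assumptions: your SfM gives a world-to-camera rotation `R` and translation `t` per image (if yours are camera-to-world, use `R^T` and `-R^T t` instead), the cameras are undistorted pinhole cameras, and you have already exported the poses from MATLAB into plain lists. The function names and dictionary keys are my own, not part of COLMAP or of the course code.

```python
import math
import os


def rotmat_to_quat(R):
    """Convert a 3x3 rotation matrix (nested lists) to a quaternion (qw, qx, qy, qz)."""
    (m00, m01, m02), (m10, m11, m12), (m20, m21, m22) = R
    trace = m00 + m11 + m22
    if trace > 0:
        s = 2.0 * math.sqrt(trace + 1.0)
        return (0.25 * s, (m21 - m12) / s, (m02 - m20) / s, (m10 - m01) / s)
    if m00 > m11 and m00 > m22:
        s = 2.0 * math.sqrt(1.0 + m00 - m11 - m22)
        return ((m21 - m12) / s, 0.25 * s, (m01 + m10) / s, (m02 + m20) / s)
    if m11 > m22:
        s = 2.0 * math.sqrt(1.0 + m11 - m00 - m22)
        return ((m02 - m20) / s, (m01 + m10) / s, 0.25 * s, (m12 + m21) / s)
    s = 2.0 * math.sqrt(1.0 + m22 - m00 - m11)
    return ((m10 - m01) / s, (m02 + m20) / s, (m12 + m21) / s, 0.25 * s)


def write_colmap_text_model(out_dir, cameras, images):
    """Write a COLMAP text model with known poses and no 3D points.

    cameras: list of dicts with keys id, width, height, fx, fy, cx, cy (hypothetical layout)
    images:  list of dicts with keys id, camera_id, name, R (3x3), t (3,) (hypothetical layout)
    """
    os.makedirs(out_dir, exist_ok=True)

    # cameras.txt: CAMERA_ID MODEL WIDTH HEIGHT PARAMS[] (PINHOLE params: fx fy cx cy)
    with open(os.path.join(out_dir, "cameras.txt"), "w") as f:
        for c in cameras:
            f.write("{id} PINHOLE {width} {height} {fx} {fy} {cx} {cy}\n".format(**c))

    # images.txt: two lines per image.
    # Line 1: IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME
    # Line 2: 2D observations, left empty here because we only export poses.
    with open(os.path.join(out_dir, "images.txt"), "w") as f:
        for im in images:
            q = rotmat_to_quat(im["R"])
            fields = [im["id"], *q, *im["t"], im["camera_id"], im["name"]]
            f.write(" ".join(str(v) for v in fields) + "\n")
            f.write("\n")

    # points3D.txt: left empty; COLMAP can re-triangulate points from the poses.
    open(os.path.join(out_dir, "points3D.txt"), "w").close()
```

If I read the COLMAP FAQ correctly (the section on reconstructing from known camera poses), a model written this way can then go through `colmap point_triangulator`, followed by `colmap image_undistorter`, `colmap patch_match_stereo` and `colmap stereo_fusion` for the dense cloud. Check the exact options and the requirement that image IDs match the COLMAP database against the documentation before relying on this.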