Where to Start
To get a grounding in SLAM and computer vision, I recommend watching Cyrill Stachniss' SLAM course and reading the ORB-SLAM, ORB-SLAM2, ORB-SLAM3, and DSO papers. For computer vision in general, I recommend R. Szeliski's book, Computer Vision: Algorithms and Applications.
Which Language to Use
I wrote my thesis on SLAM and AR systems, and my conclusion is this: the state-of-the-art SLAM systems that achieve the best accuracy still rely on classical techniques rather than deep learning, for example SURF and ORB descriptors and Bag of Words (BoW). All of these systems (ORB-SLAM3, DM-VIO, DSO) are written in C++.
I always use C++ for SLAM itself, and only occasionally use Python for scripts, for example to fix up the recovered trajectory.
SLAM + AR
There aren't many resources on this subject, although the idea is simple. The SLAM system has to give you the camera pose, usually as a 4x4 transformation matrix in which the top-left 3x3 block is the rotation matrix R and the first three entries of the last column are the 3x1 translation t, i.e. T = [R t; 0 0 0 1].
Given the camera pose, you can use projective geometry to project the AR objects onto the camera image. ORB-SLAM2 has a nice AR demo to study; basically, it displays the 2D camera image and draws the rendered 3D objects on top of it.
The demo uses Pangolin, so you need to know how to use OpenGL and Pangolin. I recommend learning Pangolin from its examples, as it is mostly documented through them.