Methods to acquire, analyze and understand images using mathematical approaches.
Questions tagged [vision]
664 questions
4
votes
5 answers
Should software development be separated from software design & usability?
In most of the commercial jobs I've had to date, my role has mostly been confined to "writing the code", whereas the reason I wanted to get into development in the first place was that I enjoyed the usability and design aspects of software.
I feel a…

Jonathan
- 32,202
- 38
- 137
- 208
3
votes
2 answers
Pytorchvideo Models Resnet Input shape
I am using the following code to load resnet50, but since this is a video model, I am not sure what the expected input is. Is it ([batch_size, channels, frames, img1, img2])?
Any help would be fantastic.
import pytorchvideo.models.resnet
def resnet():
…

Rayanxv
- 31
- 1
3
votes
0 answers
iOS vision framework - Unable to setup request in VNDetectHumanBodyPoseRequest
I use VNDetectHumanBodyPoseRequest to detect a body in an image that is in my Xcode assets (I downloaded it from an image website), but I get the error below:
2021-12-24 21:50:19.945976+0800 Guess My Exercise[91308:4258893] [espresso] [Espresso::handle_ex_plan]…

zhouxinle
- 429
- 5
- 16
3
votes
0 answers
PyTorch Faster R-CNN: image with no bounding boxes raises ValueError (All bounding boxes should have positive height and width)
I'm trying to train Faster R-CNN on a custom dataset.
When I train with an image that contains no objects (an image without bounding boxes), it raises the ValueError 'All bounding boxes should have positive height and width'.
My dataset contains image, target([xmin, ymin,…

Naeun Lee
- 31
- 2
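The error quoted in the question above is about degenerate boxes, so one way to approach it is to check every [xmin, ymin, xmax, ymax] box before it reaches the model. The following is a minimal sketch in plain Python, not the asker's code: the helper names `is_valid_box` and `clean_boxes` are hypothetical. It assumes that an image without objects should end up with an empty box list; recent torchvision releases are reported to accept an empty box tensor of shape (0, 4) for such images, which should be verified against the installed version.

```python
def is_valid_box(box):
    """Return True if [xmin, ymin, xmax, ymax] has positive width and height."""
    xmin, ymin, xmax, ymax = box
    return xmax > xmin and ymax > ymin


def clean_boxes(boxes):
    """Keep only the boxes with positive width and height.

    An image with no objects yields an empty list, which the caller can
    convert to an empty (0, 4) box array instead of a placeholder box.
    """
    return [list(box) for box in boxes if is_valid_box(box)]


annotations = [
    [10, 20, 50, 80],  # valid box
    [30, 30, 30, 60],  # zero width: degenerate
    [0, 0, 0, 0],      # placeholder sometimes used for "no object": degenerate
]
kept = clean_boxes(annotations)
print(kept)
```

Only the first box survives the filter here; the all-zero placeholder is exactly the kind of entry that fails a positive-height-and-width check.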
3
votes
0 answers
Android Image Rectangle Detection with NO OpenCV
I'm baffled that on Android we have to import a 30 MB OpenCV library to detect rectangles in images / video frames. On iOS that is pretty easy using CIDetector.
Has anyone found a solution that isn't OpenCV based? Maybe using Renderscript? I've…

harry248
- 591
- 5
- 8
3
votes
2 answers
Vision framework barcode detection region of interest not working
I am trying to decode barcodes that appear in a region of interest that is 80% of the screen width and 20% of the screen height, centered in both directions (blue rectangle).
The camera pixel buffer is rotated right.
This is what Apple has to…

Duck
- 34,902
- 47
- 248
- 470
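For the region-of-interest question above, the geometry can be sketched independently of the framework. This is a minimal illustration in Python, not Vision code: it assumes normalized coordinates in the unit square with a lower-left origin (the convention Vision documents for regionOfInterest), and the helper names `centered_roi` and `rotate_roi_clockwise` are hypothetical. How the rotated pixel buffer maps onto that square in the asker's setup is not shown in the excerpt, so the rotation step is an assumption.

```python
def centered_roi(width_fraction, height_fraction):
    """Normalized (x, y, width, height) rectangle centered in the unit square."""
    x = (1.0 - width_fraction) / 2.0
    y = (1.0 - height_fraction) / 2.0
    return (x, y, width_fraction, height_fraction)


def rotate_roi_clockwise(roi):
    """Rotate a normalized rectangle by 90 degrees clockwise.

    With a lower-left origin, a point (px, py) maps to (py, 1 - px),
    so the rectangle's width and height swap.
    """
    x, y, w, h = roi
    return (y, 1.0 - x - w, h, w)


screen_roi = centered_roi(0.8, 0.2)              # 80% wide, 20% tall
buffer_roi = rotate_roi_clockwise(screen_roi)    # same region in a rotated buffer
print(screen_roi)
print(buffer_roi)
```

Because the rectangle is centered, rotating in either direction gives the same result: a region 20% wide and 80% tall.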
3
votes
1 answer
How to avoid blurred image capture in Firebase ML Kit’s Face Detection API
I am using Firebase ML Kit’s Face Detection API to detect a face with the device's front camera, as shown in the image (align your face with the outline).
So, whenever any face comes into the face overlay it captures an image on…

Anup Randhe
- 33
- 5
3
votes
1 answer
Camera for Raspberry Pi 4 integrate with OpenCV
I'm running OpenCV for some vehicle detection on a Raspberry Pi 4 Model B.
I purchased an IDS camera: https://en.ids-imaging.com/download-ueye-emb-hardfloat.html
But integrating it into my code proved too much trouble as OpenCV.VideoCapture could…

AsafD
- 31
- 2
3
votes
0 answers
Implementation of edge-enhancing diffusion (EED) with a diffusion tensor
I am currently reading Joachim Weickert's Anisotropic Diffusion in Image Processing. It says that the Perona-Malik filter isn't anisotropic because it doesn't use the structure tensor. I can implement Perona-Malik, but I am having trouble implementing this edge…

Sayed Sohan
- 1,385
- 16
- 23
3
votes
1 answer
Logic behind choosing weight for weighted loss calculation?
What is the general logic behind choosing the weight for calculating weighted sigmoid cross-entropy loss, or for any weighted loss in case of an imbalanced dataset? The problem domain is based on vision/image classification.

Solaiman Salvi
- 577
- 4
- 9
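One common heuristic for the weighting question above is inverse class frequency: weight the positive class by the ratio of negative to positive examples. The sketch below assumes binary labels and a single positive-class weight; the function names are hypothetical, and this is one convention among several, not a definitive rule.

```python
import math


def positive_class_weight(labels):
    """Inverse-frequency weight for the positive class: n_negative / n_positive."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("no positive examples; the weight is undefined")
    return n_neg / n_pos


def weighted_sigmoid_cross_entropy(logit, label, pos_weight):
    """Per-example loss -(w * y * log(p) + (1 - y) * log(1 - p)), with p = sigmoid(logit).

    Written for clarity, not numerical stability: very large |logit|
    values can overflow or hit log(0).
    """
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(pos_weight * label * math.log(p) + (1 - label) * math.log(1.0 - p))


labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # 1 positive, 9 negatives
w = positive_class_weight(labels)
print(w)  # 9.0
print(weighted_sigmoid_cross_entropy(0.0, 1, w))
print(weighted_sigmoid_cross_entropy(0.0, 0, w))
```

With this weight, a misclassified positive costs nine times as much as a misclassified negative, so both classes contribute roughly equally to the total loss over the dataset.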
3
votes
2 answers
How to triangulate a point in 3D space, given its coordinates in 2 images and the extrinsic values of the cameras
I'm trying to write a function that, when given two cameras, their rotation and translation matrices, focal point, and the coordinates of a point for each camera, will be able to triangulate the point into 3D space. Basically, given all the…

Scout721
- 91
- 1
- 3
- 8
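The triangulation described in the question above can be sketched as a linear least-squares problem. This is a minimal pure-Python illustration under stated assumptions, not the asker's function: each camera is a pinhole with intrinsics (f, cx, cy), and R, t map world points to camera coordinates as x_cam = R X + t. The names `project`, `rows_for_view`, `solve3`, and `triangulate` are hypothetical. Each view contributes two linear equations in the unknown point, and the stacked system is solved through its normal equations.

```python
def project(K, R, t, X):
    """Project world point X to pixel coordinates (used here to build test data)."""
    f, cx, cy = K
    xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    return (f * xc[0] / xc[2] + cx, f * xc[1] / xc[2] + cy)


def rows_for_view(K, R, t, pixel):
    """Two linear equations (a, b) with a . X = b for one camera view."""
    f, cx, cy = K
    normalized = ((pixel[0] - cx) / f, (pixel[1] - cy) / f)
    rows = []
    for k, n in enumerate(normalized):
        # n * (r3 . X + t3) = r_k . X + t_k, rearranged to a . X = b
        a = [n * R[2][j] - R[k][j] for j in range(3)]
        b = t[k] - n * t[2]
        rows.append((a, b))
    return rows


def solve3(M, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    A = [list(M[i]) + [v[i]] for i in range(3)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            raise ValueError("degenerate camera configuration")
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, 3):
            m = A[r][c] / A[c][c]
            for k in range(c, 4):
                A[r][k] -= m * A[c][k]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (A[i][3] - sum(A[i][j] * x[j] for j in range(i + 1, 3))) / A[i][i]
    return x


def triangulate(views):
    """Least-squares 3D point from a list of (K, R, t, pixel) views."""
    eqs = [eq for view in views for eq in rows_for_view(*view)]
    AtA = [[sum(a[i] * a[j] for a, _ in eqs) for j in range(3)] for i in range(3)]
    Atb = [sum(a[i] * b for a, b in eqs) for i in range(3)]
    return solve3(AtA, Atb)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K = (800.0, 320.0, 240.0)               # assumed focal length and principal point
cam1 = (K, identity, [0.0, 0.0, 0.0])   # camera at the world origin
cam2 = (K, identity, [-1.0, 0.0, 0.0])  # camera centered at x = 1
point = [0.5, 0.2, 5.0]
views = [cam + (project(*cam, point),) for cam in (cam1, cam2)]
estimate = triangulate(views)
print(estimate)
```

The example projects a known point into both cameras and recovers it, which is a quick sanity check of the conventions. With noisy pixel coordinates the same code returns the algebraic least-squares estimate rather than an exact intersection.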
3
votes
0 answers
Miscalculation of the new function "pcsegdist" in MATLAB R2018b
I am testing the new function "pcsegdist" in MATLAB R2018b. However, the result is wrong for "Segment point cloud into clusters based on Euclidean distance".
Example: I tested with 3D data of 1797 points (please see the attached test.txt file). It is…

Thanh Ha
- 139
- 1
- 6
3
votes
0 answers
iOS: Converting CMSampleBuffer to UIImage returns an image with the wrong orientation and proportions
I am trying to process the real-time video from the iPhone camera by using the function in AVCaptureVideoDataOutputSampleBufferDelegate.
The video is edited, but its orientation is changed, and the proportion of the video is…

boboboboboboo
- 41
- 1
- 6
3
votes
0 answers
Count number of people in captured image
I'm using the Mobile Vision API to count the number of people in a captured image. I followed the references here and here. The following is the code snippet I'm using.
FaceDetector detector = new FaceDetector.Builder(getApplicationContext())
…

Srikanth
- 1,555
- 12
- 20
3
votes
2 answers
Scan QRcode with inverted colors using Vision API
After struggling for a few hours to make my app detect this QR code:
I realized that the problem was in the QR code's appearance. After inverting the colors, the detection worked perfectly.
Is there a way to make Vision API detect the first…

Louis
- 1,913
- 2
- 28
- 41
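The color inversion mentioned in the question above can be sketched as a preprocessing step applied before an image is handed to a detector. This is a minimal illustration in Python, assuming an 8-bit grayscale image stored as nested lists; the function name `invert_grayscale` is hypothetical, and the step that passes the result to the detector is not shown because the excerpt does not include the asker's code.

```python
def invert_grayscale(image, max_value=255):
    """Return a copy of a grayscale image with every pixel value inverted."""
    return [[max_value - pixel for pixel in row] for row in image]


# A tiny 2x2 "image": light modules on a dark background.
inverted_code = [
    [0, 255],
    [128, 64],
]
normal_code = invert_grayscale(inverted_code)
print(normal_code)
```

Running detection on both the original and the inverted frame is one possible way to handle codes of either polarity, at the cost of a second detection pass.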