
Strange values after converting 3D joint positions from world to camera coordinates #29

GianMassimiani opened this issue Apr 8, 2019 · 2 comments

GianMassimiani commented Apr 8, 2019

Hi, thanks for the great dataset! I used some of the code you provided (e.g. for retrieving the camera extrinsic matrix) to convert 3D joint positions from world to camera coordinates. However, the values of joint positions in camera coordinates seem a bit strange to me. Here is the code:

import numpy as np
import scipy.io

def get_extrinsic_matrix(T):
	# Return the extrinsic camera matrix for SURREAL images.
	# Script based on:
	# https://blender.stackexchange.com/questions/38009/3x4-camera-matrix-from-blender-camera
	# Take the first 3 columns of matrix_world in Blender and transpose.
	# This is hard-coded since all SURREAL images use the same camera rotation:
	R_world2bcam = np.array([[0, 0, 1], [0, -1, 0], [-1, 0, 0]]).transpose()
	# *cam_ob.matrix_world = Matrix(((0., 0., 1, params['camera_distance']),
	#                               (0., -1, 0., -1.0),
	#                               (-1., 0., 0., 0.),
	#                               (0.0, 0.0, 0.0, 1.0)))

	# Convert camera location to translation vector used 
	# in coordinate changes
	T_world2bcam = -1 * np.dot(R_world2bcam, T)

	# Following is needed to convert Blender camera to 
	# computer vision camera
	R_bcam2cv = np.array([[1, 0, 0], [0, -1, 0], [0, 0, -1]])

	# Build the coordinate transform matrix from world to 
	# computer vision camera
	R_world2cv = np.dot(R_bcam2cv, R_world2bcam)
	T_world2cv = np.dot(R_bcam2cv, T_world2bcam)

	# Put into 3x4 matrix
	RT = np.concatenate([R_world2cv, T_world2cv], axis=1)
	return RT, R_world2cv, T_world2cv

def world_to_camera(RT, p_w):
	# Convert a set of n points from world coordinates (p_w)
	# to camera coordinates (p_c).
	# Args:
	# RT  (numpy array): 3x4 camera extrinsic matrix [R|T], shape (3, 4)
	# p_w (numpy array): points in world coordinates, shape (n, 3, 1)
	n_points = p_w.shape[0]
	ones = np.ones([n_points, 1, 1])
	p_w = np.concatenate((p_w, ones), axis=1)  # homogeneous coordinates, shape (n, 4, 1)

	p_c = np.dot(RT, p_w[0])
	for p in p_w[1:]:
		p_c = np.concatenate((p_c, np.dot(RT, p)))

	return p_c.reshape(n_points, 3, 1)

# Read annotation file
mat = scipy.io.loadmat("./01_06_c0001_info.mat")

# Get camera position in world coordinates
camera_pos = mat['camLoc']

# Get camera extrinsic matrix
extrinsic, _, _ = get_extrinsic_matrix(camera_pos)

# Frame number
frame_id = 10

# Get joints positions in world coordinates
joints3d = mat['joints3D'][:, :, frame_id].T  # shape (24, 3)
joints3d = joints3d.reshape(joints3d.shape[0], joints3d.shape[1], 1)  # shape (24, 3, 1)

# Convert 3D joint positions from world to camera coords
joints3d_cam = world_to_camera(extrinsic, joints3d)

# Reshape to restore previous shape, and permute
joints3d_cam = joints3d_cam.reshape(joints3d_cam.shape[0], joints3d_cam.shape[1])  # shape (24, 3)
joints3d_cam = np.moveaxis(joints3d_cam, 0, 1)  # shape (3, 24)

# Swap Y and Z axes
joints3d_cam[[0,1,2]] = joints3d_cam[[0,2,1]]

When plotting the joints in 3D I get strange depth values (see the Y axis in the figure below). For example, in the image below the subject appears very close to the camera, but its position along the Y axis (computed with the code above) is about 6 meters, which seems unrealistic to me:
[screenshot, 2019-04-08: 3D plot of the converted joints]
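
As a quick sanity check (a minimal sketch, reusing get_extrinsic_matrix and the same file and frame as above; the variable names here are my own): since the extrinsic is a rigid transform, each joint's distance from the camera should be the same whether it is measured in world coordinates against camLoc or as the norm of the point in camera coordinates:

import numpy as np
import scipy.io

mat = scipy.io.loadmat("./01_06_c0001_info.mat")
cam_loc = mat['camLoc']                       # (3, 1) camera position in world coords
joints_w = mat['joints3D'][:, :, 10].T        # (24, 3) joints in world coords

# R, T map world coordinates to the computer-vision camera frame
_, R, T = get_extrinsic_matrix(cam_loc)
joints_c = (np.dot(R, joints_w.T) + T).T      # (24, 3) joints in camera coords

dist_world = np.linalg.norm(joints_w - cam_loc.reshape(1, 3), axis=1)
dist_cam = np.linalg.norm(joints_c, axis=1)
print(np.abs(dist_world - dist_cam).max())    # should be ~0 for a rigid transform

If that distance is also around 6 m, then at least the depth I plot is consistent with how far the camera actually is from the body in world coordinates.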

Do you have any idea why this is happening? Thanks

GianMassimiani commented Apr 10, 2019

@gulvarol Hi Gül, I am still confused about the values of the 3D joint data:

  1. There seems to be a left/right inconsistency: e.g. the joint with index 4, which should be the left knee, appears to be the right knee instead, so a left/right swap is needed when plotting the 3D joints.

  2. In this answer you say that the reference point for the 3D joint data is the camera location. This would mean there is no need to multiply the joint positions by the camera extrinsic matrix (as I did in the code above). However, when I plot the 3D joint data exactly as stored in the _info.mat files, they do not seem to be expressed in camera coordinates. See these sample images, and the comparison sketched after this list:
    [sample images test1–test4: 3D joints plotted as stored in the _info.mat file]
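
To make the two readings of that answer concrete, here is a minimal sketch of the comparison I mean (it reuses get_extrinsic_matrix from above; the interpretation labels and variable names are my own):

import numpy as np
import scipy.io

mat = scipy.io.loadmat("./01_06_c0001_info.mat")
joints_raw = mat['joints3D'][:, :, 10].T      # (24, 3) exactly as stored in the file

_, R, T = get_extrinsic_matrix(mat['camLoc'])

# (a) joints3D are plain world coordinates -> apply the full extrinsic [R|T]
joints_cam_a = (np.dot(R, joints_raw.T) + T).T

# (b) joints3D are already translated to the camera location (but still in
#     world axis orientation) -> only the rotation into the camera frame remains
joints_cam_b = np.dot(R, joints_raw.T).T

print(joints_cam_a[0], joints_cam_b[0])       # e.g. compare the pelvis joint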

anas-zafar commented

@GianMassimiani were you able to solve the problem?
