Update README.md
README.md
@@ -86,7 +86,7 @@ post_processed_output = image_processor.post_process_depth_estimation(
 field_of_view = post_processed_output[0]["field_of_view"]
 focal_length = post_processed_output[0]["focal_length"]
 depth = post_processed_output[0]["predicted_depth"]
-depth = (depth - depth.min()) / depth.max()
+depth = (depth - depth.min()) / (depth.max() - depth.min())
 depth = depth * 255.
 depth = depth.detach().cpu().numpy()
 depth = Image.fromarray(depth.astype("uint8"))
@@ -131,7 +131,7 @@ The `DepthProEncoder` further uses two encoders:
 - `image_encoder`
 - Input image is also rescaled to `patch_size` and processed by the **`image_encoder`**
 
-Both these encoders can be configured via `patch_model_config` and `image_model_config` respectively, both of which are
+Both these encoders can be configured via `patch_model_config` and `image_model_config` respectively, both of which are separate `Dinov2Model` by default.
 
 Outputs from both encoders (`last_hidden_state`) and selected intermediate states (`hidden_states`) from **`patch_encoder`** are fused by a `DPT`-based `FeatureFusionStage` for depth estimation.
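The first hunk fixes the normalization: dividing by `depth.max()` alone only maps the range onto [0, 1] when `depth.min()` is zero, which a real depth map rarely satisfies. A minimal sketch with a dummy tensor (standing in for the model's `predicted_depth`, not actual output) showing the difference:

```python
import torch

# Dummy depth map; values span [2.0, 10.0], so min() is nonzero.
depth = torch.tensor([[2.0, 6.0], [8.0, 10.0]])

# Old formula: (x - min) / max. The top of the range lands at
# (10 - 2) / 10 = 0.8, so the image never reaches full brightness.
old = (depth - depth.min()) / depth.max()

# Fixed formula: full min-max normalization, guaranteed to hit [0, 1].
new = (depth - depth.min()) / (depth.max() - depth.min())

print(old.max().item())                      # 0.8
print(new.min().item(), new.max().item())    # 0.0 1.0
```

After scaling by 255 and casting to `uint8`, only the fixed formula uses the full grayscale range.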