handaber committed on
Commit 02881e5 · verified · 1 Parent(s): ef37d3b

Update README.md

  1. README.md +35 -0
README.md CHANGED
@@ -1,3 +1,38 @@
+ # ImageBind Models (in `./checkpoints`):
+ - imagebind_huge.pth
+ - model.safetensors
+ - OpenVINO Intermediate Representation Models:
+   - Text
+   - Vision
+   - Audio
+   - Thermal
+   - Depth
+   - [ ] TODO: IMU
+   - [ ] TODO: Video
+
+ ### Updated training assets in `.assets`; thermal and depth images must be converted to single-channel grayscale
+ ```py
+ import torch
+ import torchvision.transforms as transforms
+ from PIL import Image
+
+ # Assumed import paths (upstream ImageBind package layout); adjust if this repo differs
+ from imagebind import data
+ from imagebind.models.imagebind_model import ModalityType
+
+ # Convert RGB images to single-channel 224x224 tensors
+ to_single_channel = transforms.Compose([
+     transforms.Grayscale(num_output_channels=1),
+     transforms.Resize((224, 224)),
+     transforms.ToTensor(),
+ ])
+
+ # texts, the *_paths lists, and device are expected to be defined beforehand
+ inputs = {
+     ModalityType.TEXT: data.load_and_transform_text(texts, device),
+     ModalityType.VISION: data.load_and_transform_vision_data(image_paths, device),
+     ModalityType.AUDIO: data.load_and_transform_audio_data(audio_paths, device),
+     # Depth and thermal inputs are stacked into (N, 1, 224, 224) batches
+     ModalityType.DEPTH: torch.stack([to_single_channel(Image.open(path)) for path in depth_paths]).to(device),
+     ModalityType.THERMAL: torch.stack([to_single_channel(Image.open(path)) for path in thermal_paths]).to(device),
+ }
+ ...
+ ```
+
+ # === Original: ===
+
  # ImageBind: One Embedding Space To Bind Them All
 
  **[FAIR, Meta AI](https://ai.facebook.com/research/)**
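
For intuition about the grayscale step in the README snippet above, here is a minimal, dependency-free sketch of roughly what `Grayscale(num_output_channels=1)` followed by `ToTensor()` computes per pixel. It assumes the ITU-R 601-2 luma weights that Pillow documents for its `"L"` mode conversion, ignores Pillow's integer rounding, and leaves out the `Resize` step. The function names and the 2x2 sample image are hypothetical and exist only for illustration; they are not part of ImageBind or torchvision.

```python
# Illustrative only: per-pixel equivalent (up to rounding) of
# Grayscale(num_output_channels=1) + ToTensor() for an RGB input.
# Uses plain Python lists instead of PIL images and torch tensors.


def rgb_to_luma(r, g, b):
    """Weighted sum of 8-bit R, G, B values (assumed ITU-R 601-2 weights)."""
    return (299 * r + 587 * g + 114 * b) / 1000


def to_single_channel(rgb_image):
    """Map an H x W grid of (r, g, b) tuples to a 1 x H x W grid of floats in [0, 1]."""
    channel = [[rgb_to_luma(*pixel) / 255.0 for pixel in row] for row in rgb_image]
    return [channel]  # the leading axis of length 1 is the single channel


# Hypothetical 2x2 image: white, black, pure red, mid grey.
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (128, 128, 128)],
]

tensor_like = to_single_channel(image)
shape = (len(tensor_like), len(tensor_like[0]), len(tensor_like[0][0]))
print(shape)  # (1, 2, 2)
```

The single leading channel axis is the point of the exercise: stacking N such outputs gives an N x 1 x H x W batch, which is the layout the depth and thermal entries in the snippet build with `torch.stack`.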