---
license: mit
language:
- en
library_name: keras
tags:
- code
pipeline_tag: image-classification
---
<h1>PaViT: Pathway Vision Transformer</h1>
<p>PaViT (Pathway Vision Transformer) is an image recognition model developed by Ajibola Emmanuel Oluwaseun. The model is inspired by Google's PaLM (Pathways Language Model) and aims to demonstrate the potential of few-shot learning techniques in image recognition tasks.</p>
<h1>Model Performance</h1>
PaViT was trained on a CPU with 4 GB of RAM using a Kaggle dataset of 15,000 images across 15 classes, achieving 88% accuracy with 4 self-attention heads. Accuracy improved to 96% when the model was trained with 12 self-attention heads and 12 stacked linear layers. These results demonstrate the model's strong performance and fast training speed on a CPU, despite the relatively small dataset.
<br>The uploaded weights were trained on an image dataset of 3 classes (cat, dog, and wild animal).
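<br>
The snippet below is a minimal sketch of the kind of architecture described above (multi-head self-attention followed by stacked linear layers), built with standard Keras layers. The layer sizes, patch size, block depth, and names (`EMBED_DIM`, `transformer_block`) are illustrative assumptions and do not reflect the exact configuration of the released weights.

```python
# Illustrative sketch only: hyperparameters and structure are assumptions,
# not the configuration of the released PaViT weights.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

NUM_HEADS = 12       # self-attention heads, as in the 96%-accuracy configuration
EMBED_DIM = 64       # assumed embedding size
NUM_CLASSES = 15     # the Kaggle training dataset had 15 classes

def transformer_block(x):
    # Multi-head self-attention with a residual connection
    attn = layers.MultiHeadAttention(num_heads=NUM_HEADS, key_dim=EMBED_DIM)(x, x)
    x = layers.LayerNormalization()(x + attn)
    # Linear (Dense) layers stacked on top of the attention output
    ffn = layers.Dense(EMBED_DIM * 2, activation="gelu")(x)
    ffn = layers.Dense(EMBED_DIM)(ffn)
    return layers.LayerNormalization()(x + ffn)

inputs = keras.Input(shape=(224, 224, 3))
# Split the image into 16x16 patches and project each patch to EMBED_DIM
patches = layers.Conv2D(EMBED_DIM, kernel_size=16, strides=16)(inputs)
x = layers.Reshape((-1, EMBED_DIM))(patches)      # (num_patches, EMBED_DIM)
for _ in range(12):                               # assumed block depth
    x = transformer_block(x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = keras.Model(inputs, outputs)
```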
<h1>Usage</h1>
The model can be used for image recognition tasks with the trained weights provided in this repository. The code can be adapted to custom datasets, and the model's performance can be further improved by adding more self-attention heads and linear layers; a fine-tuning sketch follows below.
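
The following is a minimal fine-tuning sketch on a custom dataset, assuming the published checkpoint loads as a standard Keras model via `from_pretrained_keras`. The dataset path, image size, and training hyperparameters are illustrative assumptions; if the custom dataset has a different number of classes than the released weights (3), the final classification layer would need to be replaced first.

```python
# Hedged sketch: fine-tune the pretrained Keras model on a custom dataset.
# Dataset path, image size, and hyperparameters are illustrative assumptions.
import tensorflow as tf
from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras("Ajibola/PaViT")

# Expects a directory with one sub-folder per class, e.g. custom_dataset/cat/...
train_ds = tf.keras.utils.image_dataset_from_directory(
    "path/to/custom_dataset",
    image_size=(224, 224),
    batch_size=32,
)
train_ds = train_ds.map(lambda x, y: (x / 255.0, y))  # scale pixels to [0, 1]

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(train_ds, epochs=5)
```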
<h1>Contribution</h1>
The author believes that PaViT has the potential to outperform existing Vision Transformer models and is eager to see it continue to evolve through community contributions.
<br>
Contributions to the project are welcome and can be made through pull requests. Developers can also report issues or suggest new features for the project.
<h1>License</h1>
<p>This project is licensed under the MIT License.</p>
<h1>How to use:</h1>
```python
# Install dependencies (the "!" prefix is for notebook environments)
!pip install "huggingface_hub[tensorflow]"

# Import libraries
import cv2
import numpy as np
import matplotlib.pyplot as plt
from huggingface_hub import from_pretrained_keras
```
<h1>Inference</h1>
```python
# Load the pretrained model from the Hugging Face Hub
model = from_pretrained_keras('Ajibola/PaViT')

# Load and preprocess the image
image = cv2.imread('image_path')            # replace 'image_path' with your image file
image = cv2.resize(image, (224, 224))       # 224x224 is the default input size
image = image / image.max()                 # normalize pixel values to [0, 1]
image = np.expand_dims(image, axis=0)       # add a batch dimension

prediction = model.predict(image)
prediction = np.argmax(prediction, axis=-1) # index of the highest-probability class
```