optimizing hyperparameters for blueCarbon model
- 1_Pooling/config.json +2 -1
- README.md +28 -133
- config.json +1 -1
- config_sentence_transformers.json +3 -1
- config_setfit.json +2 -2
- model.safetensors +1 -1
- model_head.pkl +2 -2
- special_tokens_map.json +2 -2
1_Pooling/config.json
CHANGED
@@ -5,5 +5,6 @@
   "pooling_mode_max_tokens": false,
   "pooling_mode_mean_sqrt_len_tokens": false,
   "pooling_mode_weightedmean_tokens": false,
-  "pooling_mode_lasttoken": false
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
 }
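The added `include_prompt` flag belongs to the Pooling module configuration that sentence-transformers reads when the model body is loaded. As a minimal sketch (not part of this commit, and assuming a sentence-transformers release recent enough to recognize the flag, roughly >= 2.4), the updated pooling config can be inspected like this:

```python
# Sketch only: inspect the Pooling config of the model body after this commit.
# Assumes sentence-transformers >= 2.4, where "include_prompt" is recognized.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ignaciosg/blueCarbon")
pooling = model[1]  # module 0 is the Transformer, module 1 is the Pooling layer
print(pooling.get_config_dict())  # pooling_mode_* flags plus include_prompt
```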
README.md
CHANGED
@@ -7,112 +7,29 @@ tags:
 - generated_from_setfit_trainer
 metrics:
 - accuracy
-widget:
-- text: interest in third generation biomass such as macroalgae has increased due
-  to their high biomass yield, absence of lignin in their tissues, lower competition
-  for land and fresh water, no fertilization requirements, and efficient co2 capture
-  in coastal ecosystems. however, several challenges still exist in the development
-  of cost effective technologies for processing large amounts of macroalgae. recently,
-  genetically modified micro organisms able to convert brown macroalgae carbohydrates
-  into bioethanol were developed, but still no attempt to scale up production has
-  been proposed. based on giant kelp farming and bioethanol production program carried
-  out in chile, we were able to test and adapt this technology as first attempt
-  to scale up this process using 75 fermentation of genetically modified escherichia
-  coli. laboratory fermentation tests results showed that although biomass growth
-  and yield are not greatly affected by the alginate mannitol ratio, ethanol yield
-  showed clear maximum around alginate mannitol ratio. in . pyrifera, much greater
-  proportion of alginate and lower mannitol abundance is found. in order to make
-  the most of the carbohydrates available for fermentation, we developed four stage
-  process model for scaling up, including acid leaching, depolymerization, saccharification,
-  and fermentation steps. using this process, we obtained .213 kg ethanol kg dry
-  macroalgae, equivalent to . of ethanol hectare year, reaching 64 of the maximum
-  theoretical ethanol yield. we propose strategies to increase this yield, including
-  synthetic biology pathway engineering approaches and process optimization targets.
-  2016 society of chemical industry and john wiley sons, ltd
-- text: producing concrete that incorporates carbon dioxide into the mix is leveraged
-  to reduce the carbon footprint and produce more sustainable concrete. as the concrete
-  dries, the co2 is mineralized and permanently incorporated into the early carbonation.
-  experimental work has been conducted, and hundreds of specimens with varying ratios
-  of co2 to binder content were cast. co2 to binder ratios of were used to test
-  concrete in workability , mechanical properties , and durability performance .
-  the chemical tests were also conducted to identify the changes in hardened concrete
-  composition for the three mixes . all specimens were field cured and exposed to
-  the coastal environment of ras al khair industrial city in saudi arabia. the results
-  showed that the co2 to binder ratio of . improved the concrete properties, in
-  particular, the effect was clear with higher slump and comparable strength compared
-  to the standard concrete without co2. however, the co2 to binder ratio of . shows
-  negligible increase in the chloride permeability and the internal chloride ion
-  content compared to the standard concrete without co2, whereas the internal sulfate
-  ion content has not increased for both co2 to binder ratios in comparison with
-  the standard concrete without co2, which indicate no reduction in concrete durability.
-  2023 isec press.
-- text: mangroves are ecosystems made up of trees or shrubs that develop in the intertidal
-  zone and provide many vital environmental services for livelihoods in coastal
-  areas. they are habitat for the reproduction of several marine species. they afford
-  protection from hurricanes, tides, sea level rise and prevent the erosion of the
-  coasts. just one hectare of mangrove forest can hold up to ,000 tons of carbon
-  dioxide, more than tropical forests and jungles. mexico is one of the countries
-  with the greatest abundance of mangroves in the world, with more than 700,000
-  ha. blue carbon can be novel mechanism for promoting communication and cooperation
-  between the investor, the government, the users, and beneficiaries of the environmental
-  services of these ecosystems, creating public private social partnerships through
-  mechanisms such as payment for environmental services, credits, or the voluntary
-  carbon market. this chapter explores the possibilities of incorporating blue carbon
-  in emissions markets. we explore the huge potential of mexico blue carbon to sequester
-  co2. then we analyse the new market instrument that allows countries to sell or
-  transfer mitigation results internationally the sustainable development mechanism
-  , established in the paris agreement. secondly, we present the progress of the
-  commission for environmental cooperation to standardize the methodologies to assess
-  their stock and determine the magnitude of the blue carbon sinks. thirdly, as
-  an opportunity for mexico, the collaboration with the california cap and trade
-  program is analysed. we conclude that blue carbon is very important mitigation
-  tool to be included in the compensation schemes on regional and global levels.
-  additionally, mangrove protection is an excellent example of the mitigation adaptation
-  sustainable development relationship, as well as fostering of governance by the
-  inclusion of the coastal communities in decision making and incomes. 2022, the
-  author.
-- text: featured application the findings obtained from this study have implications
-  for global blue carbon budgeting. abstract field monitoring and incubation experiments
-  were conducted to evaluate the litter yield and examine the decomposition of the
-  litter of three representative mangrove species frequently used for mangrove re
-  vegetation in subtropical mudflat on the south china coast. the results show that
-  the litter yield of the investigated mangrove species varied significantly from
-  season to season. the annual litter production was in the following decreasing
-  order heritiera littoralis thespesia populnea kandelia obovata. initially, rapid
-  decomposition of easily degradable components of the litter materials resulted
-  in marked weight loss of the mangrove litter. there was good linear relationship
-  between the length of field incubation time and the litter decomposition rate
-  for both the branch and the leaf portion of the three investigated mangrove species.
-  approximately 50 or more of the added mangrove litter could be decomposed within
-  one year and the decomposed litter could be incorporated into the underlying soils
-  and consequently affect the soil carbon dynamics. an annual soil carbon increase
-  from .37 to .64 kg in the top cm of the soil was recorded for the investigated
-  mangrove species.
-- text: seagrasses provide multitude of ecosystem services and serve as important
-  organic carbon stores. however, seagrass habitats are declining worldwide, threatened
-  by global climate change and regional shifts in water quality. acoustical methods
-  have been applied to assess changes in oxygen production of seagrass meadows since
-  sound propagation is sensitive to the presence of bubbles, which exist both within
-  the plant tissue and freely floating the water as byproducts of photosynthesis.
-  this work applies acoustic remote sensing techniques to characterize two different
-  regions of seagrass meadow densely vegetated meadow of thalassia testudinum and
-  sandy region sparsely populated by isolated stands of . testudinum. bayesian approach
-  is applied to estimate the posterior probability distributions of the unknown
-  model parameters. the sensitivity of sound to the void fraction of gas present
-  in the seagrass meadow was established by the narrow marginal probability distributions
-  that provided distinct estimates of the void fraction between the two sites. the
-  absolute values of the estimated void fractions are biased by limitations in the
-  forward model, which does not capture the full complexity of the seagrass environment.
-  nevertheless, the results demonstrate the potential use of acoustical methods
-  to remotely sense seagrass health and density.
+widget: []
 pipeline_tag: text-classification
 inference: false
 base_model: sentence-transformers/paraphrase-mpnet-base-v2
+model-index:
+- name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
+  results:
+  - task:
+      type: text-classification
+      name: Text Classification
+    dataset:
+      name: Unknown
+      type: unknown
+      split: test
+    metrics:
+    - type: accuracy
+      value: 0.1502397442727757
+      name: Accuracy
 ---
 
 # SetFit with sentence-transformers/paraphrase-mpnet-base-v2
 
-This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A
+This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A OneVsRestClassifier instance is used for classification.
 
 The model has been trained using an efficient few-shot learning technique that involves:
 
@@ -124,7 +41,7 @@ The model has been trained using an efficient few-shot learning technique that i
 ### Model Description
 - **Model Type:** SetFit
 - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
-- **Classification head:** a
+- **Classification head:** a OneVsRestClassifier instance
 - **Maximum Sequence Length:** 512 tokens
 <!-- - **Number of Classes:** Unknown -->
 <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
@@ -137,6 +54,13 @@ The model has been trained using an efficient few-shot learning technique that i
 - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
 - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
 
+## Evaluation
+
+### Metrics
+| Label   | Accuracy |
+|:--------|:---------|
+| **all** | 0.1502   |
+
 ## Uses
 
 ### Direct Use for Inference
@@ -155,7 +79,7 @@ from setfit import SetFitModel
 # Download from the 🤗 Hub
 model = SetFitModel.from_pretrained("ignaciosg/blueCarbon")
 # Run inference
-preds = model("
+preds = model("I loved the spiderman movie!")
 ```
 
 <!--
@@ -184,42 +108,13 @@ preds = model("featured application the findings obtained from this study have i
 
 ## Training Details
 
-### Training Set Metrics
-| Training set | Min | Median   | Max |
-|:-------------|:----|:---------|:----|
-| Word count   | 80  | 236.0127 | 453 |
-
-### Training Hyperparameters
-- batch_size: (1, 1)
-- num_epochs: (1, 1)
-- max_steps: 1
-- sampling_strategy: oversampling
-- num_iterations: 1
-- body_learning_rate: (2e-05, 1e-05)
-- head_learning_rate: 0.01
-- loss: CosineSimilarityLoss
-- distance_metric: cosine_distance
-- margin: 0.25
-- end_to_end: False
-- use_amp: False
-- warmup_proportion: 0.1
-- max_length: 750
-- seed: 42
-- eval_max_steps: 1
-- load_best_model_at_end: False
-
-### Training Results
-| Epoch  | Step | Training Loss | Validation Loss |
-|:------:|:----:|:-------------:|:---------------:|
-| 0.0001 | 1    | 0.2289        | -               |
-
 ### Framework Versions
 - Python: 3.10.12
 - SetFit: 1.0.3
-- Sentence Transformers: 2.
-- Transformers: 4.
+- Sentence Transformers: 2.5.1
+- Transformers: 4.38.1
 - PyTorch: 2.1.0+cu121
-- Datasets: 2.
+- Datasets: 2.18.0
 - Tokenizers: 0.15.2
 
 ## Citation
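The training hyperparameters removed from the old README (batch_size, num_epochs, body/head learning rates, sampling strategy, and so on) are the values SetFit exposes through `TrainingArguments`. Below is a minimal sketch of how such a configuration would be reproduced with the setfit >= 1.0 API; the tiny inline dataset is a hypothetical placeholder, not the data behind this model:

```python
# Sketch only, not part of this commit: the hyperparameters listed in the old
# README expressed as setfit.TrainingArguments (setfit >= 1.0 API).
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical placeholder data; the real training set is not published here.
train_dataset = Dataset.from_dict({
    "text": [
        "mangrove soils store large amounts of blue carbon",
        "seagrass meadows sequester organic carbon in sediments",
        "concrete production is a major source of co2 emissions",
        "cement carbonation partially re-absorbs process emissions",
    ],
    "label": [1, 1, 0, 0],
})

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

args = TrainingArguments(
    batch_size=(1, 1),                  # (embedding phase, classifier phase)
    num_epochs=(1, 1),
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    sampling_strategy="oversampling",
    max_length=750,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```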
config.json
CHANGED
@@ -19,6 +19,6 @@
   "pad_token_id": 1,
   "relative_attention_num_buckets": 32,
   "torch_dtype": "float32",
-  "transformers_version": "4.
+  "transformers_version": "4.38.1",
   "vocab_size": 30527
 }
config_sentence_transformers.json
CHANGED
@@ -3,5 +3,7 @@
   "sentence_transformers": "2.0.0",
   "transformers": "4.7.0",
   "pytorch": "1.9.0+cu102"
-  }
+  },
+  "prompts": {},
+  "default_prompt_name": null
 }
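The new `prompts` and `default_prompt_name` keys are read by recent sentence-transformers releases, which can prepend a named prompt to inputs at encode time. The shipped config leaves them empty/null; the sketch below only illustrates the mechanism with a hypothetical prompt (assumes sentence-transformers >= 2.4):

```python
# Sketch only: how the "prompts" / "default_prompt_name" keys are consumed.
# The prompt text here is hypothetical; this commit ships an empty prompts dict.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "ignaciosg/blueCarbon",
    prompts={"classification": "Classify this abstract: "},
    default_prompt_name="classification",
)
embeddings = model.encode(["mangroves store large amounts of blue carbon"])
print(embeddings.shape)
```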
config_setfit.json
CHANGED
@@ -1,4 +1,4 @@
 {
-  "
-  "
+  "labels": null,
+  "normalize_embeddings": false
 }
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8cb0abee1b3ccaf4776107b24d1b77598c93bc13a12d9e6dea65fe9b1657c963
 size 437967672
model_head.pkl
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:90f1d4020f711d7b81d1c955d4e7c42e73ded0c0c3ffc9a679832d8e9e4205bf
+size 195396
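`model_head.pkl` stores the pickled classification head (a OneVsRestClassifier, per the updated README). A minimal sketch for checking that the new head loads together with the body; this is an illustration, not part of the commit:

```python
# Sketch only: verify the updated classification head after downloading.
from setfit import SetFitModel

model = SetFitModel.from_pretrained("ignaciosg/blueCarbon")
print(type(model.model_head))  # per the README, an sklearn OneVsRestClassifier
preds = model(["mangrove forests are efficient carbon sinks"])
print(preds)
```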
special_tokens_map.json
CHANGED
@@ -9,7 +9,7 @@
   "cls_token": {
     "content": "<s>",
     "lstrip": false,
-    "normalized":
+    "normalized": false,
     "rstrip": false,
     "single_word": false
   },
@@ -37,7 +37,7 @@
   "sep_token": {
     "content": "</s>",
     "lstrip": false,
-    "normalized":
+    "normalized": false,
     "rstrip": false,
     "single_word": false
   },