- The first step in STriPNet is to run Topic Modeling using the BERTopic library.
- BERTopic internally uses Sentence Transformer models to convert text to embeddings, then clusters the embeddings and extracts keywords from each cluster.
- Specifically, since STriPNet is intended to be used with scientific papers, we're using the SPECTER pretrained sentence transformers model by Allen AI.
- The `Minimum topic size` and `N-gram range` parameters control the clustering and keyword extraction of BERTopic, respectively. Hover over the tooltip of each parameter for more information. STriPNet internally chooses heuristically tuned defaults depending on the data you've uploaded. Feel free to play around with the parameters until you get good topics.
- You can visualize the quality of the topic modeling in several ways using the `Select Topic Modeling Visualization` dropdown menu.
- Finally, please note that BERTopic results change with every run, so the extracted topics might change every time you run STriPNet, even on the same data with the same settings. If your topics look weird, a simple page refresh or a $+1$ increase to `Minimum topic size` might fix it!
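
For readers who want to try the same kind of setup outside the app, the steps above can be sketched roughly as follows. This is a minimal illustration and not STriPNet's actual code: the helper names `bertopic_kwargs` and `build_topic_model`, the validation rules, and the optional `seed` argument are assumptions made for this example. The defaults shown are BERTopic's own, not STriPNet's heuristically tuned values, and `allenai-specter` is the name under which the SPECTER model is published for the sentence-transformers library.

```python
def bertopic_kwargs(min_topic_size=10, n_gram_range=(1, 1)):
    """Validate the two user-facing settings and return BERTopic keyword arguments.

    Hypothetical helper for illustration; the defaults are BERTopic's own.
    """
    low, high = n_gram_range
    if min_topic_size < 2:
        # The underlying clustering step needs at least 2 documents per cluster.
        raise ValueError("min_topic_size must be at least 2")
    if not 1 <= low <= high:
        raise ValueError("n_gram_range must satisfy 1 <= low <= high")
    return {"min_topic_size": min_topic_size, "n_gram_range": (low, high)}


def build_topic_model(min_topic_size=10, n_gram_range=(1, 1), seed=None):
    """Build a BERTopic model that embeds documents with SPECTER.

    Hypothetical helper for illustration, not STriPNet's implementation.
    """
    # Heavy dependencies are imported lazily so that bertopic_kwargs can be
    # used without them installed.
    from bertopic import BERTopic
    from sentence_transformers import SentenceTransformer
    from umap import UMAP

    # SPECTER sentence-transformers model by Allen AI.
    embedding_model = SentenceTransformer("allenai-specter")

    # These UMAP settings mirror BERTopic's defaults. Passing an integer seed
    # fixes random_state, which makes repeated runs give the same topics;
    # leaving it as None keeps the run-to-run variation described above.
    umap_model = UMAP(
        n_neighbors=15,
        n_components=5,
        min_dist=0.0,
        metric="cosine",
        random_state=seed,
    )

    return BERTopic(
        embedding_model=embedding_model,
        umap_model=umap_model,
        **bertopic_kwargs(min_topic_size, n_gram_range),
    )
```

With a list of abstracts in a variable such as `docs`, calling `build_topic_model(min_topic_size=11, n_gram_range=(1, 2)).fit_transform(docs)` would return the topic assignment for each document, and the fitted model's `get_topic_info()` and `visualize_barchart()` methods give a quick look at the resulting topics and their keywords.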