How to use GEODE/bert-base-multilingual-cased-edda-domain-classification with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="GEODE/bert-base-multilingual-cased-edda-domain-classification")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("GEODE/bert-base-multilingual-cased-edda-domain-classification")
model = AutoModelForSequenceClassification.from_pretrained("GEODE/bert-base-multilingual-cased-edda-domain-classification")
```
bert-base-multilingual-cased-edda-domain-classification
This model classifies encyclopedia articles into knowledge domains (e.g., History, Geography, Medicine). It is a fine-tuned version of the bert-base-multilingual-cased model, trained on the French Encyclopédie ou dictionnaire raisonné des sciences des arts et des métiers par une société de gens de lettres (1751-1772), edited by Diderot and d'Alembert (text provided by the ARTFL Encyclopédie Project).
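As a rough illustration of the classification task described above, the sketch below ranks the most likely domains for a single article. The helper name `top_domains`, the value of `k`, and the sample sentence are assumptions made for this example; they do not come from the GEODE repository. The helper passes `truncation=True` because encyclopedia articles can exceed the input length BERT models accept.

```python
from typing import Callable, List, Tuple

MODEL_ID = "GEODE/bert-base-multilingual-cased-edda-domain-classification"


def top_domains(classifier: Callable, text: str, k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (label, score) pairs for one article.

    `classifier` is expected to behave like a Transformers
    text-classification pipeline (hypothetical helper, not part of the model).
    """
    # truncation=True cuts inputs that are longer than the model's maximum length.
    result = classifier(text, top_k=k, truncation=True)
    # Some transformers versions wrap single-input results in an extra list.
    if result and isinstance(result[0], list):
        result = result[0]
    ranked = sorted(result, key=lambda r: r["score"], reverse=True)
    return [(r["label"], float(r["score"])) for r in ranked[:k]]


def main() -> None:
    # Imported here so the helper above can be reused without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-classification", model=MODEL_ID)
    # Invented sentence written in the style of an encyclopedia entry.
    article = "Petite ville de France, dans le Languedoc, sur la rivière d'Aude."
    for label, score in top_domains(pipe, article):
        print(f"{label}: {score:.3f}")
```

Calling `main()` downloads the model on first use and prints one line per predicted domain. The label names that appear depend on the configuration shipped with the model.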
Model Description
- Developed by: Alice Brenon, Ludovic Moncla, Katherine McDonough, and Khaled Chabane in the framework of the GEODE project.
- Model type: Text classification
- Repository: https://gitlab.liris.cnrs.fr/geode/EDdA-Classification/
- Language(s) (NLP): French
- License: cc-by-nc-4.0
Class labels
%TODO
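The label set can also be read from the model configuration. The sketch below assumes the uploaded configuration defines a meaningful `id2label` mapping; if it does not, Transformers falls back to generic names such as `LABEL_0`. The helper names `list_labels` and `load_labels` are hypothetical.

```python
from typing import Dict, List

MODEL_ID = "GEODE/bert-base-multilingual-cased-edda-domain-classification"


def list_labels(id2label: Dict[int, str]) -> List[str]:
    """Return label names ordered by class index."""
    return [id2label[i] for i in sorted(id2label)]


def load_labels(model_id: str = MODEL_ID) -> List[str]:
    """Fetch the model configuration and list its class labels."""
    # Imported here so list_labels can be used without transformers installed.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(model_id)
    return list_labels(config.id2label)
```

Calling `load_labels()` fetches only the configuration file, not the model weights.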
Bias, Risks, and Limitations
This model was trained exclusively on French encyclopedia entries from the eighteenth century, so it will likely perform poorly on text in other languages or from other corpora.
Cite this work
Brenon, A., Moncla, L., & McDonough, K. (2022). Classifying encyclopedia articles: Comparing machine and deep learning methods and exploring their predictions. Data & Knowledge Engineering, 142, 102098.
Acknowledgement
The authors are grateful to the ASLAN project (ANR-10-LABX-0081) of the Université de Lyon for its financial support within the French program "Investments for the Future" operated by the National Research Agency (ANR). Data courtesy of the ARTFL Encyclopédie Project, University of Chicago.