How to use emiltj/da_dacy_large_DANSK_ner with spaCy:

    !pip install https://huggingface.co/emiltj/da_dacy_large_DANSK_ner/resolve/main/da_dacy_large_DANSK_ner-any-py3-none-any.whl

    # Using spacy.load().
    import spacy
    nlp = spacy.load("da_dacy_large_DANSK_ner")

    # Importing as module.
    import da_dacy_large_DANSK_ner
    nlp = da_dacy_large_DANSK_ner.load()

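Once the pipeline is loaded, it can be applied to Danish text and the predicted entities read from `doc.ents`. The following is a minimal sketch rather than an official DaCy recipe: the helper names `extract_entities` and `group_by_label` and the example sentence are illustrative assumptions, and the usage lines are commented out because they require the installed wheel.

```python
from collections import defaultdict


def extract_entities(nlp, text):
    """Run a loaded spaCy pipeline on `text`; return (entity text, label) pairs."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


def group_by_label(entities):
    """Group (entity text, label) pairs into a {label: [texts]} dictionary."""
    grouped = defaultdict(list)
    for ent_text, label in entities:
        grouped[label].append(ent_text)
    return dict(grouped)


# Example usage (requires the installed model package):
# import spacy
# nlp = spacy.load("da_dacy_large_DANSK_ner")
# pairs = extract_entities(nlp, "Mette Frederiksen besøgte Aarhus i 2022.")
# print(group_by_label(pairs))
```

Passing the loaded `nlp` object in, instead of loading it inside the helper, avoids reloading the transformer on every call. The labels returned are those listed in the label scheme of this model; the exact predictions depend on the input text.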
DaCy_large_DANSK_ner
DaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analyzing Danish pipelines. At the time of publishing this model, DaCy also includes the only models for fine-grained NER using the DANSK dataset, a dataset containing 18 annotation types in the same format as OntoNotes. Moreover, DaCy's largest pipeline has achieved state-of-the-art performance on named entity recognition, part-of-speech tagging and dependency parsing for Danish on the DaNE dataset. Check out the DaCy repository for material on how to use DaCy and reproduce the results. DaCy also contains guides on usage of the package as well as behavioural tests for biases and robustness of Danish NLP pipelines.
| Feature | Description |
|---|---|
| Name | da_dacy_large_DANSK_ner |
| Version | 0.1.0 |
| spaCy | >=3.5.0,<3.6.0 |
| Default Pipeline | transformer, ner |
| Components | transformer, ner |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | DANSK - Danish Annotations for NLP Specific TasKs; KennethEnevoldsen/dfm-bert-large-v1-2048bsz-1Msteps (Kenneth Enevoldsen) |
| License | apache-2.0 |
| Author | Centre for Humanities Computing Aarhus |
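The spaCy row restricts compatible versions to `>=3.5.0,<3.6.0`. As a rough sketch, the check below compares a version string against that range using only the standard library. The function names are hypothetical, and the code assumes plain three-part `major.minor.patch` strings; pre-release suffixes are not handled.

```python
def parse_version(version):
    """Turn '3.5.4' into (3, 5, 4); assumes purely numeric dotted parts."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(version, lower="3.5.0", upper="3.6.0"):
    """Return True if lower <= version < upper (bounds from the table above)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Example usage with an installed spaCy (uncomment if spaCy is available):
# import spacy
# print(is_compatible(spacy.__version__))
```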
Label Scheme
18 labels for 1 component:
| Component | Labels |
|---|---|
| ner | CARDINAL, DATE, EVENT, FACILITY, GPE, LANGUAGE, LAW, LOCATION, MONEY, NORP, ORDINAL, ORGANIZATION, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK OF ART |
Accuracy
| Type | Score |
|---|---|
| ENTS_F | 81.51 |
| ENTS_P | 81.00 |
| ENTS_R | 82.03 |
| TRANSFORMER_LOSS | 63375.61 |
| NER_LOSS | 158164.20 |
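ENTS_F is the harmonic mean of ENTS_P (precision) and ENTS_R (recall). As a quick sanity check, the sketch below recomputes it from the reported precision and recall; the helper name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the Accuracy table (percentages).
ents_f = f1_score(81.00, 82.03)
print(round(ents_f, 2))  # 81.51
```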
Performance tables
The table below shows the F1, recall and precision of the three DaCy fine-grained models.
| Score | DaCy large | DaCy medium | DaCy small |
|---|---|---|---|
| F1 | 0.823 | 0.806 | 0.776 |
| Recall | 0.834 | 0.818 | 0.77 |
| Precision | 0.813 | 0.794 | 0.781 |
The table below shows the F1 of the three DaCy fine-grained models within each named entity type.
| Named-entity type | DaCy large | DaCy medium | DaCy small |
|---|---|---|---|
| CARDINAL | 0.874 | 0.781 | 0.887 |
| DATE | 0.846 | 0.859 | 0.867 |
| EVENT | 0.611 | 0.571 | 0.4 |
| FACILITY | 0.545 | 0.533 | 0.468 |
| GPE | 0.893 | 0.838 | 0.794 |
| LANGUAGE | 0.902 | 0.486 | 0.194 |
| LAW | 0.686 | 0.625 | 0.606 |
| LOCATION | 0.633 | 0.737 | 0.581 |
| MONEY | 0.993 | 1 | 0.947 |
| NORP | 0.78 | 0.887 | 0.785 |
| ORDINAL | 0.696 | 0.7 | 0.727 |
| ORGANIZATION | 0.863 | 0.851 | 0.781 |
| PERCENT | 0.923 | 0.96 | 0.96 |
| PERSON | 0.871 | 0.872 | 0.833 |
| PRODUCT | 0.671 | 0.635 | 0.526 |
| QUANTITY | 0.386 | 0.654 | 0.708 |
| TIME | 0.643 | 0.571 | 0.71 |
| WORK OF ART | 0.494 | 0.639 | 0.488 |
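No single model size is best on every entity type. The sketch below copies the F1 values from the table above into a dictionary and lists which model scores highest for each type, keeping ties. The variable and function names are illustrative.

```python
# F1 per entity type, copied from the table above: (large, medium, small).
F1_BY_TYPE = {
    "CARDINAL": (0.874, 0.781, 0.887),
    "DATE": (0.846, 0.859, 0.867),
    "EVENT": (0.611, 0.571, 0.4),
    "FACILITY": (0.545, 0.533, 0.468),
    "GPE": (0.893, 0.838, 0.794),
    "LANGUAGE": (0.902, 0.486, 0.194),
    "LAW": (0.686, 0.625, 0.606),
    "LOCATION": (0.633, 0.737, 0.581),
    "MONEY": (0.993, 1, 0.947),
    "NORP": (0.78, 0.887, 0.785),
    "ORDINAL": (0.696, 0.7, 0.727),
    "ORGANIZATION": (0.863, 0.851, 0.781),
    "PERCENT": (0.923, 0.96, 0.96),
    "PERSON": (0.871, 0.872, 0.833),
    "PRODUCT": (0.671, 0.635, 0.526),
    "QUANTITY": (0.386, 0.654, 0.708),
    "TIME": (0.643, 0.571, 0.71),
    "WORK OF ART": (0.494, 0.639, 0.488),
}
MODELS = ("large", "medium", "small")


def best_models(scores):
    """Return the model size(s) with the highest F1, keeping ties."""
    top = max(scores)
    return [name for name, score in zip(MODELS, scores) if score == top]


best_by_type = {label: best_models(scores) for label, scores in F1_BY_TYPE.items()}

for label, winners in best_by_type.items():
    print(f"{label}: {', '.join(winners)}")
```

By this count the large model leads on 7 of the 18 types and the medium and small models on 5 each, with PERCENT tied between medium and small. If an application depends on a few specific entity types, the per-type scores may matter more than the overall F1.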
The table below shows the F1 of the three DaCy fine-grained models within each domain of texts in DANSK.
| Domain | DaCy large | DaCy medium | DaCy small |
|---|---|---|---|
| All domains combined | 0.823 | 0.806 | 0.776 |
| Conversation | 0.796 | 0.718 | 0.82 |
| Dannet | 0.75 | 0.667 | 1 |
| Legal | 0.852 | 0.854 | 0.866 |
| News | 0.841 | 0.759 | 0.86 |
| Social Media | 0.793 | 0.847 | 0.8 |
| Web | 0.826 | 0.802 | 0.756 |
| Wiki and Books | 0.778 | 0.838 | 0.709 |
Evaluation results
- NER Precision (self-reported): 0.810
- NER Recall (self-reported): 0.820
- NER F Score (self-reported): 0.815