Revision as of 11:41, 23 June 2021

Emerging Thoughts on Benchmarking

The following would be natural places to start. For most of these, we have baseline numbers to compare against, but no existing set-up where we could simply plug in a Norwegian BERT and run it, so we may need to identify suitable code from existing BERT-based architectures (e.g. for English) to re-use. For the first task, though (document-level sentiment analysis on NoReC), Jeremy should have an existing set-up using mBERT that we could perhaps use.

NLP tasks

NoReC*

NoReC_fine: structured sentiment analysis
NoReC_sentences: sentence-level 2/3-way polarity
NoReC_neg: negation cues and scopes

Linguistic pipeline (dependency parsing or PoS tagging)

Lexical

Text classification

NoReC: document-level ratings
Talk of Norway
NorDial
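
To make the idea of a set-up where a model can simply be plugged in and run more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names Task, run_benchmark and accuracy are hypothetical, the examples are toy placeholders rather than real corpus data, and accuracy stands in for whatever metric each task's baseline actually reports. A model is treated as any callable that maps a list of texts to a list of predicted labels.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# An example is a (text, gold_label) pair.
Example = Tuple[str, str]
# Hypothetical model interface: a list of texts in, a list of labels out.
Model = Callable[[Sequence[str]], List[str]]


@dataclass
class Task:
    name: str                # task name as listed on this page
    group: str               # section the task is listed under
    examples: List[Example]  # toy placeholder data, NOT real corpus content


def accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def run_benchmark(model: Model, tasks: Sequence[Task]) -> Dict[str, float]:
    """Run one model over every task and collect a score per task name."""
    scores: Dict[str, float] = {}
    for task in tasks:
        texts = [text for text, _ in task.examples]
        gold = [label for _, label in task.examples]
        pred = model(texts)
        if len(pred) != len(gold):
            raise ValueError(f"{task.name}: expected {len(gold)} predictions")
        scores[task.name] = accuracy(gold, pred)
    return scores


# Toy stand-ins for two of the tasks above; texts and labels are invented.
TOY_TASKS = [
    Task("NoReC (document-level)", "Text classification",
         [("toy review a", "positive"), ("toy review b", "negative")]),
    Task("NoReC_sentences", "NoReC*",
         [("toy sentence a", "positive"), ("toy sentence b", "positive")]),
]


def always_positive(texts: Sequence[str]) -> List[str]:
    """Trivial baseline model: predicts 'positive' for every input."""
    return ["positive"] * len(texts)


if __name__ == "__main__":
    for name, score in run_benchmark(always_positive, TOY_TASKS).items():
        print(f"{name}: {score:.2f}")
```

With an interface like this, swapping mBERT for a Norwegian BERT would only mean passing a different callable to run_benchmark. Structured tasks such as NoReC_fine or dependency parsing would need richer label types and metrics than this sketch covers.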

@@ Line 15: / Line 15: @@
 *[https://github.com/UniversalDependencies/UD_Norwegian-NynorskLIA Spoken dialects]
-== Lexical semantic ==
+== Lexical  ==
 *[https://www.nb.no/sprakbanken/en/resource-catalogue/oai-nb-no-sbr-27/ Word sense disambiguation in context]
+*[https://github.com/ltgoslo/norwegian-synonyms Norwegian synonyms]
+*[https://github.com/ltgoslo/norwegian-analogies Norwegian analogies]
+*[https://github.com/ltgoslo/norsentlex NorSentLex]: Sentiment lexicon
 == Text classification ==

Difference between revisions of "Eosc/norbert/benchmark"
