Difference between revisions of "Eosc/norbert/benchmark"

From Nordic Language Processing Laboratory
(7 intermediate revisions by 2 users not shown)
Line 4: Line 4:

== NLP tasks ==
- === NoReC* ===
- *[https://github.com/ltgoslo/norec_fine NoReC_fine]: structured sentiment analysis
- *[https://github.com/ltgoslo/norec_sentence/ NoReC_sentences] sentence-level 2/3-way polarity
- *[https://github.com/ltgoslo/norec_neg/ NoReC_neg]: negation cues and scopes (evaluation is still being developed)
- === Linguistic pipeline (dependency parsing or PoS tagging) ===
- *[https://github.com/UniversalDependencies/UD_Norwegian-Bokmaal Bokmaal]
- *[https://github.com/UniversalDependencies/UD_Norwegian-Nynorsk Nynorsk]
- *[https://github.com/UniversalDependencies/UD_Norwegian-NynorskLIA Spoken dialects]
+ * Structured sentiment analysis: [https://github.com/ltgoslo/norec_fine NoReC_fine]
+ * Sentence-level 2/3-way polarity: [https://github.com/ltgoslo/norec_sentence/ NoReC_sentences]
+ * Negation cues and scopes (evaluation is still being developed): [https://github.com/ltgoslo/norec_neg/ NoReC_neg]
+ * PoS tagging: [https://github.com/UniversalDependencies/UD_Norwegian-NynorskLIA ILA] + NDT [https://github.com/UniversalDependencies/UD_Norwegian-Bokmaal Bokmaal] / [https://github.com/UniversalDependencies/UD_Norwegian-Nynorsk Nynorsk]
+ * Dependency parsing: [https://github.com/UniversalDependencies/UD_Norwegian-NynorskLIA ILA] + NDT [https://github.com/UniversalDependencies/UD_Norwegian-Bokmaal Bokmaal] / [https://github.com/UniversalDependencies/UD_Norwegian-Nynorsk Nynorsk]
+ * NER: [https://github.com/ltgoslo/norne NorNE] (Bokmål+Nynorsk)
+ * Co-reference resolution (annotation ongoing)

== Lexical ==

Line 19: Line 16:

*[https://github.com/ltgoslo/norwegian-synonyms Norwegian synonyms] (for static models)
*[https://github.com/ltgoslo/norwegian-analogies Norwegian analogies] (for static models)
- *[https://github.com/ltgoslo/norsentlex NorSentLex]: Sentiment lexicon
+ *[https://github.com/ltgoslo/norsentlex NorSentLex]: Sentiment lexicon (for static models)

== Text classification ==

Latest revision as of 11:56, 23 June 2021

Emerging Thoughts on Benchmarking

The following would be natural places to start. For most of these we have baseline numbers to compare against, but no existing set-up into which we could simply plug a Norwegian BERT and run it, so we may need to identify suitable code from existing BERT-based architectures (e.g. for English) to re-use. For the first task, though (document-level SA on NoReC), Jeremy would have an existing set-up for using mBERT that we could perhaps use.
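As a rough illustration of what a "plug in a model and run" set-up could look like, here is a minimal sketch of a model-agnostic evaluation loop for a labelling task such as sentence-level polarity. Everything in it is an assumption made for illustration: the `Predictor` type, the `evaluate` and `macro_f1` helpers, the toy examples and the choice of macro-F1 as the metric are hypothetical and are not taken from any existing NoReC or NorBERT code.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: any model wrapper that maps a batch of texts
# to one predicted label per text (mBERT, a Norwegian BERT, a baseline...).
Predictor = Callable[[Sequence[str]], List[str]]


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-label F1 over the labels that occur in gold."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    scores = []
    for label in sorted(set(gold)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def evaluate(predict: Predictor, examples: Sequence[Tuple[str, str]]) -> float:
    """Score one model on one task, given (text, gold_label) pairs."""
    texts = [text for text, _ in examples]
    gold = [label for _, label in examples]
    pred = predict(texts)
    if len(pred) != len(gold):
        raise ValueError("predictor must return one label per input text")
    return macro_f1(gold, pred)


# Toy stand-in for a real classifier; it labels everything as positive.
def always_positive(texts: Sequence[str]) -> List[str]:
    return ["positive"] * len(texts)


# Made-up examples, not drawn from any of the datasets listed on this page.
toy_dev = [("bra film", "positive"), ("kjedelig bok", "negative")]
score = evaluate(always_positive, toy_dev)
```

With an interface like this, swapping mBERT for a Norwegian BERT would only mean passing a different `predict` callable. The data loading and the metric stay the same, which keeps the numbers comparable across models.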

NLP tasks

Lexical

Text classification

Other