We have published the following linguistic datasets so far. Please refer to the associated publications for details.

| Name | Empirical Domain | Sample Languages | Data Source | Publication |
| --- | --- | --- | --- | --- |
| MultiCoS | connectives | 24 languages | Elicitation | LREC 2026 |
| MECORE-EN | clause-embedding predicates | English | Web-crawled corpora | SCiL 2025 |
| MODALS | modal auxiliaries | 24 languages | Elicitation | Linguistic Variation 2024 |
| MECORE-XLing | clause-embedding predicates | 14 languages | Elicitation | SigTyp 2023 |