SZTAKI HLT | Márton Makrai

Márton Makrai

Márton Makrai is a computational linguist. He received his PhD from the NYTK--ELTE Theoretical Linguistics Programme in 2024 and his MSc in mathematics from the Budapest University of Technology and Economics in 2010. In recent years he has worked primarily on semantics and the machine learning of word ambiguity. He is now a junior research fellow at the HUN-REN TTK Institute of Cognitive Neuroscience and Psychology, where he works on fine-tuning deep speech and language models.

The Hungarian version of this page may be clearer for readers who speak Hungarian but are unfamiliar with natural language processing.

Topics:

master's thesis on a mathematical model of language acquisition (identification in the limit)
the 4lang semantic network: gold meaning representations of the defining vocabulary, deep cases, spreading activation, and identifying the defining vocabulary with IR and lexicography
semantic granularity of multi-sense word embeddings
hypernyms with sparse word representations, winning some categories of SemEval-2018 Task 9
word translation: combining the method of triangulation with the linear mapping of vectors -- German--Hungarian word pairs with confidence scores
Hungarian equivalents of the analogy question set for evaluating embeddings
multilingual sentence clustering for the CoALa project
young researcher (2015--2018, MTA NYI)

Personal page

CV

Blog

Contact:

makrai.hlt@gmail.com

official home page

Full list of publications

Recent publications

2024

M. Makrai: Symbolic and Distributed Word Representations -- Chapters on lexical relations and cross-lingual methods. In public defense of the PhD thesis.

2022

M. Makrai: Symbolic and distributed word representations. In pre-defence of the PhD thesis.
M. Makrai: Three-order normalized PMI and other lessons in tensor analysis of verbal selectional preferences. In XVIII. Magyar Számítógépes Nyelvészeti Konferencia.
M. Makrai, B. Ehmann, L. Balázs: Topic discovery in the diaries of Antarctica winteroverers with multilingual deep sentence encoders. In 7th International Conference on Research, Technology and Education of Space (H-SPACE 2022) “New trends in the space sector”.
M. Makrai, Á. Tündik, B. Indig, G. Szaszák: Towards abstractive summarization in Hungarian. In XVIII. Magyar Számítógépes Nyelvészeti Konferencia.

2021

M. Makrai: Az EFNILEX és egy fiatal kutató -- Hat év magyar szóbeágyazásokkal [EFNILEX and a young researcher -- Six years with Hungarian word embeddings]. In A korpusznyelvészettől a neurális hálókig -- Köszöntő kötet Váradi Tamás 70. születésnapjára.
Á. Feldmann, R. Hajdu, B. Indig, B. Sass, M. Makrai, I. Mittelholcz, D. Halász, Z. Yang, T. Váradi: HILBERT, magyar nyelvű BERT-large modell tanítása felhő környezetben [HILBERT, training a Hungarian BERT-large model in a cloud environment]. In XVII. Magyar Számítógépes Nyelvészeti Konferencia.
M. Makrai, G. Szaszák: Magyar hírek kivonatolása előtanított mély nyelvmodellel – tervek [Summarizing Hungarian news with a pretrained deep language model -- plans]. In Digitális örökség és mesterséges intelligencia konferencia.

2020

M. Makrai: Tárgyas szerkezetek elemzése tenzorfelbontással – áttekintő cikk [Tensor decomposition for transitive verb structure analysis -- a review]. In XVI. Magyar Számítógépes Nyelvészeti Konferencia.

2019

B. Döbrössy, M. Makrai, B. Tarján, G. Szaszák: Investigating sub-word embedding strategies for the morphologically rich and free phrase-order Hungarian. In Proc. RepL4NLP.
B. Indig, B. Sass, E. Simon, I. Mittelholcz, N. Vadász, M. Makrai: One format to rule them all – The emtsv pipeline for Hungarian. In Proc. of the 13th Linguistic Annotation Workshop.

2018

G. Berend, M. Makrai, P. Földiák: 300-sparsans at SemEval-2018 Task 9: Hypernymy as interaction of sparse attributes. In SemEval.
M. Makrai, B. Sass: A szöveg mint skálafüggetlen hálózat [Text as a scale-free network]. In XIV. Magyar Számítógépes Nyelvészeti Konferencia.
J. Ács, G. Borbély, M. Makrai, D. Nemeskey, G. Recski, A. Kornai: Hibrid nyelvtechnológiák [Hybrid language technologies]. In Magyar Tudomány 2018/6.

2017

M. Makrai, V. Lipp: Do multi-sense word embeddings learn more senses? In K + K = 120.

2016

G. Borbély, A. Kornai, M. Makrai, D. Nemeskey: Evaluating multi-sense embeddings for semantic resolution monolingually and in word translation. In RepEval.
M. Makrai: Filtering Wiktionary triangles by linear mapping between distributed word models. In Proceedings of 10th Edition of the Language Resources and Evaluation Conference.

2015

M. Makrai: Comparison of distributed language models on medium-resourced languages. In XI. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2015).
A. Kornai, J. Ács, M. Makrai, D. Nemeskey, K. Pajkossy, G. Recski: Competence in lexical semantics. In Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics.
M. Makrai: Disambiguated linear word translation in medium European languages. In IEEE 6th International Conference on Cognitive Infocommunications – CogInfoCom 2015.

2014

M. Makrai: Causality in vectors space language models. In Spring Wind.
M. Makrai: Mélyesetek a 4lang fogalmi szótárban [Deep cases in the 4lang concept lexicon]. In X. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2014).
M. Makrai: Vector space language models for psycholinguistic analysis. In Corpus resources for quantitative and psycholinguistic analysis.

2013

A. Kornai, M. Makrai: A 4lang fogalmi szótár [The 4lang concept dictionary]. In IX. Magyar Számítógépes Nyelvészeti Konferencia.
M. Makrai, D. Nemeskey, A. Kornai: Applicative structure in vector space models. In Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality.
M. Makrai: Fogalmak fontossága a definíciós gráf vizsgálatával [Importance of concepts based on the analysis of the definition graph]. In VII. Alkalmazott Nyelvészeti Doktoranduszkonferencia.
D. Nemeskey, G. Recski, M. Makrai, A. Zséder, A. Kornai: Spreading activation in language understanding. In Proc. CSIT 2013.

2010

M. Makrai: Subregular Categorial Grammars. In MSc thesis.

2007

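M. Makrai: Többértelműségek magyar mondatok számítógépes elemzésében - a „meg” szó szófajának vizsgálata gyakoriságokkal [Ambiguities in the computational analysis of Hungarian sentences - examining the part of speech of the word „meg” using frequencies].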

Project leader

Hungarian word embeddings
Word ambiguity
efnilex-vect

Participant

4lang
Semantics-based language technology
digital language vitality

Author

4lang concept dictionary

Teaching

Számítógépes nyelvészet [Computational linguistics] (2021/2022 spring)
Computational Lexical Semantics 2. (2018/2019 fall)
Vector space models of word meaning (2017/2018 spring)
Computational Lexical Semantics -- Symbolic Representations (2017/2018 fall)
Meaning representation (2015/2016 fall)
Digital language description (2014/2015 spring)
Efficient methods in language description (2014/2015 fall)
