Hilpert, Martin
Search results
Measuring the semantic headedness of English blends with token-based semantic vector space modeling: a corpus-based study
2024-12, Qingnan Meng, Hilpert, Martin
This article analyzes the semantic headedness of English blends with distributional semantics methods. The semantic head of a blend is the source word that transfers its semantic information to the blend as a whole. For example, a sitcom is a kind of comedy. But is FedEx a kind of express, and is wi-fi a kind of fidelity? We use corpus data and token-based semantic vector space modeling in order to address these questions. Specifically, we investigate whether Plag’s ternary division of endocentric, exocentric, and coordinative compounds based on semantic headedness can also be applied to English blends, and whether the general tendency of semantic right-headedness can be observed for all three subtypes. We analyze a dataset of fifty-five blends and their respective source words, using data from the Corpus of Contemporary American English and the English Web Corpus 2021. We measure the degree of semantic similarity between each blend and its two source words. The results show that for most endocentric blends, the hypothesis of semantic right-headedness holds true. At the same time, exocentric blends and coordinative blends are shown to behave differently. We conclude that Plag’s classification offers a useful point of departure for the semantic analysis of blends and that distributional semantics methods can provide new insights into their semantic behavior.
Disentangling modal meanings with distributional semantics
2021-3-25, Hilpert, Martin
This paper investigates the collocational behavior of English modal auxiliaries such as may and might with the aim of finding corpus-based measures that distinguish between different modal expressions and that allow insights into why speakers may choose one over another in a given context. The analysis uses token-based semantic vector space modeling (Heylen et al. 2015, Hilpert and Correia Saavedra 2017) in order to determine whether different modal auxiliaries can be distinguished in terms of their collocational profiles. The analysis further examines whether different senses of the same auxiliary exhibit divergent collocational preferences. The results indicate that near-synonymous pairs of modal expressions, such as may and might or must and have to, differ in their distributional characteristics. Also, different senses of the same modal expression, such as deontic and epistemic uses of may, can be distinguished on the basis of distributional information. We discuss these results against the background of previous empirical findings (Hilpert 2016, Flach in press) and theoretical issues such as degrees of grammaticalization (Correia Saavedra 2019) and the avoidance of synonymy (Bolinger 1968).
Disentangling modal meanings with distributional semantics
2020, Hilpert, Martin, Susanne Flach
This article investigates the collocational behavior of English modal auxiliaries such as may and might with the aim of finding corpus-based measures that distinguish between different modal expressions and that allow insights into why speakers may choose one over another in a given context. The analysis uses token-based semantic vector space modeling (Heylen et al., 2015, Monitoring polysemy. Word space models as a tool for large-scale lexical semantic analysis. Lingua, 157: 153–72; Hilpert and Correia Saavedra, 2017, Using token-based semantic vector spaces for corpus-linguistic analyses: From practical applications to tests of theoretical claims. Corpus Linguistics and Linguistic Theory) in order to determine whether different modal auxiliaries can be distinguished in terms of their collocational profiles. The analysis further examines whether different senses of the same auxiliary exhibit divergent collocational preferences. The results indicate that near-synonymous pairs of modal expressions, such as may and might or must and have to, differ in their distributional characteristics. Also, different senses of the same modal expression, such as deontic and epistemic uses of may, can be distinguished on the basis of distributional information. We discuss these results against the background of previous empirical findings (Hilpert, 2016, Construction Grammar and its Application to English, 2nd edn. Edinburgh: Edinburgh University Press; Flach, in press, Beyond modal idioms and modal harmony: a corpus-based analysis of gradient idiomaticity in modal-adverb collocations. English Language and Linguistics) and theoretical issues such as degrees of grammaticalization (Correia Saavedra, 2019, Measurements of Grammaticalization: Developing a Quantitative Index for the Study of Grammatical Change. PhD Dissertation, Université de Neuchâtel) and the avoidance of synonymy (Bolinger, 1968, Entailment and the meaning of structures. Glossa, 2(2): 119–27).
Corpus linguistics meets historical linguistics and construction grammar: how far have we come, and where do we go from here?
2024-03-23, Hilpert, Martin
This paper aims to give an overview of corpus-based research that investigates processes of language change from the theoretical perspective of Construction Grammar. Starting in the early 2000s, a dynamic community of researchers has come together in order to contribute to this effort. Among the different lines of work that have characterized this enterprise, this paper discusses the respective roles of qualitative approaches, diachronic collostructional analysis, multivariate techniques, distributional semantic models, and analyses of network structure. The paper tries to contextualize these approaches and to offer pointers for future research.
The road ahead for Construction Grammar
2024, Hilpert, Martin
What does the future hold for Construction Grammar? What are the most promising future avenues for research on constructions? This paper addresses the development of Construction Grammar as a theory of language through the perspective of six recent PhD dissertations that explore constructional meaning, the architecture of the constructional network, and the role of language change in a constructional theory of language. The goal of this paper is to establish connections between these ideas, and to spell out how different questions concerning Frame Semantics, distributional semantic methods, priming, nodes and connections, individual differences, and constructional change all contribute to a picture that is bigger than the sum of its parts.
Meaning differences between English clippings and their source words: A corpus-based study
2023, Martin Hilpert, David Correia Saavedra, Jennifer Rains
This paper uses corpus data and methods of distributional semantics in order to study English clippings such as dorm (< dormitory), memo (< memorandum), or quake (< earthquake). We investigate whether systematic meaning differences between clippings and their source words can be detected. The analysis is based on a sample of 50 English clippings. Each of the clippings is represented by a concordance of 100 examples in context that were gathered from the Corpus of Contemporary American English. We compare clippings and their source words both at the aggregate level and in terms of comparisons between individual clippings and their source words. The data show that clippings tend to be used in contexts that represent involved text production, which aligns with the idea that clipped words signal familiarity with their referents. It is further observed that individual clippings and their source words partly diverge in their distributional profiles, reflecting both overlap and differences with regard to their meanings. We interpret these findings against the theoretical background of Construction Grammar and specifically the Principle of No Synonymy.