General background to the topic
Many studies suggest that the geographical distribution of sorghum is related to the linguistic identity of farmers (Harlan and Stemler, 1976; Leclerc and Coppens d’Eeckenbrugge, 2012; Westengen et al., 2014; Gilabert et al., 2025). As Harlan and Stemler (1976) suggested, the correspondence between the distribution of the basic morphological groups of sorghum and the major linguistic groups in Africa may not be fortuitous. As early as 1967, de Wet and Huckabay underlined that guinea sorghum is associated with West African farmers, that durra sorghum is confined to Muslim and Arabic farmers along the fringes of the Sahara and Ethiopia (Kimber, 2000), and that kafir, which is widely grown in Southern Africa, is reputed to be associated with the Bantu-speaking peoples of Eastern and Southern Africa.
Research problem
One major interdisciplinary hypothesis to explain the correspondence between the social differentiation of farmers and the genetic differentiation of crops is that linguistic barriers have historically favoured seed circulation within rather than between linguistic communities (see Leclerc and Coppens d’Eeckenbrugge, 2012 for an overview). Such a social factor would have acted as a biological driver, orienting seed-mediated gene flow more within than between linguistic communities. Linguistic barriers that operated in the past and have persisted, at least partially, up to the present could then explain why, after thousands of years, the geographical distribution of sorghum still remains associated with linguistic groups (Gilabert et al., 2025). Although the interaction between sorghum and societies is well documented, the structure of sorghum genetic and morphological diversity is also related to environmental and geographical factors (Deu et al., 2006; Morris et al., 2013). How social, environmental, and geographical factors have each contributed to shaping sorghum diversity over time remains largely unexplored. To address this question, this Master 1 internship aims to contribute to the analysis of the passport data associated with sorghum accessions worldwide. More specifically, it aims to document the linguistic identity of farmers as well as the environmental context in which sorghum accessions were collected.
Course of the internship (data acquisition, preliminary analysis, expected results)
The sorghum accessions to be analysed during the internship will be selected from available whole-genome sequencing datasets (either from published studies or from the sequencing of our collections), focussing on geo-referenced landraces from Africa and India. First, the student will document the linguistic identity of farmers by using the geographical coordinates of accessions to compare the data available in international linguistic databases (https://www.ethnologue.com) with those available in passport data. This will require paying specific attention to the linguistic nomenclature itself, distinguishing language and dialect levels, and considering potential synonyms between the nomenclature used in databases and the one used in passport data. Second, for each data point, the student will document the environmental context using the geographical coordinates of accessions and WorldClim data (https://www.worldclim.org). Third, the student will implement clustering analyses to define the environmental clusters to which accessions belong, and association tests to assess whether linguistic groups are associated with environmental clusters.
The student will join the DDSE team (Dynamics of diversity, societies and environments) and will be based at ARCAD (AGAP Institut, Montpellier). The student will be mainly supervised by Christian Leclerc (CIRAD, UMR AGAP Institut, Montpellier). As part of the ANR project Afradapt (Genomic vulnerability of African crops to future climate), the work will be done in close collaboration with the DYNADIV team (UMR DIADE).
Methods, techniques and tools to use
Data manipulation and analysis will involve regular use of R and GitLab. The student will produce FAIR data. This Master 1 internship is a good opportunity to acquire skills in:
R coding and GitLab script management
Understanding worldwide linguistic classification and naming systems
Using worldwide bioclimatic databases to characterize environmental contexts
Implementing association studies, using clustering approaches and chi-squared statistics
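
As an illustration of the last point, the association between linguistic groups and environmental clusters could be tested roughly as sketched below. This is a minimal sketch under stated assumptions, not the project's actual pipeline: it is written in Python with the standard library only (the internship itself plans to use R, where the built-in `chisq.test` covers the same test), the group and cluster labels are hypothetical toy data, and the p-value is obtained by permutation rather than from the chi-squared distribution. The function names `chi_squared_statistic` and `permutation_p_value` are introduced here for illustration only.

```python
import random
from collections import Counter


def chi_squared_statistic(groups, clusters):
    """Return Pearson's chi-squared statistic and its degrees of freedom
    for two equally long lists of categorical labels."""
    n = len(groups)
    observed = Counter(zip(groups, clusters))  # contingency table cells
    row_totals = Counter(groups)               # accessions per linguistic group
    col_totals = Counter(clusters)             # accessions per environmental cluster
    statistic = 0.0
    for g, row_total in row_totals.items():
        for c, col_total in col_totals.items():
            expected = row_total * col_total / n  # count expected under independence
            statistic += (observed[(g, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


def permutation_p_value(groups, clusters, n_permutations=2000, seed=0):
    """Estimate a p-value by shuffling cluster labels across accessions."""
    rng = random.Random(seed)
    observed_stat, _ = chi_squared_statistic(groups, clusters)
    shuffled = list(clusters)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        stat, _ = chi_squared_statistic(groups, shuffled)
        # Small tolerance guards against floating-point rounding.
        if stat >= observed_stat - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical toy data: 40 accessions, two linguistic groups, two clusters.
groups = ["group_A"] * 20 + ["group_B"] * 20
clusters = ["dry"] * 15 + ["humid"] * 5 + ["dry"] * 5 + ["humid"] * 15

stat, dof = chi_squared_statistic(groups, clusters)
p_value = permutation_p_value(groups, clusters)
print(f"chi-squared = {stat:.2f}, dof = {dof}, permutation p = {p_value:.4f}")
```

In practice the cluster labels would come from the clustering step applied to the bioclimatic variables, and the group labels from the harmonised linguistic nomenclature. A permutation p-value is used here because it needs no external library and remains valid when some expected counts are small.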
Keywords (max 5)
Crop diversity, social factors, linguistic identity, sorghum
Duration: 3 months
Supervisor(s): Dr. Christian Leclerc
Laboratory: UMR AGAP Institut – Equipe DDSE
Email: christian.leclerc@cirad.fr
Website: https://umr-agap.cirad.fr/en/research/scientific-teams/dynamics-of-diversity-societies-and-environments-ddse/context-and-issues
References (publications by the host team illustrating the general context of the subject)
Gilabert, A., Deu, M., Champion, L., Cubry, P., Donkpegan, A., Rami, J.-F., Pot, D., Vigouroux, Y., & Leclerc, C. (2025). Integrating ethnolinguistic and archaeobotanical data to uncover the origin and dispersal of cultivated sorghum in Africa: A genomic perspective. Peer Community Journal, 5, e96.
Kaczmarek, T., Cubry, P., Champion, L., Causse, S., Couderc, M., Orjuela, J., Uyoh, E. A., Oselebe, H. O., Dachi, S. N., Adje, C. O. A., Sekloka, E., Achigan-Dako, E. G., Ibrahim Bio Yerima, A. R., Saidou, S. I., Bakasso, Y., Diop, B. M., Gueye, M. C., Agyare, R. Y., Adjebeng-Danquah, J., … Leclerc, C. (2025). Independent domestication and cultivation histories of two West African indigenous fonio millet crops. Nature Communications, 16(1), 4067. https://doi.org/10.1038/s41467-025-59454-2
Leclerc, C., & Coppens d’Eeckenbrugge, G. (2012). Social Organization of Crop Genetic Diversity. The G × E × S Interaction Model. Diversity, 4(1), 1–32. https://doi.org/10.3390/d4010001
Porcuna-Ferrer, A., Guillerminet, T., Renard, D., Labeyrie, V., Leclerc, C., & Reyes-García, V. (2025). Crop biocultural traits and diversity dynamics among Bassari farmers. Agriculture and Human Values. https://doi.org/10.1007/s10460-025-10725-0
Sarr, A., Bodian, A., Gueye, M. C., Gueye, B., Kanfany, G., Diatta, C., Bougma, L. A., Diop, E. A. M. C., Cissé, N., Diouf, D., & Leclerc, C. (2022). Ethnobotanical study of cowpea (Vigna unguiculata (L.) Walp.) in Senegal. Journal of Ethnobiology and Ethnomedicine, 18(1), 6. https://doi.org/10.1186/s13002-022-00506-y
Bibliography (other publications relevant to the topic)
de Wet, J. M. J., & Huckabay, J. P. (1967). The Origin of Sorghum bicolor. II. Distribution and Domestication. Evolution, 21(4), 787–802. https://doi.org/10.2307/2406774
Deu, M., Rattunde, F., & Chantereau, J. (2006). A global view of genetic diversity in cultivated sorghums using a core collection. Genome, 49(2). https://doi.org/10.1139/g05-092
Harlan, J. R., & Stemler, A. (1976). The races of sorghum in Africa. In J. R. Harlan, J. M. J. De Wet, & A. Stemler (Eds.), Origins of African plant domestication (pp. 465–478). Mouton.
Kimber, C. T. (2000). Origins of domesticated sorghum and its early diffusion to India and China. In Sorghum: Origin, History, Technology, and Production (pp. 3–98). John Wiley & Sons.
Morris, G. P., Ramu, P., Deshpande, S. P., Hash, C. T., Shah, T., Upadhyaya, H. D., Riera-Lizarazu, O., Brown, P. J., Acharya, C. B., Mitchell, S. E., Harriman, J., Glaubitz, J. C., Buckler, E. S., & Kresovich, S. (2013). Population genomic and genome-wide association studies of agroclimatic traits in sorghum. Proceedings of the National Academy of Sciences, 110(2). https://doi.org/10.1073/pnas.1215985110
Westengen, O. T., Okongo, M. A., Onek, L., Berg, T., Upadhyaya, H., Birkeland, S., Khalsa, S. D. K., Ring, K. H., Stenseth, N. C., & Brysting, A. K. (2014). Ethnolinguistic structuring of sorghum genetic diversity in Africa and the role of local seed systems. Proceedings of the National Academy of Sciences, 111(39).