Scholar's Hub

Stay Ahead of the Curve: Trending Papers

With the volume of new research published every day, we understand how hard it is to stay up-to-date with the latest discoveries. That’s why we curated this collection of trending papers for you.

This list is curated based on the citations each paper receives per month and the page traffic growth on Semantic Scholar, along with other criteria. It is currently available only for Biology, Computer Science, Medicine, Physics, and Psychology papers.

If you have any feedback or suggestions, please contact us.

Last updated: December 21st, 2023

Biology

De novo design of protein structure and function with RFdiffusion

  • Watson et al. 2023

  • Nature

  • 2023

There has been considerable recent progress in designing new proteins using deep-learning methods. Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence–structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications. Fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks yields a generative model for protein design that achieves outstanding performance on a wide range of protein structure and function design challenges.

TLDR

Fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks yields a generative model for protein design that achieves outstanding performance on a wide range of protein structure and function design challenges.

Metabolomics combined with physiology and transcriptomics reveal key metabolic pathway responses in apple plants exposure to different selenium concentrations.

  • Liu et al. 2023

  • Journal of hazardous materials

  • 2023

Selenium (Se) can be absorbed by plants, thereby affecting plant physiological activity, interfering with gene expression, altering metabolite content and influencing plant growth. However, the molecular mechanism underlying the plant response to Se remains unclear. In this study, apple plants were exposed to Se at concentrations of 0, 3, 6, 9, 12, 24, and 48 μM. Low concentrations of Se promoted plant growth, while high Se concentrations (≥24 μM) reduced photosynthesis, disturbed carbon and nitrogen metabolism, damaged the antioxidant system, and ultimately inhibited plant growth. The transcriptome and metabolome revealed that Se mainly affected three pathways, namely the 'biosynthesis of amino acids', 'starch and sucrose metabolism', and 'phenylpropanoid biosynthesis' pathways. 9 μM Se improved the synthesis, catabolism and utilization of amino acids and sugars, ultimately promoting plant growth. However, 24 μM Se up-regulated the expression of the related genes PK, GPT, P5CS, SUS, SPS and CYP98A and led to the accumulation of a large number of osmoregulation substances, such as citric acid, L-proline, D-sucrose and chlorogenic acid, in the roots, ultimately affecting the balance between plant growth and defense. In conclusion, this study reveals new insights into the key metabolic pathways in the response of apple plants to Se.

TLDR

New insights are revealed into the key metabolic pathways in the response of apple plants to Se, which mainly affected three pathways: 'biosynthesis of amino acids', 'starch and sucrose metabolism', and 'phenylpropanoid biosynthesis'.

Pregnancy-responsive pools of adult neural stem cells for transient neurogenesis in mothers

  • Chaker et al. 2023

  • Science

  • 2023

Adult neural stem cells (NSCs) contribute to lifelong brain plasticity. In the adult mouse ventricular-subventricular zone, NSCs are heterogeneous and, depending on their location in the niche, give rise to different subtypes of olfactory bulb (OB) interneurons. Here, we show that multiple regionally distinct NSCs, including domains that are usually quiescent, are recruited on different gestation days during pregnancy. Synchronized activation of these adult NSC pools generates transient waves of short-lived OB interneurons, especially in layers with less neurogenesis under homeostasis. Using spatial transcriptomics, we identified molecular markers of pregnancy-associated interneurons and showed that some subsets are temporarily needed for own pup recognition. Thus, pregnancy triggers transient yet behaviorally relevant neurogenesis, highlighting the physiological relevance of adult stem cell heterogeneity.

Editor’s summary: Different subtypes of new neurons are born in the adult brain ventricular-subventricular zone from spatially distinct pools of neural stem cells (NSCs). However, the physiological relevance of NSC diversity and specificity is unclear. Chaker et al. have revealed that during mouse pregnancy, multiple NSC pools are activated in mothers and generate specific olfactory bulb interneurons that function around birth to modulate aspects of maternal care, including own-pup recognition, and then disappear as pups mature (see the Perspective by Kempermann). These results highlight how adult NSC heterogeneity might provide a substrate for adaptive brain plasticity in response to different physiological states. —Mattia Maroso

Dynamic response of adult neural stem cells during pregnancy prepares the maternal olfactory bulb for motherhood.

TLDR

It is shown that multiple regionally distinct NSCs, including domains that are usually quiescent, are recruited on different gestation days during pregnancy, highlighting the physiological relevance of adult stem cell heterogeneity.

Mitochondrial fission drives neuronal metabolic burden to promote stress susceptibility in male mice.

  • Dong et al. 2023

  • Nature metabolism

  • 2023

Neurons are particularly susceptible to energy fluctuations in response to stress. Mitochondrial fission is highly regulated to generate ATP via oxidative phosphorylation; however, the role of a regulator of mitochondrial fission in neuronal energy metabolism and synaptic efficacy under chronic stress remains elusive. Here, we show that chronic stress promotes mitochondrial fission in the medial prefrontal cortex via activating dynamin-related protein 1 (Drp1), resulting in mitochondrial dysfunction in male mice. Both pharmacological inhibition and genetic reduction of Drp1 ameliorate the deficit of excitatory synaptic transmission and stress-related depressive-like behavior. In addition, enhancing Drp1 fission promotes stress susceptibility, which is alleviated by coenzyme Q10, which potentiates mitochondrial ATP production. Together, our findings unmask the role of Drp1-dependent mitochondrial fission in the deficits of neuronal metabolic burden and depressive-like behavior and provide a medication basis for metabolism-related emotional disorders.

TLDR

It is shown that chronic stress promotes mitochondrial fission in the medial prefrontal cortex via activating dynamin-related protein 1 (Drp1), resulting in mitochondrial dysfunction in male mice, and that pharmacological inhibition and genetic reduction of Drp1 ameliorate the deficit of excitatory synaptic transmission and stress-related depressive-like behavior.

Uncovering the functional diversity of rare CRISPR-Cas systems with deep terascale clustering

  • Altae-Tran et al. 2023

  • Science

  • 2023

Microbial systems underpin many biotechnologies, including CRISPR, but the exponential growth of sequence databases makes it difficult to find previously unidentified systems. In this work, we develop the fast locality-sensitive hashing–based clustering (FLSHclust) algorithm, which performs deep clustering on massive datasets in linearithmic time. We incorporated FLSHclust into a CRISPR discovery pipeline and identified 188 previously unreported CRISPR-linked gene modules, revealing many additional biochemical functions coupled to adaptive immunity. We experimentally characterized three HNH nuclease–containing CRISPR systems, including the first type IV system with a specified interference mechanism, and engineered them for genome editing. We also identified and characterized a candidate type VII system, which we show acts on RNA. This work opens new avenues for harnessing CRISPR and for the broader exploration of the vast functional diversity of microbial proteins.

Editor’s summary: Microbial biochemical systems are incredibly diverse, and computational tools to analyze sequence data are essential in identifying new and valuable components for biotechnology development. Using an approach called deep terascale clustering, Altae-Tran et al. found more than 200 new functional systems linked to CRISPR, a technology editing DNA. Some of the discovered genes are linked to precise DNA-editing systems that may enable safer therapeutic genome editing. The authors also identified a CRISPR-Cas enzyme, Cas14, which cuts RNA precisely. These discoveries may help to further improve DNA- and RNA-editing technologies, with wide-ranging applications in medicine and biotechnology. —Di Jiang

A clustering algorithm, FLSHclust, was developed and applied to discover 188 previously unreported CRISPR-linked gene modules.

INTRODUCTION Systematic mining of sequencing databases is a powerful method for discovering protein families and functional systems. This approach has uncovered diverse CRISPR-Cas systems, which are microbial RNA–guided adaptive immune systems that have served as the basis of several molecular technologies, notably programmable genome editing. However, existing methods for sequence mining lag behind the exponentially growing databases that now contain billions of proteins, which restricts the discovery of rare protein families and associations.

RATIONALE We sought to comprehensively enumerate CRISPR-linked gene modules in all existing publicly available sequencing data. Recently, several previously unknown biochemical activities have been linked to programmable nucleic acid recognition by CRISPR systems, including transposition and protease activity. We reasoned that many more diverse enzymatic activities may be associated with CRISPR systems, many of which could be of low abundance in existing sequence databases.

RESULTS We developed fast locality-sensitive hashing–based clustering (FLSHclust), a parallelized, deep clustering algorithm with linearithmic scaling based on locality-sensitive hashing. FLSHclust approaches MMseqs2, a gold-standard quadratic-scaling algorithm, in clustering performance. We applied FLSHclust in a sensitive CRISPR discovery pipeline and identified 188 previously unreported CRISPR-associated systems, including many rare systems. We experimentally characterized four of the newly discovered systems. We examined a type IV system with an HNH nuclease domain inserted in the CRISPR-associated DNA damage-inducible gene G (DinG)–like helicase. We found that this system exhibited RNA-guided protospacer-adjacent motif (PAM)–dependent directional double-stranded DNA (dsDNA) degradation, which required both the adenosine triphosphate (ATP) hydrolysis and HNH nuclease functions of the DinG-HNH protein. This is the first demonstration of a type IV system with a specified interference mechanism. We characterized two type I systems containing HNH nuclease domains inserted in different subunits of Cascade (Cas8-HNH and Cas5-HNH). We found that both of these systems performed precise dsDNA cleavage and single-stranded DNA (ssDNA) cleavage. We additionally observed collateral cleavage of ssDNA by the Cas5-HNH system. We demonstrated that both systems can be applied for genome editing in human cells and that the Cas8-HNH system is highly specific. We also studied candidate type VII systems, including a minimal Cas7-Cas5 effector complex and a distinctive interference protein including a β-CASP domain. We showed that these systems are likely derived from type III-E CRISPR systems and are RNA targeting. Other CRISPR-linked systems that we found include additional potential effector and adaptation components, two previously unknown associations of Mu transposons with CRISPR systems, and numerous newly identified proteins and domains associated with type V systems. We also identified an instance of potential co-option of a Cas9 as an anti-CRISPR mechanism and noted several non-CRISPR hypervariable regularly interspersed repeat arrays.

CONCLUSION This study introduces FLSHclust as a tool to cluster millions of sequences quickly and efficiently, with broad applications in mining large sequence databases. The CRISPR-linked systems that we discovered represent an untapped trove of diverse biochemical activities linked to RNA-guided mechanisms, with great potential for development as biotechnologies.

Figure caption: Identification and characterization of previously unreported CRISPR-Cas systems. (A) Schematic of FLSHclust algorithm. (B) Applications of protein clustering in CRISPR discovery. CARF, CRISPR-associated Rossmann fold. (C) Locus diagrams of three newly identified CRISPR-Cas systems experimentally characterized in this work. (D) Small RNA sequencing of candidate type VII Cas7-Cas5 ribonucleoprotein (RNP) (top), and targeted RNA cleavage by candidate type VII CRISPR-Cas system (bottom). DR, direct repeat; nt, nucleotide; bp, base pair; TBE, tris-boric acid–EDTA buffer.

TLDR

The fast locality-sensitive hashing–based clustering (FLSHclust) algorithm, which performs deep clustering on massive datasets in linearithmic time, is developed and used to uncover more than 200 new functional systems linked to CRISPR, a DNA-editing technology.
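
The general idea of clustering with locality-sensitive hashing, which the FLSHclust entry above builds on, can be illustrated with a toy example. The sketch below is not the FLSHclust algorithm; it is a minimal MinHash-plus-banding scheme, assuming sequences are compared by shared k-mers. All function names (`kmers`, `salted_hash`, `minhash_signature`, `lsh_cluster`) and the `k`, `bands` and `rows` parameters are hypothetical choices made for illustration.

```python
import hashlib
from collections import defaultdict


def kmers(seq, k=3):
    """Return the set of overlapping k-mers of a sequence (needs len(seq) >= k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def salted_hash(item, seed):
    """Deterministic 64-bit hash of a string, salted with an integer seed."""
    digest = hashlib.blake2b(
        item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_hashes):
    """MinHash signature: for each seed, the smallest salted hash over the set."""
    return tuple(
        min(salted_hash(x, seed) for x in items) for seed in range(num_hashes)
    )


def lsh_cluster(seqs, k=3, bands=8, rows=2):
    """Group sequences that share at least one identical signature band.

    Each sequence is hashed into `bands` buckets instead of being compared
    with every other sequence, which avoids all-against-all comparison.
    Returns one cluster label (a representative index) per input sequence.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    sigs = [minhash_signature(kmers(s, k), bands * rows) for s in seqs]
    for b in range(bands):
        buckets = defaultdict(list)
        for i, sig in enumerate(sigs):
            buckets[sig[b * rows:(b + 1) * rows]].append(i)
        # Everything that landed in the same bucket is merged into one cluster.
        for members in buckets.values():
            for j in members[1:]:
                parent[find(j)] = find(members[0])
    return [find(i) for i in range(len(seqs))]


if __name__ == "__main__":
    # Made-up toy sequences, not data from the paper.
    print(lsh_cluster(["MKTAYIAKQRQISFVK", "MKTAYIAKQRQISFVK", "GGGGGGGGGGGGGGGG"]))
```

Identical sequences always share every band and are merged, while sequences with no k-mers in common are separated. Similar but non-identical sequences are merged with a probability that rises with their k-mer overlap and depends on the `bands` and `rows` settings.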

Metabolic exchanges are ubiquitous in natural microbial communities

  • Kost et al. 2023

  • Nature Microbiology

  • 2023

Microbial communities drive global biogeochemical cycles and shape the health of plants and animals—including humans. Their structure and function are determined by ecological and environmental interactions that govern the assembly, stability and evolution of microbial communities. A widely held view is that antagonistic interactions such as competition predominate in microbial communities and are ecologically more important than synergistic interactions—for example, mutualism or commensalism. Over the past decade, however, a more nuanced picture has emerged, wherein bacteria, archaea and fungi exist within interactive networks in which they exchange essential and non-essential metabolites. These metabolic interactions profoundly impact not only the physiology, ecology and evolution of the strains involved, but are also central to the functioning of many, if not all, microbiomes. Therefore, we advocate for a balanced view of microbiome ecology that encompasses both synergistic and antagonistic interactions as key forces driving the structure and dynamics within microbial communities. This Perspective discusses the prevalence of synergistic metabolic interactions in nature and their impact on microbial communities.

TLDR

This Perspective advocates for a balanced view of microbiome ecology that encompasses both synergistic and antagonistic interactions as key forces driving the structure and dynamics within microbial communities.

Host extracellular vesicles confer cytosolic access to systemic LPS licensing non-canonical inflammasome sensing and pyroptosis.

  • Kumari et al. 2023

  • Nature cell biology

  • 2023

Intracellular surveillance for systemic microbial components during homeostasis and infections governs host physiology and immunity. However, a long-standing question is how circulating microbial ligands become accessible to intracellular receptors. Here we show a role for host-derived extracellular vesicles (EVs) in this process; human and murine plasma-derived and cell culture-derived EVs have an intrinsic capacity to bind bacterial lipopolysaccharide (LPS). Remarkably, circulating host EVs capture blood-borne LPS in vivo, and the LPS-laden EVs confer cytosolic access for LPS, triggering non-canonical inflammasome activation of gasdermin D and pyroptosis. Mechanistically, the interaction between the lipid bilayer of EVs and the lipid A of LPS underlies EV capture of LPS, and the intracellular transfer of LPS by EVs is mediated by CD14. Overall, this study demonstrates that EVs capture and escort systemic LPS to the cytosol licensing inflammasome responses, uncovering EVs as a previously unrecognized link between systemic microbial ligands and intracellular surveillance. Kumari et al. show that host-derived extracellular vesicles capture systemic LPS and transfer it to the cytosol of immune cells via CD14-dependent endocytosis, triggering caspase-11-mediated gasdermin D activation and pyroptosis.

TLDR

This study demonstrates that EVs capture and escort systemic LPS to the cytosol licensing inflammasome responses, uncovering EVs as a previously unrecognized link between systemic microbial ligands and intracellular surveillance.

Integrated multi-omics reveals the roles of cecal microbiota and its derived bacterial consortium in promoting chicken growth

  • Zhang et al. 2023

  • mSystems

  • 2023

ABSTRACT Growing evidence has shown a close connection between gut microbiota and chicken growth performance; however, the crosstalk between microbes and chicken host remains elusive. Here, we integrated multi-omics approaches, fecal microbiota transplantation, and body weight-associated bacterial consortium to investigate the host-microbiota interactions in different body weight chickens. Microbial profiling analysis showed that uncultured Barnesiellaceae, Lactobacillus, Bacillus, Ruminococcaceae UCG-004, and Ruminococcaceae UCG-014 were highly enriched in the high body weight (HBW) chickens. The combination of Lactobacillus and Bacillus had 95.1% accuracy in discriminating HBW from low body weight chickens. Lipids and lipid-like molecules were found to be more abundant in the HBW chickens, and the differentially expressed cecal genes were enriched in the peroxisome proliferator-activated receptor (PPAR) signaling pathway and calcium signaling pathway. Correlations among the weight-associated genera, gut content metabolites, and gut gene expression were established, and fecal microbiota transplantation from HBW microbiota to newly born chicks increased the chicken antioxidant status, gut sugar transport, and immunity. A total of 67 strains belonging to Lactobacillus and Bacillus were isolated from the HBW chickens by the targeted culturomics method. Among six pairwise combinations of four selected strains, the consortium consisting of Limosilactobacillus reuteri CML393 and Bacillus velezensis CML396 significantly improved the chicken growth performance and gut health and influenced the cecal microbiota, metabolites, and gene expression. Further in vitro and in silico analyses indicated that L. reuteri CML393 and B. velezensis CML396 were less competitive but more cooperative than other pairwise combinations tested.

IMPORTANCE The improvement of chicken growth performance is one of the major concerns for the poultry industry. Gut microbes are increasingly evidenced to be associated with chicken physiology and metabolism, thereby influencing chicken growth and development. Here, through integrated multi-omics analyses, we showed that chickens from the same line differing in their body weight were very different in their gut microbiota structure and host-microbiota crosstalk; microbes in high body weight (HBW) chickens contributed to chicken growth by regulating the gut function and homeostasis. We also verified that a specific bacterial consortium consisting of isolates from the HBW chickens has the potential to be used as chicken growth promoters. These findings provide new insights into the potential links between gut microbiota and chicken phenotypes, shedding light on future manipulation of chicken gut microbiota to improve chicken growth performance.

TLDR

Integrated multi-omics analyses showed that chickens from the same line differing in their body weight were very different in their gut microbiota structure and host-microbiota crosstalk; microbes in high body weight (HBW) chickens contributed to chicken growth by regulating the gut function and homeostasis.

Design of a redox-proficient Escherichia coli for screening terpenoids and modifying cytochrome P450s

  • Lin et al. 2023

  • Nature Catalysis

  • 2023

High-value terpenoids are found in plants, animals and microbes, with applications spanning health to agriculture. However, moving their biosynthetic pathways to a new host is challenging when cytochrome P450 (CYP) enzymes are needed for function. Here we engineer Escherichia coli to facilitate discovery by introducing 31 recombinant genes that enhance precursor supply, combine electron transfer pathways and implement regulatory control. We successfully produce terpenoids from different classes and species. By screening 64 bacterial CYPs found in genomes near terpenoid cyclase genes, we identify 40 functional CYPs and combine them with 17 cyclases to create 1,088 pathways. Using a kaurene scaffold, we show that bacterial CYPs can substitute 16 of 44 modifications made by plants. This strain enables high-throughput exploration of terpenoids and their chemical diversification, with a high success rate and reliable titres. Terpenoids are natural products with high value in the chemical industry; however, their expression in different hosts is limited by the availability of cytochrome P450. Here the authors show the engineering of recombinant Escherichia coli that can easily produce terpenoids from different classes and species.

TLDR

Recombinant Escherichia coli is engineered to easily produce terpenoids from different classes and species, enabling high-throughput exploration of terpenoids and their chemical diversification, with a high success rate and reliable titres.

DreamCreature: Crafting Photorealistic Virtual Creatures from Imagination

  • Ng et al. 2023

  • ArXiv

  • 2023

Recent text-to-image (T2I) generative models allow for high-quality synthesis following either text instructions or visual examples. Despite their capabilities, these models face limitations in creating new, detailed creatures within specific categories (e.g., virtual dog or bird species), which are valuable in digital asset creation and biodiversity analysis. To bridge this gap, we introduce a novel task, Virtual Creatures Generation: Given a set of unlabeled images of the target concepts (e.g., 200 bird species), we aim to train a T2I model capable of creating new, hybrid concepts within diverse backgrounds and contexts. We propose a new method called DreamCreature, which identifies and extracts the underlying sub-concepts (e.g., body parts of a specific species) in an unsupervised manner. The T2I thus adapts to generate novel concepts (e.g., new bird species) with faithful structures and photorealistic appearance by seamlessly and flexibly composing learned sub-concepts. To enhance sub-concept fidelity and disentanglement, we extend the textual inversion technique by incorporating an additional projector and tailored attention loss regularization. Extensive experiments on two fine-grained image benchmarks demonstrate the superiority of DreamCreature over prior methods in both qualitative and quantitative evaluation. Ultimately, the learned sub-concepts facilitate diverse creative applications, including innovative consumer product designs and nuanced property modifications.

TLDR

This work proposes a new method called DreamCreature, which identifies and extracts the underlying sub-concepts in an unsupervised manner and generates novel concepts with faithful structures and photorealistic appearance by seamlessly and flexibly composing the learned sub-concepts.

Lactate regulates major zygotic genome activation by H3K18 lactylation in mammals

  • Li et al. 2023

  • National Science Review

  • 2023

Lactate is present at a high level in the microenvironment of mammalian preimplantation embryos in vivo and in vitro. However, its role in preimplantation development is unclear. Here, we report that lactate is highly enriched in the nuclei of early embryos when major zygotic genome activation (ZGA) occurs in humans and mice. The inhibition of its production and uptake results in developmental arrest at the 2-cell stage, major ZGA failure, and loss of lactate-derived H3K18lac, which could be rescued by addition of Lac-CoA and recapitulated by overexpression of H3K18R mutation. By profiling the landscape of H3K18lac during mouse preimplantation development, we show that H3K18lac is enriched on the promoter regions of most major ZGA genes and correlates with their expressions. In humans, H3K18lac is also enriched in ZGA markers and temporally concomitant with their expressions. Taken together, we profile the landscapes of H3K18lac in mouse and human preimplantation embryos, and demonstrate the important role for H3K18lac in major ZGA, showing a conserved metabolic mechanism underlies preimplantation development of mammalian embryos.

TLDR

It is reported that lactate is highly enriched in the nuclei of early embryos when major zygotic genome activation (ZGA) occurs in humans and mice, and the important role of H3K18lac in major ZGA is demonstrated, showing that a conserved metabolic mechanism underlies preimplantation development of mammalian embryos.

Microautophagy regulated by STK38 and GABARAPs is essential to repair lysosomes and prevent aging.

  • Ogura et al. 2023

  • EMBO reports

  • 2023

Lysosomes are degradative organelles and signaling hubs that maintain cell and tissue homeostasis, and lysosomal dysfunction is implicated in aging and reduced longevity. Lysosomes are frequently damaged, but their repair mechanisms remain unclear. Here, we demonstrate that damaged lysosomal membranes are repaired by microautophagy (a process termed “microlysophagy”) and identify key regulators of the first and last steps. We reveal the AGC kinase STK38 as a novel microlysophagy regulator. Through phosphorylation of the scaffold protein DOK1, STK38 is specifically required for the lysosomal recruitment of the AAA+ ATPase VPS4, which terminates microlysophagy by promoting the disassembly of ESCRT components. By contrast, microlysophagy initiation involves non‐canonical lipidation of ATG8s, especially the GABARAP subfamily, which is required for ESCRT assembly through interaction with ALIX. Depletion of STK38 and GABARAPs accelerates DNA damage‐induced cellular senescence in human cells and curtails lifespan in C. elegans, respectively. Thus, microlysophagy is regulated by STK38 and GABARAPs and could be essential for maintaining lysosomal integrity and preventing aging.

TLDR

It is demonstrated that damaged lysosomal membranes are repaired by microautophagy (a process termed “microlysophagy”), and key regulators of the first and last steps are identified, including the AGC kinase STK38 as a novel microlysophagy regulator.

Microbes in porous environments: From active interactions to emergent feedback

  • Jin et al. 2023

  • 2023

Microbes thrive in diverse porous environments -- from soil and riverbeds to human lungs and cancer tissues -- spanning multiple scales and conditions. Short- to long-term fluctuations in local factors induce spatio-temporal heterogeneities, often leading to physiologically stressful settings. How microbes respond and adapt to such biophysical constraints is an active field of research where considerable insight has been gained over the last decade and a half. With a focus on bacteria, here we review recent advances in microbial self-organization and dispersal in inorganic and organic porous settings, highlighting the role of active interactions and feedback which mediate their survival and fitness. We conclude by discussing open questions and opportunities for leveraging integrative cross-disciplinary approaches to advance our understanding of the biophysical strategies that microbes employ -- at both species and community scales -- to make porous settings habitable. Active and responsive behaviour is key to microbial survival in porous environments, with far-reaching ramifications for developing strategies to mitigate anthropogenic impacts, innovate subsurface storage solutions, and predict future ecological scenarios imposed by current climatic changes.

TLDR

Recent advances in microbial self-organization and dispersal in inorganic and organic porous settings are reviewed, highlighting the role of active interactions and feedback which mediate their survival and fitness.

Knockout of the sugar transporter OsSTP15 enhances grain yield by improving tiller number due to increased sugar content in the shoot base of rice (Oryza sativa L.).

  • Li et al. 2023

  • The New phytologist

  • 2023

Sugar transporter proteins (STPs) play critical roles in regulating plant stress tolerance, growth, and development. However, the role of STPs in regulating crop yield is poorly understood. This study elucidates the mechanism by which knockout of the sugar transporter OsSTP15 enhances grain yield via increasing the tiller number in rice. We found that OsSTP15 is specifically expressed in the shoot base and vascular bundle sheath of seedlings and encodes a plasma membrane-localized high-affinity glucose efflux transporter. OsSTP15 knockout enhanced sucrose and trehalose-6-phosphate (Tre6P) synthesis in leaves and improved sucrose transport to the shoot base by inducing the expression of sucrose transporters. Higher glucose, sucrose, and Tre6P contents were observed at the shoot base of stp15 plants. Transcriptome and metabolome analyses of the shoot base demonstrated that OsSTP15 knockout upregulated the expression of cytokinin (CK) synthesis- and signaling pathway-related genes and increased CK levels. These findings suggest that OsSTP15 knockout represses glucose export from the cytoplasm and simultaneously enhances sugar transport from source leaves to the shoot base by promoting the synthesis of sucrose and Tre6P in leaves. Subsequent accumulation of glucose, sucrose, and Tre6P in the shoot base promotes tillering by stimulating the CK signaling pathway.

TLDR

The findings suggest that OsSTP15 knockout represses glucose export from the cytoplasm and simultaneously enhances sugar transport from source leaves to the shoot base by promoting the synthesis of sucrose and Tre6P in leaves.

Soil microbiome indicators can predict crop growth response to large-scale inoculation with arbuscular mycorrhizal fungi

  • Lutz et al. 2023

  • Nature Microbiology

  • 2023

Alternative solutions to mineral fertilizers and pesticides that reduce the environmental impact of agriculture are urgently needed. Arbuscular mycorrhizal fungi (AMF) can enhance plant nutrient uptake and reduce plant stress; yet, large-scale field inoculation trials with AMF are missing, and so far, results remain unpredictable. We conducted on-farm experiments in 54 fields in Switzerland and quantified the effects on maize growth. Growth response to AMF inoculation was highly variable, ranging from −12% to +40%. With few soil parameters and mainly soil microbiome indicators, we could successfully predict 86% of the variation in plant growth response to inoculation. The abundance of pathogenic fungi, rather than nutrient availability, best predicted (33%) AMF inoculation success. Our results indicate that soil microbiome indicators offer a sustainable biotechnological perspective to predict inoculation success at the beginning of the growing season. This predictability increases the profitability of microbiome engineering as a tool for sustainable agricultural management. On-farm experiments in 54 fields in Switzerland show that inoculation with arbuscular mycorrhizal fungi can promote crop yield, and inoculation success can be predicted using soil microbiome indicators.

TLDR

On-farm experiments in 54 fields in Switzerland show that inoculation with arbuscular mycorrhizal fungi can promote crop yield and that inoculation success can be predicted using soil microbiome indicators. This predictability increases the profitability of microbiome engineering as a tool for sustainable agricultural management.

Computer Science

Sparks of Artificial General Intelligence: Early experiments with GPT-4

  • Bubeck et al. 2023

  • ArXiv

  • 2023

Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an early version of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models. We discuss the rising capabilities and implications of these models. We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting. Moreover, in all of these tasks, GPT-4's performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system. In our exploration of GPT-4, we put special emphasis on discovering its limitations, and we discuss the challenges ahead for advancing towards deeper and more comprehensive versions of AGI, including the possible need for pursuing a new paradigm that moves beyond next-word prediction. We conclude with reflections on societal influences of the recent technological leap and future research directions.

TLDR

It is argued that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models, and the rising capabilities and implications of these models are discussed.

METAVERSE

  • A et al. 2023

  • International Journal of Innovative Research in Information Security

  • 2023

The Metaverse is a term that refers to a virtual universe where users can interact with each other and the environment through immersive technologies such as virtual reality (VR) and augmented reality (AR). It is a network of interconnected virtual spaces where users can participate in a range of activities such as gaming, socializing, shopping, and education. VR allows users to enter a completely simulated world, while AR overlays digital content onto the real world. These technologies enable users to experience a sense of presence in the Metaverse and interact with it as they would in the real world. As technology continues to advance, the Metaverse is becoming increasingly realistic and accessible. It has the potential to transform how we work, learn, and play, and create new opportunities for social and economic growth. However, it also raises ethical concerns around privacy, security, and the potential for addiction. Overall, the Metaverse represents an exciting and rapidly evolving field that has the potential to revolutionize how we interact with technology and each other. As such, it is an area of research and development that is likely to attract significant attention and investment in the coming years. As the Metaverse continues to evolve and expand, it will have a profound impact on how people interact with each other and with digital content. It will provide a new level of immersion and engagement, and it will transform the way we work, play, and learn.

TLDR

The Metaverse represents an exciting and rapidly evolving field that has the potential to revolutionize how we interact with technology and each other, and it is an area of research and development that is likely to attract significant attention and investment in the coming years.

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

  • Zheng et al. 2023

  • ArXiv

  • 2023

Evaluating large language model (LLM) based chat assistants is challenging due to their broad capabilities and the inadequacy of existing benchmarks in measuring human preferences. To address this, we explore using strong LLMs as judges to evaluate these models on more open-ended questions. We examine the usage and limitations of LLM-as-a-judge, including position, verbosity, and self-enhancement biases, as well as limited reasoning ability, and propose solutions to mitigate some of them. We then verify the agreement between LLM judges and human preferences by introducing two benchmarks: MT-bench, a multi-turn question set; and Chatbot Arena, a crowdsourced battle platform. Our results reveal that strong LLM judges like GPT-4 can match both controlled and crowdsourced human preferences well, achieving over 80% agreement, the same level of agreement between humans. Hence, LLM-as-a-judge is a scalable and explainable way to approximate human preferences, which are otherwise very expensive to obtain. Additionally, we show our benchmark and traditional benchmarks complement each other by evaluating several variants of LLaMA and Vicuna. The MT-bench questions, 3K expert votes, and 30K conversations with human preferences are publicly available at https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge.

TLDR

The results reveal that strong LLM judges like GPT-4 can match both controlled and crowdsourced human preferences well, achieving over 80% agreement, the same level of agreement between humans, and LLM-as-a-judge is a scalable and explainable way to approximate human preferences, which are otherwise very expensive to obtain.
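The agreement figure quoted above can be illustrated with a small calculation. This is a minimal sketch, assuming agreement means the fraction of pairwise comparisons in which the LLM judge and the human pick the same outcome; the abstract does not spell out the exact definition, and the function name `agreement_rate` and the vote lists are hypothetical.

```python
from typing import Sequence


def agreement_rate(judge_votes: Sequence[str], human_votes: Sequence[str]) -> float:
    """Fraction of comparisons where judge and human chose the same outcome."""
    if len(judge_votes) != len(human_votes):
        raise ValueError("vote lists must have the same length")
    if not judge_votes:
        raise ValueError("at least one comparison is required")
    # Count positions where both voters picked the same label.
    matches = sum(j == h for j, h in zip(judge_votes, human_votes))
    return matches / len(judge_votes)


# Hypothetical votes on five pairwise battles ("A", "B", or "tie").
judge = ["A", "B", "A", "tie", "B"]
human = ["A", "B", "B", "tie", "B"]
rate = agreement_rate(judge, human)
print(f"agreement: {rate:.0%}")
```

With these made-up votes the two voters match on four of five battles.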

PaLM 2 Technical Report

  • Anil et al. 2023

  • ArXiv

  • 2023

We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM. This improved efficiency enables broader deployment while also allowing the model to respond faster, for a more natural pace of interaction. PaLM 2 demonstrates robust reasoning capabilities exemplified by large improvements over PaLM on BIG-Bench and other reasoning tasks. PaLM 2 exhibits stable performance on a suite of responsible AI evaluations, and enables inference-time control over toxicity without additional overhead or impact on other capabilities. Overall, PaLM 2 achieves state-of-the-art performance across a diverse set of tasks and capabilities. When discussing the PaLM 2 family, it is important to distinguish between pre-trained models (of various sizes), fine-tuned variants of these models, and the user-facing products that use these models. In particular, user-facing products typically include additional pre- and post-processing steps. Additionally, the underlying models may evolve over time. Therefore, one should not expect the performance of user-facing products to exactly match the results reported in this report.

TLDR

PaLM 2 is a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM and enables inference-time control over toxicity without additional overhead or impact on other capabilities.

Holistic Evaluation of Language Models

  • Liang et al. 2023

  • Annals of the New York Academy of Sciences

  • 2023

Language models (LMs) like GPT‐3, PaLM, and ChatGPT are the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of LMs. LMs can serve many purposes and their behavior should satisfy many desiderata. To navigate the vast space of potential scenarios and metrics, we taxonomize the space and select representative subsets. We evaluate models on 16 core scenarios and 7 metrics, exposing important trade‐offs. We supplement our core evaluation with seven targeted evaluations to deeply analyze specific aspects (including world knowledge, reasoning, regurgitation of copyrighted content, and generation of disinformation). We benchmark 30 LMs, from OpenAI, Microsoft, Google, Meta, Cohere, AI21 Labs, and others. Prior to HELM, models were evaluated on just 17.9% of the core HELM scenarios, with some prominent models not sharing a single scenario in common. We improve this to 96.0%: all 30 models are now benchmarked under the same standardized conditions. Our evaluation surfaces 25 top‐level findings. For full transparency, we release all raw model prompts and completions publicly. HELM is a living benchmark for the community, continuously updated with new scenarios, metrics, and models https://crfm.stanford.edu/helm/latest/.

TLDR

HELM is a living benchmark for the community, continuously updated with new scenarios, metrics, and models https://crfm.stanford.edu/helm/latest/.

Don't Make Your LLM an Evaluation Benchmark Cheater

  • Zhou et al. 2023

  • ArXiv

  • 2023

Large language models (LLMs) have greatly advanced the frontiers of artificial intelligence, attaining remarkable improvement in model capacity. To assess model performance, a typical approach is to construct evaluation benchmarks for measuring the ability level of LLMs in different aspects. Although a number of high-quality benchmarks have been released, concerns about the appropriate use of these benchmarks and the fair comparison of different models are growing. Considering these concerns, in this paper, we discuss the potential risk and impact of inappropriately using evaluation benchmarks and misleadingly interpreting the evaluation results. Specifically, we focus on a special issue that would lead to inappropriate evaluation, i.e., benchmark leakage, meaning that data related to evaluation sets is occasionally used for model training. This phenomenon is now becoming more common since pre-training data is often prepared ahead of model testing. We conduct extensive experiments to study the effect of benchmark leakage, and find that it can dramatically boost the evaluation results, which would ultimately lead to an unreliable assessment of model performance. To improve the use of existing evaluation benchmarks, we finally present several guidelines for both LLM developers and benchmark maintainers. We hope this work can draw attention to appropriate training and evaluation of LLMs.

TLDR

This paper discusses the potential risk and impact of inappropriately using evaluation benchmarks and misleadingly interpreting the evaluation results, and presents several guidelines for both LLM developers and benchmark maintainers to improve the use of existing evaluation benchmarks.

Medicine

Development and Validation of a New Substance Use Disorder Screening Test Based on the Diagnostic and Statistical Manual of Mental Disorders (DSM-5)

  • Saengduenchai et al. 2023

  • Journal of Health Science and Medical Research

  • 2023

Objective: The lack of a screening tool for substance use disorders is a significant problem for health care workers for patient care and referral. This study aimed to develop a Substance Use Disorder Screening Test (SUDST) to enable accurate classification of the severity of substance use disorders based on the Fifth Edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), that can be used by health professionals in all settings for basic screening for individuals at risk of substance use disorder. Material and Methods: Following close study of the DSM-5, 11 questions were developed, which were then tested on 207 participants who were receiving treatment for the use of methamphetamines. The participants were interviewed with the SUDST, the Ministry of Public Health Version 2 (normally ‘V.2’) screening test for risk of substance use, and were clinically diagnosed by their attending psychiatrist. Results: The Cronbach’s alpha coefficient for SUDST was 0.79. The scores obtained from the SUDST were in high agreement with the V.2 and clinical diagnosis (p-value<0.001). Factor analysis showed three components of substance use disorder: 1) preoccupation and loss of control, 2) risky/harmful use, and 3) biopsychosocial aspects. Of the total possible score of 11, the cut-off points for identifying severe, moderate, and mild levels of risk were ≥7, 5-6, and 3-4, respectively, with sensitivity=72.7%-96.5% and specificity=61.9%-88.7%. Conclusion: The SUDST had high reliability and validity and could be used to detect patients at risk of substance use disorders.

TLDR

The Substance Use Disorder Screening Test (SUDST) had high reliability and validity and could be used to detect patients at risk of substance use disorders.
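The cut-off points reported in the abstract can be written as a small lookup. This is a minimal sketch, not the authors' scoring tool: the function name `classify_sudst`, the assumption that scores are integers from 0 to 11, and the label used for scores below 3 are illustrative and not stated in the abstract.

```python
def classify_sudst(score: int) -> str:
    """Map a SUDST total score to the risk level given by the reported cut-offs."""
    # Assumption: valid totals are integers from 0 to 11 (the abstract only
    # states a total possible score of 11).
    if not 0 <= score <= 11:
        raise ValueError("score must be between 0 and 11")
    if score >= 7:
        return "severe"
    if score >= 5:
        return "moderate"
    if score >= 3:
        return "mild"
    # Assumption: the abstract does not name a category for scores below 3.
    return "below mild cut-off"


for s in (2, 3, 5, 7, 11):
    print(s, classify_sudst(s))
```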

ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns

  • Sallam et al. 2023

  • Healthcare

  • 2023

ChatGPT is an artificial intelligence (AI)-based conversational large language model (LLM). The potential applications of LLMs in health care education, research, and practice could be promising if the associated valid concerns are proactively examined and addressed. The current systematic review aimed to investigate the utility of ChatGPT in health care education, research, and practice and to highlight its potential limitations. Using the PRISMA guidelines, a systematic search was conducted to retrieve English records in PubMed/MEDLINE and Google Scholar (published research or preprints) that examined ChatGPT in the context of health care education, research, or practice. A total of 60 records were eligible for inclusion. Benefits of ChatGPT were cited in 51/60 (85.0%) records and included: (1) improved scientific writing and enhancing research equity and versatility; (2) utility in health care research (efficient analysis of datasets, code generation, literature reviews, saving time to focus on experimental design, and drug discovery and development); (3) benefits in health care practice (streamlining the workflow, cost saving, documentation, personalized medicine, and improved health literacy); and (4) benefits in health care education including improved personalized learning and the focus on critical thinking and problem-based learning. Concerns regarding ChatGPT use were stated in 58/60 (96.7%) records including ethical, copyright, transparency, and legal issues, the risk of bias, plagiarism, lack of originality, inaccurate content with risk of hallucination, limited knowledge, incorrect citations, cybersecurity issues, and risk of infodemics. The promising applications of ChatGPT can induce paradigm shifts in health care education, research, and practice. However, the embrace of this AI chatbot should be conducted with extreme caution considering its potential limitations. As it currently stands, ChatGPT does not qualify to be listed as an author in scientific articles unless the ICMJE/COPE guidelines are revised or amended. An initiative involving all stakeholders in health care education, research, and practice is urgently needed. This will help to set a code of ethics to guide the responsible use of ChatGPT among other LLMs in health care and academia.

TLDR

The promising applications of ChatGPT can induce paradigm shifts in health care education, research, and practice; however, the embrace of this AI chatbot should be conducted with extreme caution considering its potential limitations.

How Does ChatGPT Perform on the United States Medical Licensing Examination? The Implications of Large Language Models for Medical Education and Knowledge Assessment

  • Gilson et al. 2023

  • JMIR Medical Education

  • 2023

Background: Chat Generative Pre-trained Transformer (ChatGPT) is a 175-billion-parameter natural language processing model that can generate conversation-style responses to user input. Objective: This study aimed to evaluate the performance of ChatGPT on questions within the scope of the United States Medical Licensing Examination Step 1 and Step 2 exams, as well as to analyze responses for user interpretability. Methods: We used 2 sets of multiple-choice questions to evaluate ChatGPT’s performance, each with questions pertaining to Step 1 and Step 2. The first set was derived from AMBOSS, a commonly used question bank for medical students, which also provides statistics on question difficulty and the performance on an exam relative to the user base. The second set was the National Board of Medical Examiners (NBME) free 120 questions. ChatGPT’s performance was compared to 2 other large language models, GPT-3 and InstructGPT. The text output of each ChatGPT response was evaluated across 3 qualitative metrics: logical justification of the answer selected, presence of information internal to the question, and presence of information external to the question. Results: Of the 4 data sets, AMBOSS-Step1, AMBOSS-Step2, NBME-Free-Step1, and NBME-Free-Step2, ChatGPT achieved accuracies of 44% (44/100), 42% (42/100), 64.4% (56/87), and 57.8% (59/102), respectively. ChatGPT outperformed InstructGPT by 8.15% on average across all data sets, and GPT-3 performed similarly to random chance. The model demonstrated a significant decrease in performance as question difficulty increased (P=.01) within the AMBOSS-Step1 data set. We found that logical justification for ChatGPT’s answer selection was present in 100% of outputs of the NBME data sets. Information internal to the question was present in 96.8% (183/189) of all questions. The presence of information external to the question was 44.5% and 27% lower for incorrect answers relative to correct answers on the NBME-Free-Step1 (P<.001) and NBME-Free-Step2 (P=.001) data sets, respectively. Conclusions: ChatGPT marks a significant improvement in natural language processing models on the tasks of medical question answering. By performing at a greater than 60% threshold on the NBME-Free-Step-1 data set, we show that the model achieves the equivalent of a passing score for a third-year medical student. Additionally, we highlight ChatGPT’s capacity to provide logic and informational context across the majority of answers. These facts taken together make a compelling case for the potential applications of ChatGPT as an interactive medical education tool to support learning.

TLDR

ChatGPT marks a significant improvement in natural language processing models on the tasks of medical question answering, and the study highlights its capacity to provide logic and informational context across the majority of answers.
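The accuracy percentages in the abstract follow directly from the reported correct/total counts. The short check below is a sketch for readers who want to reproduce that arithmetic; the dictionary name `results` and the variable names are illustrative, and only the counts themselves come from the abstract.

```python
# (correct, total) counts as reported in the abstract for each data set.
results = {
    "AMBOSS-Step1": (44, 100),
    "AMBOSS-Step2": (42, 100),
    "NBME-Free-Step1": (56, 87),
    "NBME-Free-Step2": (59, 102),
}

# Accuracy in percent, rounded to one decimal place as in the abstract.
accuracy = {name: round(100 * correct / total, 1)
            for name, (correct, total) in results.items()}

for name, value in accuracy.items():
    print(f"{name}: {value}%")

# The two NBME sets together contain 87 + 102 = 189 questions, which matches
# the denominator of the reported 183/189 figure.
nbme_total = results["NBME-Free-Step1"][1] + results["NBME-Free-Step2"][1]
internal_info_pct = round(100 * 183 / nbme_total, 1)
print(f"internal information present: {internal_info_pct}%")
```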

Pregnancy-responsive pools of adult neural stem cells for transient neurogenesis in mothers

  • Chaker et al. 2023

  • Science

  • 2023

Adult neural stem cells (NSCs) contribute to lifelong brain plasticity. In the adult mouse ventricular-subventricular zone, NSCs are heterogeneous and, depending on their location in the niche, give rise to different subtypes of olfactory bulb (OB) interneurons. Here, we show that multiple regionally distinct NSCs, including domains that are usually quiescent, are recruited on different gestation days during pregnancy. Synchronized activation of these adult NSC pools generates transient waves of short-lived OB interneurons, especially in layers with less neurogenesis under homeostasis. Using spatial transcriptomics, we identified molecular markers of pregnancy-associated interneurons and showed that some subsets are temporarily needed for own pup recognition. Thus, pregnancy triggers transient yet behaviorally relevant neurogenesis, highlighting the physiological relevance of adult stem cell heterogeneity. Editor’s summary: Different subtypes of new neurons are born in the adult brain ventricular-subventricular zone from spatially distinct pools of neural stem cells (NSCs). However, the physiological relevance of NSC diversity and specificity is unclear. Chaker et al. have revealed that during mouse pregnancy, multiple NSC pools are activated in mothers and generate specific olfactory bulb interneurons that function around birth to modulate aspects of maternal care, including own-pup recognition, and then disappear as pups mature (see the Perspective by Kempermann). These results highlight how adult NSC heterogeneity might provide a substrate for adaptive brain plasticity in response to different physiological states. —Mattia Maroso. Dynamic response of adult neural stem cells during pregnancy prepares the maternal olfactory bulb for motherhood.

TLDR

It is shown that multiple regionally distinct NSCs, including domains that are usually quiescent, are recruited on different gestation days during pregnancy, highlighting the physiological relevance of adult stem cell heterogeneity.

Large-scale pancreatic cancer detection via non-contrast CT and deep learning.

  • Cao et al. 2023

  • Nature Medicine

  • 2023

Pancreatic ductal adenocarcinoma (PDAC), the most deadly solid malignancy, is typically detected late and at an inoperable stage. Early or incidental detection is associated with prolonged survival, but screening asymptomatic individuals for PDAC using a single test remains unfeasible due to the low prevalence and potential harms of false positives. Non-contrast computed tomography (CT), routinely performed for clinical indications, offers the potential for large-scale screening; however, identification of PDAC using non-contrast CT has long been considered impossible. Here, we develop a deep learning approach, pancreatic cancer detection with artificial intelligence (PANDA), that can detect and classify pancreatic lesions with high accuracy via non-contrast CT. PANDA is trained on a dataset of 3,208 patients from a single center. PANDA achieves an area under the receiver operating characteristic curve (AUC) of 0.986–0.996 for lesion detection in a multicenter validation involving 6,239 patients across 10 centers, outperforms the mean radiologist performance by 34.1% in sensitivity and 6.3% in specificity for PDAC identification, and achieves a sensitivity of 92.9% and specificity of 99.9% for lesion detection in a real-world multi-scenario validation consisting of 20,530 consecutive patients. Notably, PANDA utilized with non-contrast CT shows non-inferiority to radiology reports (using contrast-enhanced CT) in the differentiation of common pancreatic lesion subtypes. PANDA could potentially serve as a new tool for large-scale pancreatic cancer screening.

TLDR

A deep learning approach, pancreatic cancer detection with artificial intelligence (PANDA), is developed that can detect and classify pancreatic lesions with high accuracy via non-contrast CT and that shows non-inferiority to radiology reports (using contrast-enhanced CT) in the differentiation of common pancreatic lesion subtypes.

Mitochondrial fission drives neuronal metabolic burden to promote stress susceptibility in male mice.

  • Dong et al. 2023

  • Nature Metabolism

  • 2023

Neurons are particularly susceptible to energy fluctuations in response to stress. Mitochondrial fission is highly regulated to generate ATP via oxidative phosphorylation; however, the role of a regulator of mitochondrial fission in neuronal energy metabolism and synaptic efficacy under chronic stress remains elusive. Here, we show that chronic stress promotes mitochondrial fission in the medial prefrontal cortex via activating dynamin-related protein 1 (Drp1), resulting in mitochondrial dysfunction in male mice. Both pharmacological inhibition and genetic reduction of Drp1 ameliorate the deficit of excitatory synaptic transmission and stress-related depressive-like behavior. In addition, enhancing Drp1 fission promotes stress susceptibility, which is alleviated by coenzyme Q10, which potentiates mitochondrial ATP production. Together, our findings unmask the role of Drp1-dependent mitochondrial fission in the deficits of neuronal metabolic burden and depressive-like behavior and provide a medication basis for metabolism-related emotional disorders.

TLDR

It is shown that chronic stress promotes mitochondrial fission in the medial prefrontal cortex via activating dynamin-related protein 1 (Drp1), resulting in mitochondrial dysfunction in male mice, and that pharmacological inhibition and genetic reduction of Drp1 ameliorate the deficit of excitatory synaptic transmission and stress-related depressive-like behavior.

The activator protein-1 complex governs a vascular degenerative transcriptional programme in smooth muscle cells to trigger aortic dissection and rupture.

  • Luo et al. 2023

  • European Heart Journal

  • 2023

BACKGROUND AND AIMS: Stanford type A aortic dissection (AD) is a degenerative aortic remodelling disease marked by an exceedingly high mortality without effective pharmacologic therapies. Smooth muscle cells (SMCs) lining the tunica media adopt a range of states, and their transformation from contractile to synthetic phenotypes fundamentally triggers AD. However, the underlying pathomechanisms governing this population shift and subsequent AD, particularly at distinct disease temporal stages, remain elusive. METHODS: Ascending aortas from nine patients undergoing ascending aorta replacement and five individuals undergoing heart transplantation were subjected to single-cell RNA sequencing. The pathogenic targets governing the phenotypic switch of SMCs were identified by trajectory inference, functional scoring, single-cell regulatory network inference and clustering, regulon, and interactome analyses and confirmed using human ascending aortas, primary SMCs, and a β-aminopropionitrile monofumarate-induced AD model. RESULTS: The transcriptional profiles of 93 397 cells revealed a dynamic temporal-specific phenotypic transition and marked elevation of the activator protein-1 (AP-1) complex, actively enabling synthetic SMC expansion. Mechanistically, tumour necrosis factor signalling enhanced AP-1 transcriptional activity by dampening mitochondrial oxidative phosphorylation (OXPHOS). Targeting this axis with the OXPHOS enhancer coenzyme Q10 or AP-1-specific inhibitor T-5224 impedes phenotypic transition and aortic degeneration while improving survival by 42.88% (58.3%-83.3% for coenzyme Q10 treatment), 150.15% (33.3%-83.3% for 2-week T-5224), and 175.38% (33.3%-91.7% for 3-week T-5224) in the β-aminopropionitrile monofumarate-induced AD model. CONCLUSIONS: This cross-sectional compendium of cellular atlas of human ascending aortas during AD progression provides previously unappreciated insights into a transcriptional programme permitting aortic degeneration, highlighting a translational proof of concept for an anti-remodelling intervention as an attractive strategy to manage temporal-specific AD by modulating the tumour necrosis factor-OXPHOS-AP-1 axis.

TLDR

This cross-sectional compendium of cellular atlas of human ascending aortas during AD progression provides previously unappreciated insights into a transcriptional programme permitting aortic degeneration, highlighting a translational proof of concept for an anti-remodelling intervention as an attractive strategy to manage temporal-specific AD by modulating the tumour necrosis factor-OXPHOS-AP-1 axis.
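The survival improvements quoted in the abstract (42.88%, 150.15%, 175.38%) are relative changes between the paired survival rates given in parentheses, not percentage-point differences. The sketch below reproduces that arithmetic; the function name `relative_improvement` and the dictionary labels are illustrative, and only the survival rates come from the abstract.

```python
def relative_improvement(baseline: float, treated: float) -> float:
    """Relative change from baseline to treated, in percent."""
    return 100 * (treated - baseline) / baseline


# (baseline survival %, treated survival %) as reported in the abstract.
survival = {
    "coenzyme Q10": (58.3, 83.3),
    "T-5224, 2 weeks": (33.3, 83.3),
    "T-5224, 3 weeks": (33.3, 91.7),
}

improvement = {name: round(relative_improvement(base, treated), 2)
               for name, (base, treated) in survival.items()}

for name, value in improvement.items():
    print(f"{name}: +{value}% relative survival")
```

For comparison, the absolute gain for coenzyme Q10 is 25 percentage points (58.3% to 83.3%), which corresponds to the 42.88% relative figure.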

Large language models encode clinical knowledge

  • Singhal et al. 2022

  • Nature

  • 2022

Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, HealthSearchQA. We propose a human evaluation framework for model answers along multiple axes including factuality, comprehension, reasoning, possible harm and bias. In addition, we evaluate Pathways Language Model [1] (PaLM, a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM [2], on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA [3], MedMCQA [4], PubMedQA [5] and Measuring Massive Multitask Language Understanding (MMLU) clinical topics [6]), including 67.6% accuracy on MedQA (US Medical Licensing Exam-style questions), surpassing the prior state of the art by more than 17%. However, human evaluation reveals key gaps. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, knowledge recall and reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal limitations of today’s models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.
Med-PaLM, a state-of-the-art large language model for medicine, is introduced and evaluated across several medical question answering tasks, demonstrating the promise of these models in this domain.

TLDR

MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, is presented, and Med-PaLM, a state-of-the-art large language model for medicine, is introduced and evaluated, demonstrating the promise of these models in this domain.

ViLaM: A Vision-Language Model with Enhanced Visual Grounding and Generalization Capability

  • Yang et al. 2023

  • ArXiv

  • 2023

Vision-language models have revolutionized human-computer interaction and shown significant progress in multi-modal tasks. However, applying these models to complex visual tasks like medical image analysis remains challenging. In this study, we propose ViLaM, a unified Vision-Language transformer model that integrates instruction tuning predicated on a large language model. This approach enables us to optimally utilize the knowledge and reasoning capacities of large pre-trained language models for an array of tasks encompassing both language and vision. We employ frozen pre-trained encoders to encode and align both image and text features, enabling ViLaM to handle a variety of visual tasks following textual instructions. Besides, we've designed cycle training for referring expressions to address the need for high-quality, paired referring expression datasets for training large models in terms of both quantity and quality. We evaluated ViLaM's exceptional performance on public general datasets and further confirmed its generalizability on medical datasets. Importantly, we've observed the model's impressive zero-shot learning ability, indicating the potential future application of ViLaM in the medical field.

TLDR

ViLaM, a unified Vision-Language transformer model that integrates instruction tuning predicated on a large language model, is proposed; the approach draws on the knowledge and reasoning capacities of large pre-trained language models for an array of tasks encompassing both language and vision.

Indigenous knowledge mobilization: reflection on context, content, and relationship

  • Hutchinson et al. 2023

  • AlterNative: An International Journal of Indigenous Peoples

  • 2023

First Nations, Inuit, and Métis, Indigenous peoples in Canada, have long experienced racism within health services resulting in a health service system that many Indigenous people in Canada do not want to access. Research informing Indigenous health services must consider how findings and analysis happen within the community, what information is shared, and how it improves access to health services. Information shared in Indigenous research methods was communicated at the end and throughout the project. Indigenous knowledge mobilization in Indigenous research methods requires researchers to receive knowledge from the community and research participants. Also, knowledge sharing and moving into practice happen continuously throughout the research process. These qualities of Indigenous knowledge mobilization facilitate increasing accessibility to health services through Indigenous knowledge identified in research. This article describes an Indigenous knowledge mobilization framework that may be adapted within Indigenous communities looking to make transparent how Indigenous knowledge is incorporated within health services.

TLDR

An Indigenous knowledge mobilization framework that may be adapted within Indigenous communities looking to make transparent how Indigenous knowledge is incorporated within health services is described.

Blood Pressure Effects of SGLT2 Inhibitors: Mechanisms and Clinical Evidence in Different Populations

  • Beal et al. 2023

  • Current Hypertension Reports

  • 2023

Sodium glucose transporter 2 inhibitors (SGLT2 inhibitors) are increasingly prescribed due to their considerable benefits on clinical outcomes in people with diabetes, heart failure, and chronic kidney disease (CKD). Hypertension is a common comorbidity in each of these disease states, increasing risk of cardiovascular morbidity and mortality. We herein review the effects of SGLT2 inhibitors on blood pressure in different populations, proposed mechanisms of action, and the contribution of blood pressure lowering to end-organ protection. A recognised effect of SGLT2 inhibitors in recent clinical trials is blood pressure lowering, with multiple postulated mechanisms. This advantageous effect was first identified in populations with type 2 diabetes mellitus, prior to expansion of these trials to broader cohorts. In our review, we found that the blood pressure lowering effect of SGLT2 inhibitors appears to be a dose-independent class effect, with a magnitude comparable to that seen with low-dose hydrochlorothiazide. There is considerable evidence demonstrating that this effect is observed across populations including those with type 2 diabetes mellitus, chronic kidney disease, and resistant hypertension.

TLDR

The blood pressure lowering effect of SGLT2 inhibitors appears to be a dose-independent class effect, with a magnitude comparable to that seen with low-dose hydrochlorothiazide.

Treatment of Fournier's gangrene with negative pressure wound therapy in the course of sepsis — Case report

  • Bes et al. 2023

  • International journal of surgery case reports

  • 2023

Introduction Fournier's Gangrene is a severe and rapidly progressing necrotic infection of the skin and fascia that can affect the external genitals, perineum, anus, and abdomen. It can extend to the abdominal cavity and result in necrosis of the soft tissue with a high mortality rate. This case gives a unique perspective on managing such a complicated infection in a smaller community hospital. Presentation of case This report describes a particularly challenging case of Fournier's Gangrene in a 34 year old male with multiple pre-existing comorbidities, including alcohol use disorder, chronic kidney disease, and hepatitis B. Development of gangrene was preceded by sepsis. The patient's treatment was based on intravenous antibiotic therapy and early surgical intervention with extensive resection of necrotic tissue, supported by Hyperbaric Oxygen Therapy (HBOT) and Negative Pressure Wound Therapy (NPWT). Discussion The majority of the patient's treatment was done at a local community hospital with remote coordination with the Hyperbaric Medicine Center where the patient was temporarily transferred to for HBOT. Multiple treatment modalities were employed in this case of Fournier's gangrene, including intravenous antibiotic therapy, necrosectomy, chronic wound care with septic dressings and tissue debridement, HBOT and NPWT. Interdisciplinary cooperation between different medical specialists was crucial in treatment. Conclusion The presented case shows that despite the large scale of difficulty and the complexity of treatment, it is possible to effectively manage Fournier's Gangrene in a local community hospital through interdisciplinary cooperation with specialized quaternary care centers. HBOT and NPWT proved to be useful treatment modalities.

TLDR

The presented case shows that despite the large scale of difficulty and the complexity of treatment, it is possible to effectively manage Fournier's Gangrene in a local community hospital through interdisciplinary cooperation with specialized quaternary care centers.

Forks in the road for CAR T and CAR NK cell cancer therapies

  • Dagher et al. 2023

  • Nature Immunology

  • 2023

The advent of chimeric antigen receptor (CAR) T cell therapy has resulted in unprecedented long-term clearance of relapse/refractory hematological malignancies in both pediatric and adult patients. However, severe toxicities, such as cytokine release syndrome and neurotoxicity, associated with CAR T cells affect therapeutic utility; and treatment efficacies for solid tumors are still not impressive. As a result, engineering strategies that modify other immune cell types, especially natural killer (NK) cells have arisen. Owing to both CAR-dependent and CAR-independent (innate immune-mediated) antitumor killing capacity, major histocompatibility complex-independent cytotoxicity, reduced risk of alloreactivity and lack of major CAR T cell toxicities, CAR NK cells constitute one of the promising next-generation CAR immune cells that are also amenable as ‘off-the-shelf’ therapeutics. In this Review, we compare CAR T and CAR NK cell therapies, with particular focus on immunological synapses, engineering strategies and challenges. Here the authors review CAR T cell engineering and immunotherapy for cancer and juxtapose state-of-the-art developments with CAR NK cells as part of our Cancer Immunology series of Reviews.

TLDR

This Review compares CAR T and CAR NK cell therapies, with particular focus on immunological synapses, engineering strategies and challenges, and juxtaposes state-of-the-art developments in CAR T cell engineering with those in CAR NK cells.

Hematologic abnormalities after COVID-19 vaccination: A large Korean population-based cohort study

  • Choi et al. 2023

  • medRxiv

  • 2023

Adverse hematologic events have been reported after COVID-19 vaccination. The objective of this study was to investigate whether hematologic abnormalities develop after COVID-19 vaccination. Retrospective cohort analyses of data from the Korean National Health Insurance Service (KNHIS) database were conducted from July 2022 to August 2023. We randomly selected data of half of those living in Seoul City as of January 1, 2021 with their diagnostic records up to December 31, 2021. The included participants were vaccinated and nonvaccinated persons aged 20 years or older (n = 4,203,887). Hematologic abnormalities after COVID-19 vaccination were identified as nutritional anemia, hemolytic anemia, aplastic anemia, coagulation defects, and neutropenia using International Classification of Diseases, Tenth Revision codes after index date. Incidence rates of hematologic abnormalities in the vaccination group 3 months after vaccination were significantly higher than those in the nonvaccinated group: 14.79 vs. 9.59 (P<.001) for nutritional anemia, 7.83 vs. 5.00 (P<.001) for aplastic anemia, and 4.85 vs. 1.85 (P<.001) for coagulation defects. COVID-19 mRNA vaccine was associated with higher development of nutritional anemia (odds ratio [OR], 1.230 [95% CI, 1.129-1.339], P<.001) and aplastic anemia (OR, 1.242 [95% CI, 1.110-1.390], P<.001) than the viral vector vaccine. The risk of coagulation defects was increased (OR, 1.986 [95% CI, 1.523-2.589], P<.001) after vaccination, and there was no risk difference between mRNA vaccine and viral vector vaccine (OR, 1.075 [95% CI, 0.936-1.233], P=.306). In conclusion, COVID-19 vaccination increased the risk of hematologic abnormalities. When administering the COVID-19 vaccine, careful observation will be necessary after vaccination.

TLDR

COVID-19 vaccination increased the risk of hematologic abnormalities, and careful observation will be necessary after administering the COVID-19 vaccine.

SAMv2: A Unified Framework for Learning Appearance, Semantic and Cross-Modality Anatomical Embeddings

  • Bai et al. 2023

  • ArXiv

  • 2023

Identifying anatomical structures (e.g., lesions or landmarks) in medical images plays a fundamental role in medical image analysis. As an exemplar-based landmark detection method, Self-supervised Anatomical eMbedding (SAM) learns a discriminative embedding for each voxel in the image and has shown promising results on various tasks. However, SAM still faces challenges in: (1) differentiating voxels with similar appearance but different semantic meanings (e.g., two adjacent structures without clear borders); (2) matching voxels with similar semantics but markedly different appearance (e.g., the same vessel before and after contrast injection); and (3) cross-modality matching (e.g., CT-MRI registration). To overcome these challenges, we propose SAMv2, which is a unified framework designed to learn appearance, semantic, and cross-modality anatomical embeddings. Specifically, SAMv2 incorporates three key innovations: (1) semantic embedding learning with prototypical contrastive loss; (2) a fixed-point-based matching strategy; and (3) an iterative approach for cross-modality embedding learning. We thoroughly evaluated SAMv2 across three tasks, including one-shot landmark detection, lesion tracking on longitudinal CT scans, and CT-MRI affine/rigid registration with varying field of view. Our results suggest that SAMv2 outperforms SAM and other state-of-the-art methods, offering a robust and versatile approach for landmark-based medical image analysis tasks. Code and trained models are available at: https://github.com/alibaba-damo-academy/self-supervised-anatomical-embedding-v2

Biomedical knowledge graph-enhanced prompt generation for large language models

  • Soman et al. 2023

  • ArXiv

  • 2023

Large Language Models (LLMs) have been driving progress in AI at an unprecedented rate, yet still face challenges in knowledge-intensive domains like biomedicine. Solutions such as pre-training and domain-specific fine-tuning add substantial computational overhead, and the latter require domain-expertise. External knowledge infusion is task-specific and requires model training. Here, we introduce a task-agnostic Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework by leveraging the massive biomedical KG SPOKE with LLMs such as Llama-2-13b, GPT-3.5-Turbo and GPT-4, to generate meaningful biomedical text rooted in established knowledge. KG-RAG consistently enhanced the performance of LLMs across various prompt types, including one-hop and two-hop prompts, drug repurposing queries, biomedical true/false questions, and multiple-choice questions (MCQ). Notably, KG-RAG provides a remarkable 71% boost in the performance of the Llama-2 model on the challenging MCQ dataset, demonstrating the framework's capacity to empower open-source models with fewer parameters for domain-specific questions. Furthermore, KG-RAG enhanced the performance of proprietary GPT models, such as GPT-3.5 which exhibited improvement over GPT-4 in context utilization on MCQ data. Our approach was also able to address drug repurposing questions, returning meaningful repurposing suggestions. In summary, the proposed framework combines explicit and implicit knowledge of KG and LLM, respectively, in an optimized fashion, thus enhancing the adaptability of general-purpose LLMs to tackle domain-specific questions in a unified framework.

TLDR

A task-agnostic Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework is introduced that combines the explicit knowledge of a KG with the implicit knowledge of an LLM in an optimized fashion, enhancing the adaptability of general-purpose LLMs to domain-specific questions in a unified framework.

Transcutaneous Bilirubin Accuracy Before, During, and After Phototherapy: A Meta-Analysis.

  • Ten Kate et al. 2023

  • Pediatrics

  • 2023

CONTEXT Transcutaneous bilirubinometry (TcB) is used as a valid screening to identify neonates requiring measurement of total serum bilirubin (TSB) before phototherapy. Its use during and after phototherapy is not advised yet because of unknown reliability. OBJECTIVES To determine the agreement of TcB and TSB measurements before, during, and after phototherapy. DATA SOURCES PubMed Medline, Cochrane Library, and references of eligible studies were searched. STUDY SELECTION Prospective and retrospective cohort and cross-sectional studies reporting Bland-Altman statistics of paired TcB and TSB measurements in term and preterm newborns. DATA EXTRACTION Meta-analysis was performed using the Mantel-Haenszel weighted approach. The agreement between TcB and TSB in μmol/L was described by pooled mean differences (MDs) and limits of agreement (LoA). RESULTS Fifty-four studies were included. The pooled MD before phototherapy is 2.5 μmol/L (LoA -38.3 to 43.3). The pooled MD during phototherapy is -0.3 μmol/L (LoA -34.8 to 34.2) on covered skin and -28.6 μmol/L (LoA -105.7 to 48.5) on uncovered skin. The pooled MD after phototherapy is -34.3 μmol/L (LoA -86.7 to 18.1) on covered skin and -21.1 μmol/L (LoA -88.6 to 46.4) on uncovered skin. Subgroup analysis revealed the best agreement at the forehead. We did not find any difference in agreement between term and preterm neonates. LIMITATIONS Language restriction. CONCLUSIONS TcB measurements before and during phototherapy on covered skin show good agreement compared with TSB in term and preterm newborns. More studies are needed to evaluate the accuracy after phototherapy.

TLDR

TcB measurements before and during phototherapy on covered skin show good agreement compared with TSB in term and preterm newborns, and more studies are needed to evaluate the accuracy after phototherapy.
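
The agreement statistics quoted in this abstract follow the Bland-Altman convention, in which the limits of agreement are the mean difference plus or minus 1.96 standard deviations of the paired differences. The sketch below is a minimal illustration of that arithmetic, not the authors' pooling procedure. The sign convention (TcB minus TSB), the function names, and the sample readings are assumptions made for illustration, and the SD recovered from pooled limits is only a rough reading, because pooled limits may also include between-study variation.

```python
from statistics import mean, stdev


def bland_altman(tcb, tsb, z=1.96):
    """Return (mean difference, lower LoA, upper LoA) for paired measurements.

    Differences are taken as TcB - TSB (an assumed sign convention).
    """
    if len(tcb) != len(tsb) or len(tcb) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(tcb, tsb)]
    md = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return md, md - z * sd, md + z * sd


def sd_from_loa(lower, upper, z=1.96):
    """Recover the SD of differences implied by reported limits of agreement."""
    return (upper - lower) / (2 * z)


# Hypothetical paired readings in umol/L (illustrative values only).
tcb = [150.0, 180.0, 210.0, 240.0, 200.0]
tsb = [145.0, 190.0, 205.0, 230.0, 210.0]
md, lower, upper = bland_altman(tcb, tsb)
print(f"MD = {md:.1f} umol/L, LoA = {lower:.1f} to {upper:.1f}")

# SD implied by the pooled pre-phototherapy limits quoted in the abstract.
print(f"implied SD = {sd_from_loa(-38.3, 43.3):.1f} umol/L")
```

Read this way, wider limits of agreement (for example, on uncovered skin during phototherapy) correspond to a larger spread of individual TcB and TSB disagreements, even when the mean difference is small.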

Performance of Multimodal GPT-4V on USMLE with Image: Potential for Imaging Diagnostic Support with Explanations

  • Zhichao Yang et al. 2023

  • medRxiv

  • 2023

Importance Using artificial intelligence (AI) to help clinical diagnoses has been an active research topic for more than six decades. Few studies, however, have had the scale and accuracy needed to translate into clinical practice. The tide may be turning today with the power of large language models (LLMs). In this application, we evaluated accuracy on medical licensing exams using the newly released Generative Pre-trained Transformer 4 with vision (GPT-4V), a large multimodal model trained to analyze image inputs with the text instructions from the user. This study is the first to evaluate GPTs for interpreting medical images. Objective This study aimed to evaluate the performance of GPT-4V on medical licensing examination questions with images, as well as to analyze interpretability. Design, Setting, and Participants We used 3 sets of multiple-choice questions with images to evaluate GPT-4V performance. The first set was the United States Medical Licensing Examination (USMLE) from the National Board of Medical Examiners (NBME) sample questions in Step 1, Step 2 CK, and Step 3. The second set was derived from AMBOSS, a commonly used question bank for medical students, which also provides statistics on question difficulty and the performance on an exam relative to the user base. The third set was the Diagnostic Radiology Qualifying Core Exam (DRQCE) from the American Board of Radiology. The study (including data analysis) was conducted from September to October 2023. Main Outcomes and Measures The choice accuracy of GPT-4V was compared to two other large language models, GPT-4 and ChatGPT. The GPT-4V explanation was evaluated across 4 qualitative metrics: image misunderstanding, text hallucination, reasoning error, and non-medical error. Results Of the 3 exams with images, NBME, AMBOSS, and DRQCE, GPT-4V achieved accuracies of 86.2%, 62.0%, and 73.1%, respectively. GPT-4V outperformed ChatGPT and GPT-4 by 131.8% and 64.5% on average across various data sets. The model demonstrated a decreasing trend in performance as question difficulty increased in the AMBOSS dataset. GPT-4V achieved an accuracy of 90.7% on the full USMLE exam, exceeding the passing threshold of about 60% accuracy. Among the incorrect answers, 75.9% of responses included misinterpretation of the image. However, 39.0% of them could be easily solved with a short hint. Conclusion In this cross-sectional study, GPT-4V achieved high accuracy on the USMLE, in the 70th to 80th percentile of AMBOSS users preparing for the exam. The results suggest the potential of GPT-4V for clinical decision support. However, the explanations generated by GPT-4V revealed several issues. Explanation quality needs to improve for potential use in clinical decision support.

TLDR

This cross-sectional study evaluated the accuracy of the newly released Generative Pre-trained Transformer 4 with vision (GPT-4V), a large multimodal model trained to analyze image inputs with text instructions from the user, on medical licensing exam questions with images; the results suggest the potential of GPT-4V for clinical decision support.

Lactate regulates major zygotic genome activation by H3K18 lactylation in mammals

  • Li et al. 2023

  • National Science Review

  • 2023

Lactate is present at a high level in the microenvironment of mammalian preimplantation embryos in vivo and in vitro. However, its role in preimplantation development is unclear. Here, we report that lactate is highly enriched in the nuclei of early embryos when major zygotic genome activation (ZGA) occurs in humans and mice. The inhibition of its production and uptake results in developmental arrest at the 2-cell stage, major ZGA failure, and loss of lactate-derived H3K18lac, which could be rescued by addition of Lac-CoA and recapitulated by overexpression of H3K18R mutation. By profiling the landscape of H3K18lac during mouse preimplantation development, we show that H3K18lac is enriched on the promoter regions of most major ZGA genes and correlates with their expressions. In humans, H3K18lac is also enriched in ZGA markers and temporally concomitant with their expressions. Taken together, we profile the landscapes of H3K18lac in mouse and human preimplantation embryos, and demonstrate the important role for H3K18lac in major ZGA, showing a conserved metabolic mechanism underlies preimplantation development of mammalian embryos.

TLDR

It is reported that lactate is highly enriched in the nuclei of early embryos when major zygotic genome activation occurs in humans and mice, and the important role for H3K18lac in major ZGA is demonstrated, showing a conserved metabolic mechanism underlies preimplantation development of mammalian embryos.

Physics

Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting

  • Yang et al. 2023

  • ArXiv

  • 2023

Reconstructing dynamic 3D scenes from 2D images and generating diverse views over time is challenging due to scene complexity and temporal dynamics. Despite advancements in neural implicit models, limitations persist: (i) Inadequate Scene Structure: Existing methods struggle to reveal the spatial and temporal structure of dynamic scenes from directly learning the complex 6D plenoptic function. (ii) Scaling Deformation Modeling: Explicitly modeling scene element deformation becomes impractical for complex dynamics. To address these issues, we consider the spacetime as an entirety and propose to approximate the underlying spatio-temporal 4D volume of a dynamic scene by optimizing a collection of 4D primitives, with explicit geometry and appearance modeling. Learning to optimize the 4D primitives enables us to synthesize novel views at any desired time with our tailored rendering routine. Our model is conceptually simple, consisting of a 4D Gaussian parameterized by anisotropic ellipses that can rotate arbitrarily in space and time, as well as view-dependent and time-evolved appearance represented by the coefficient of 4D spherindrical harmonics. This approach offers simplicity, flexibility for variable-length video and end-to-end training, and efficient real-time rendering, making it suitable for capturing complex dynamic scene motions. Experiments across various benchmarks, including monocular and multi-view scenarios, demonstrate our 4DGS model's superior visual quality and efficiency.

TLDR

This work considers the spacetime as an entirety and proposes to approximate the underlying spatio-temporal 4D volume of a dynamic scene by optimizing a collection of 4D primitives, with explicit geometry and appearance modeling, making it suitable for capturing complex dynamic scene motions.
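
The abstract does not spell out how a 4D primitive is evaluated at a single moment. One standard way to read off a 3D Gaussian at a chosen time from a 4D Gaussian over (x, y, z, t) is to condition on t, which is sketched below. This is a textbook Gaussian identity offered as an illustration, not necessarily the authors' rendering routine; the function name and the example numbers are hypothetical.

```python
import math


def condition_on_time(mu, cov, t):
    """Slice a 4D Gaussian over (x, y, z, t) at time t.

    mu  : list of 4 floats, mean ordered as (x, y, z, t)
    cov : 4x4 nested list, symmetric positive-definite covariance
    Returns (mu_3d, cov_3d, weight), where weight is the marginal density
    of t and can be used to fade the primitive in and out over time.
    """
    var_t = cov[3][3]
    cross = [cov[i][3] for i in range(3)]  # space-time cross-covariance
    dt = t - mu[3]
    # Conditional mean: the spatial centre drifts linearly with time.
    mu_3d = [mu[i] + cross[i] * dt / var_t for i in range(3)]
    # Conditional covariance: independent of the chosen time.
    cov_3d = [[cov[i][j] - cross[i] * cross[j] / var_t for j in range(3)]
              for i in range(3)]
    weight = math.exp(-0.5 * dt * dt / var_t) / math.sqrt(2 * math.pi * var_t)
    return mu_3d, cov_3d, weight


# Hypothetical primitive whose x position is correlated with time.
mu = [0.0, 0.0, 0.0, 1.0]
cov = [
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.5, 0.0, 0.0, 1.0],
]
mu_3d, cov_3d, weight = condition_on_time(mu, cov, t=2.0)
print(mu_3d, cov_3d[0][0], weight)
```

Under this reading, the space-time cross-covariance acts like a velocity: a nonzero entry moves the sliced 3D centre as t changes, which is how a single static 4D primitive can describe motion.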

Screening strategy for developing thermoelectric interface materials

  • Xie et al. 2023

  • Science

  • 2023

Thermoelectric interface materials (TEiMs) are essential to the development of thermoelectric generators. Common TEiMs use pure metals or binary alloys but have performance stability issues. Conventional selection of TEiMs generally relies on trial-and-error experimentation. We developed a TEiM screening strategy that is based on phase diagram predictions by density functional theory calculations. By combining the phase diagram with electrical resistivity and melting points of potential reaction products, we discovered that the semimetal MgCuSb is a reliable TEiM for high-performance MgAgSb. The MgCuSb/MgAgSb junction exhibits low interfacial contact resistivity (ρc <1 microhm square centimeter) even after annealing at 553 kelvin for 16 days. The fabricated two-pair MgAgSb/Mg3.2Bi1.5Sb0.5 module demonstrated a high conversion efficiency of 9.25% under a 300 kelvin temperature gradient. We performed an international round-robin testing of module performance to confirm the measurement reliability. The strategy can be applied to other thermoelectric materials, filling a vital gap in the development of thermoelectric modules.

Editor’s summary: Thermoelectric modules convert waste heat into electricity, but finding materials that go in between the thermoelectric material and the electrodes is challenging because inappropriate interface materials can drive failure of the thermoelectric module. Xie et al. developed a screening strategy for isolating more chemically complex interface candidate materials (see the Perspective by Xu and Tian). Using this strategy, the authors identified a magnesium–copper–antimony semimetal that is an excellent interface material for a specific type of high-performance thermoelectric module. This approach should apply to a wide range of material chemistries. (Brent Grocholski)

A strategy to find thermoelectric interface materials shows promise for developing high-performance thermoelectric modules.

Simultaneous Discovery of Quantum Error Correction Codes and Encoders with a Noise-Aware Reinforcement Learning Agent

  • Olle et al. 2023

  • 2023

Finding optimal ways to protect quantum states from noise remains an outstanding challenge across all quantum technologies, and quantum error correction (QEC) is the most promising strategy to address this issue. Constructing QEC codes is a complex task that has historically been powered by human creativity with the discovery of a large zoo of families of codes. However, in the context of real-world scenarios there are two challenges: these codes have typically been categorized only for their performance under an idealized noise model and the implementation-specific optimal encoding circuit is not known. In this work, we train a Deep Reinforcement Learning agent that automatically discovers both QEC codes and their encoding circuits for a given gate set, qubit connectivity, and error model. We introduce the concept of a noise-aware meta-agent, which learns to produce encoding strategies simultaneously for a range of noise models, thus leveraging transfer of insights between different situations. Moreover, thanks to the use of the stabilizer formalism and a vectorized Clifford simulator, our RL implementation is extremely efficient, allowing us to produce many codes and their encoders from scratch within seconds, with code distances varying from 3 to 5 and with up to 20 physical qubits. Our approach opens the door towards hardware-adapted accelerated discovery of QEC approaches across the full spectrum of quantum hardware platforms of interest.

TLDR

This work trains a Deep Reinforcement Learning agent that automatically discovers both QEC codes and their encoding circuits for a given gate set, qubit connectivity, and error model, and introduces the concept of a noise-aware meta-agent, which learns to produce encoding strategies simultaneously for a range of noise models.
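
The stabilizer formalism mentioned in the abstract describes a code by a set of Pauli operators that must all commute with one another. As a small illustration of that one constraint, the sketch below checks it for the well-known five-qubit code, which has distance 3. The code is used here only as a familiar example and is not taken from the paper; the helper names are hypothetical, and no claim is made about how the authors' simulator represents stabilizers.

```python
def paulis_commute(p, q):
    """Return True if two Pauli strings such as 'XZZXI' commute.

    Two Pauli strings commute exactly when the number of positions where
    both are non-identity and different is even.
    """
    if len(p) != len(q):
        raise ValueError("Pauli strings must act on the same number of qubits")
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def is_valid_stabilizer_set(generators):
    """Check that every pair of generators commutes."""
    return all(
        paulis_commute(g, h)
        for i, g in enumerate(generators)
        for h in generators[i + 1:]
    )


# Stabilizer generators of the five-qubit code (cyclic shifts of XZZXI).
five_qubit = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]
print(is_valid_stabilizer_set(five_qubit))
```

A search procedure of any kind, learned or hand-written, has to respect this pairwise condition, because a candidate generator that anticommutes with an existing one does not define a valid stabilizer code.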

Machine-Guided Discovery of a Real-World Rogue Wave Model

  • Häfner et al. 2023

  • Proceedings of the National Academy of Sciences of the United States of America

  • 2023

Big data and large-scale machine learning have had a profound impact on science and engineering, particularly in fields focused on forecasting and prediction. Yet, it is still not clear how we can use the superior pattern-matching abilities of machine learning models for scientific discovery. This is because the goals of machine learning and science are generally not aligned. In addition to being accurate, scientific theories must also be causally consistent with the underlying physical process and allow for human analysis, reasoning, and manipulation to advance the field. In this paper, we present a case study on discovering a symbolic model for oceanic rogue waves from data using causal analysis, deep learning, parsimony-guided model selection, and symbolic regression. We train an artificial neural network on causal features from an extensive dataset of observations from wave buoys, while selecting for predictive performance and causal invariance. We apply symbolic regression to distill this black-box model into a mathematical equation that retains the neural network's predictive capabilities, while allowing for interpretation in the context of existing wave theory. The resulting model reproduces known behavior, generates well-calibrated probabilities, and achieves better predictive scores on unseen data than current theory. This showcases how machine learning can facilitate inductive scientific discovery and paves the way for more accurate rogue wave forecasting.

TLDR

A case study on discovering a symbolic model for oceanic rogue waves from data using causal analysis, deep learning, parsimony-guided model selection, and symbolic regression is presented, showcasing how machine learning can facilitate inductive scientific discovery and paving the way for more accurate rogue wave forecasting.

Exactly Solvable Floquet Dynamics for Conformal Field Theories in Dimensions Greater than Two

  • Das et al. 2023

  • 2023

We find classes of driven conformal field theories (CFT) in d+1 dimensions with d>1, whose quench and Floquet dynamics can be computed exactly. The setup is suitable for studying periodic drives, consisting of square pulse protocols for which Hamiltonian evolution takes place with different deformations of the original CFT Hamiltonian in successive time intervals. These deformations are realized by specific combinations of conformal generators with a deformation parameter $\beta$; the $\beta<1$ ($\beta>1$) Hamiltonians can be unitarily related to the standard (Lüscher-Mack) CFT Hamiltonians. The resulting time evolution can be then calculated by performing appropriate conformal transformations. For d ≤ 3 we show that the transformations can be easily obtained in a quaternion formalism; we use this formalism to obtain exact expressions for the fidelity, unequal-time correlator, and the energy density for the driven system for d = 3. Our results for a single square pulse drive cycle reveal qualitatively different behaviors depending on the value of $\beta$, with exponential decays characteristic of heating for $\beta>1$, oscillations for $\beta<1$ and power law decays for $\beta = 1$. When the Hamiltonians in one cycle involve generators of a single SL(2, R) subalgebra we find fixed points or fixed surfaces of the corresponding transformations. Successive cycles lead to either convergence to one of the fixed points, or oscillations, depending on the conjugacy class. This indicates that the system can be in different dynamical phases as we vary the parameters of the drive protocol. We also point out that our results are expected to hold for a broader class of QFTs that possesses an SL(2,C) symmetry with fields that transform as quasi-primaries under this. As an example, we briefly comment on celestial CFTs in this context.

Physics-informed neural networks for transformed geometries and manifolds

  • Burbulla et al. 2023

  • ArXiv

  • 2023

Physics-informed neural networks (PINNs) effectively embed physical principles into machine learning, but often struggle with complex or alternating geometries. We propose a novel method for integrating geometric transformations within PINNs to robustly accommodate geometric variations. Our method incorporates a diffeomorphism as a mapping of a reference domain and adapts the derivative computation of the physics-informed loss function. This generalizes the applicability of PINNs not only to smoothly deformed domains, but also to lower-dimensional manifolds and allows for direct shape optimization while training the network. We demonstrate the effectivity of our approach on several problems: (i) Eikonal equation on Archimedean spiral, (ii) Poisson problem on surface manifold, (iii) Incompressible Stokes flow in deformed tube, and (iv) Shape optimization with Laplace operator. Through these examples, we demonstrate the enhanced flexibility over traditional PINNs, especially under geometric variations. The proposed framework presents an outlook for training deep neural operators over parametrized geometries, paving the way for advanced modeling with PDEs on complex geometries in science and engineering.

TLDR

A novel method for integrating geometric transformations within PINNs to robustly accommodate geometric variations is proposed, which generalizes the applicability of PINNs not only to smoothly deformed domains, but also to lower-dimensional manifolds and allows for direct shape optimization while training the network.
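The adapted derivative computation mentioned in the abstract can be illustrated with the ordinary chain rule. This is a generic sketch and not necessarily the authors' formulation; the symbols $\phi$, $\xi$, $\hat{u}$, $J_\phi$ and $\Omega_{\mathrm{ref}}$ are introduced here purely for illustration.

```latex
% Assumed setup: a diffeomorphism \phi maps a reference domain to the physical domain,
%   x = \phi(\xi), \qquad \hat{u}(\xi) = u(\phi(\xi)), \qquad \xi \in \Omega_{\mathrm{ref}} .
% With the Jacobian (J_\phi)_{ij} = \partial \phi_i / \partial \xi_j, the chain rule gives
\[
  \frac{\partial \hat{u}}{\partial \xi_j}
    = \sum_i \frac{\partial u}{\partial x_i}\,\frac{\partial \phi_i}{\partial \xi_j}
  \quad\Longleftrightarrow\quad
  \nabla_\xi \hat{u} = J_\phi^{\top}\, \nabla_x u ,
\]
% so physical-space gradients can be recovered from reference-space gradients:
\[
  \nabla_x u = J_\phi^{-\top}\, \nabla_\xi \hat{u} .
\]
```

Under this assumption a network can be evaluated and differentiated on the fixed reference domain, while the residual of the PDE is still expressed in physical coordinates.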

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

  • Rein et al. 2023

  • ArXiv

  • 2023

We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry. We ensure that the questions are high-quality and extremely difficult: experts who have or are pursuing PhDs in the corresponding domains reach 65% accuracy (74% when discounting clear mistakes the experts identified in retrospect), while highly skilled non-expert validators only reach 34% accuracy, despite spending on average over 30 minutes with unrestricted access to the web (i.e., the questions are "Google-proof"). The questions are also difficult for state-of-the-art AI systems, with our strongest GPT-4 based baseline achieving 39% accuracy. If we are to use future AI systems to help us answer very hard questions, for example, when developing new scientific knowledge, we need to develop scalable oversight methods that enable humans to supervise their outputs, which may be difficult even if the supervisors are themselves skilled and knowledgeable. The difficulty of GPQA both for skilled non-experts and frontier AI systems should enable realistic scalable oversight experiments, which we hope can help devise ways for human experts to reliably get truthful information from AI systems that surpass human capabilities.

TLDR

GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry, is presented, which it is hoped can help devise ways for human experts to reliably get truthful information from AI systems that surpass human capabilities.

DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models

  • Wang et al. 2023

  • ArXiv

  • 2023

Nature evolves creatures with a high complexity of morphological and behavioral intelligence, meanwhile computational methods lag in approaching that diversity and efficacy. Co-optimization of artificial creatures' morphology and control in silico shows promise for applications in physical soft robotics and virtual character creation; such approaches, however, require developing new learning algorithms that can reason about function atop pure structure. In this paper, we present DiffuseBot, a physics-augmented diffusion model that generates soft robot morphologies capable of excelling in a wide spectrum of tasks. DiffuseBot bridges the gap between virtually generated content and physical utility by (i) augmenting the diffusion process with a physical dynamical simulation which provides a certificate of performance, and (ii) introducing a co-design procedure that jointly optimizes physical design and control by leveraging information about physical sensitivities from differentiable simulation. We showcase a range of simulated and fabricated robots along with their capabilities. Check our website at https://diffusebot.github.io/

TLDR

DiffuseBot is presented, a physics-augmented diffusion model that generates soft robot morphologies capable of excelling in a wide spectrum of tasks and introduces a co-design procedure that jointly optimizes physical design and control by leveraging information about physical sensitivities from differentiable simulation.

Local Convolution Enhanced Global Fourier Neural Operator For Multiscale Dynamic Spaces Prediction

  • Zhao et al. 2023

  • ArXiv

  • 2023

Neural operators extend the capabilities of traditional neural networks by allowing them to handle mappings between function spaces for the purpose of solving partial differential equations (PDEs). One of the most notable methods is the Fourier Neural Operator (FNO), which is inspired by the Green's function method and approximates the operator kernel directly in the frequency domain. In this work, we focus on predicting multiscale dynamic spaces, which is equivalent to solving multiscale PDEs. Multiscale PDEs are characterized by rapid coefficient changes and solution space oscillations, which are crucial for modeling atmospheric convection and ocean circulation. To solve this problem, models should have the ability to capture rapid changes and process them at various scales. However, the FNO only approximates kernels in the low-frequency domain, which is insufficient when solving multiscale PDEs. To address this challenge, we propose a novel hierarchical neural operator that integrates improved Fourier layers with attention mechanisms, aiming to capture all details and handle them at various scales. These mechanisms complement each other in the frequency domain and encourage the model to solve multiscale problems. We perform experiments on dynamic spaces governed by forward and reverse problems of multiscale elliptic equations, Navier-Stokes equations and some other physical scenarios, and reach superior performance in existing PDE benchmarks, especially equations characterized by rapid coefficient variations.

TLDR

This work proposes a novel hierarchical neural operator that integrates improved Fourier layers with attention mechanisms, aiming to capture all details and handle them at various scales, and reaches superior performance in existing PDE benchmarks, especially equations characterized by rapid coefficient variations.
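The remark that the FNO only approximates kernels in the low-frequency domain can be made concrete with a toy one-dimensional spectral convolution. The sketch below is a simplified illustration and not the paper's model: the function names are hypothetical, the weights are fixed rather than learned, and a naive DFT stands in for a fast transform.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def spectral_conv_1d(signal, weights):
    """Scale the lowest len(weights) Fourier modes and discard all higher ones.

    Assumes a real-valued signal and len(weights) <= len(signal) // 2 + 1.
    """
    n = len(signal)
    spectrum = dft(signal)
    out = [0j] * n
    for k, w in enumerate(weights):
        out[k] = spectrum[k] * w
        if 0 < k < n - k:
            # Mirror onto the negative frequency so the result stays real.
            out[n - k] = spectrum[n - k] * w.conjugate()
    return [value.real for value in idft(out)]


n = 16
low = [math.cos(2 * math.pi * 1 * t / n) for t in range(n)]   # slow component (mode 1)
high = [math.cos(2 * math.pi * 6 * t / n) for t in range(n)]  # fast component (mode 6)
mixed = [a + b for a, b in zip(low, high)]

# Keeping only modes 0..2 with unit weights removes the fast component.
filtered = spectral_conv_1d(mixed, [1.0, 1.0, 1.0])
```

In this toy setting the fast oscillation is lost entirely after truncation, which is the limitation the abstract attributes to low-frequency kernels on multiscale problems.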

Psychology

Exploring the Influence of Social Media Influencers’ (SMIs) Traits on Consumer Purchasing Behavior for Online Products on the TikTok Platform: The Mediating Effect of Trustworthiness

  • Abdullah et al. 2023

  • International Journal of Academic Research in Business and Social Sciences

  • 2023

In today’s digital age and with the surge of social media usage, influencer marketing has emerged as a popular tool for businesses to harness the power of social media in order to interact with their target market in a more credible and engaging manner. In using this tool, it is vital for businesses to carefully plan and select the right social media influencers for their brands. Finding the right social media influencer requires an understanding of the influencer's characteristics or traits. Thus, this study aims to investigate how social media influencers’ traits, specifically credibility, authenticity, and expertise, can influence consumer purchasing behavior for online products, and how this influence is mediated by trustworthiness. The study is quantitative: simple random sampling will be used to select participants from among social media users who buy products on the TikTok platform. A self-administered online survey questionnaire with a five-point Likert scale will be used for data collection. Partial Least Squares Structural Equation Modelling (PLS-SEM) will be used to analyze the effects of the SMI traits (credibility, authenticity, and expertise) on consumer purchase behavior. This paper explores the traits or characteristics of social media influencers as well as the mediating role of trustworthiness in shaping consumer buying behaviour. The implications for marketers include a better understanding of how consumers engage with social media influencers. Results from this study may suggest a theoretical framework for measuring consumer buying behavior based on social media influencer traits.

TLDR

This study aims to investigate how social media influencers’ traits, specifically credibility, authenticity, and expertise, can influence consumer purchasing behavior for online products, and how this influence is mediated by trustworthiness.

Improving Interpersonal Communication by Simulating Audiences with Language Models

  • Liu et al. 2023

  • ArXiv

  • 2023

How do we communicate with others to achieve our goals? We use our prior experience or advice from others, or construct a candidate utterance by predicting how it will be received. However, our experiences are limited and biased, and reasoning about potential outcomes can be difficult and cognitively challenging. In this paper, we explore how we can leverage Large Language Model (LLM) simulations to help us communicate better. We propose the Explore-Generate-Simulate (EGS) framework, which takes as input any scenario where an individual is communicating to an audience with a goal they want to achieve. EGS (1) explores the solution space by producing a diverse set of advice relevant to the scenario, (2) generates communication candidates conditioned on subsets of the advice, and (3) simulates the reactions from various audiences to determine both the best candidate and advice to use. We evaluate the framework on eight scenarios spanning the ten fundamental processes of interpersonal communication. For each scenario, we collect a dataset of human evaluations across candidates and baselines, and showcase that our framework's chosen candidate is preferred over popular generation mechanisms including Chain-of-Thought. We also find that audience simulations achieve reasonably high agreement with human raters across 5 of the 8 scenarios. Finally, we demonstrate the generality of our framework by applying it to real-world scenarios described by users on web forums. Through evaluations and demonstrations, we show that EGS enhances the effectiveness and outcomes of goal-oriented communication across a variety of situations, thus opening up new possibilities for the application of large language models in revolutionizing communication and decision-making processes.

TLDR

This paper proposes the Explore-Generate-Simulate (EGS) framework, a framework that enhances the effectiveness and outcomes of goal-oriented communication across a variety of situations, thus opening up new possibilities for the application of large language models in revolutionizing communication and decision-making processes.

Using large language models in psychology

  • Demszky et al. 2023

  • Nature Reviews Psychology

  • 2023

Large language models (LLMs), such as OpenAI’s GPT-4, Google’s Bard or Meta’s LLaMa, have created unprecedented opportunities for analysing and generating language data on a massive scale. Because language data have a central role in all areas of psychology, this new technology has the potential to transform the field. In this Perspective, we review the foundations of LLMs. We then explain how the way that LLMs are constructed enables them to effectively generate human-like linguistic output without the ability to think or feel like a human. We argue that although LLMs have the potential to advance psychological measurement, experimentation and practice, they are not yet ready for many of the most transformative psychological applications — but further research and development may enable such use. Next, we examine four major concerns about the application of LLMs to psychology, and how each might be overcome. Finally, we conclude with recommendations for investments that could help to address these concerns: field-initiated ‘keystone’ datasets; increased standardization of performance benchmarks; and shared computing and analysis infrastructure to ensure that the future of LLM-powered research is equitable. Large language models (LLMs), which can generate and score text in human-like ways, have the potential to advance psychological measurement, experimentation and practice. In this Perspective, Demszky and colleagues describe how LLMs work, concerns about using them for psychological purposes, and how these concerns might be addressed.

Pavlovian Fear Conditioning Is More than You Think It Is

  • McDannald et al. 2023

  • The Journal of Neuroscience

  • 2023

A common neuroscience application of Pavlovian fear conditioning is to manipulate neuron-type activity, pair a cue with foot shock, then measure cue-elicited freezing in a novel context. If the manipulation reduces freezing, the neuron type is implicated in Pavlovian fear conditioning. This application reduces Pavlovian fear conditioning to a single concept. In this Viewpoint, I describe experiments supporting the view that Pavlovian fear conditioning refers to three distinct concepts: procedure, process, and behavior. An experimenter controls procedure, observes behavior, but infers process. Distinguishing these concepts is essential because: (1) a shock-paired cue can engage numerous processes and behaviors; (2) experimenter decisions about procedure influence the processes engaged and behaviors elicited; and (3) many processes are latent, imbuing the cue with properties that only manifest outside of the original conditioning setting. This means we could understand the complete neural basis of freezing, yet know little about the neural basis of fear. Neuroscientists can choose to use a variety of procedures to study a diversity of processes and behaviors. Manipulating neuron-type activity in multiple procedures can reveal specific, general, or complex neuron-type contributions to cue-elicited processes and behaviors. The results will be a broader and more detailed neural basis of fear with greater relevance to the spectrum of symptoms defining anxiety and stressor-related disorders.

Association of executive function with suicidality based on resting-state functional connectivity in young adults with subthreshold depression

  • Yun et al. 2023

  • Scientific reports

  • 2023

Subthreshold depression (StD) is associated with an increased risk of developing major depressive disorder (MDD) and suicidality. Suicidality could be linked to distress intolerance and use of context-dependent strategies. We identified neural correlates of executive functioning among the hubs in the resting-state functional connectome (rs-FCN) and examined associations with recent suicidality in StD and MDD. In total, 79 young adults [27 StD, 30 MDD, and 23 healthy controls (HC)] were scanned using magnetic resonance imaging. Neurocognitive measures of the mean latency to correct five moves in the One Touch Stockings of Cambridge (OTSMLC5), spatial working memory between errors (SWMBE), rapid visual information processing A′ (RVPA′), and the stop signal reaction time in the stop signal test (SSTSSRT) were obtained. Global graph metrics were calculated to measure the network integration, segregation, and their balance in the rs-FCN. Regional graph metrics reflecting the number of neighbors (degree centrality; DC), participation in the shortcuts (betweenness centrality; BC), and accessibility to intersections (eigenvector centrality; EC) in the rs-FCN defined group-level hubs for StD, HC, and MDD, separately. Global network metrics were comparable among the groups (all P > 0.05). Among the group-level hubs, regional graph metrics of left dorsal anterior insula (dAI), right dorsomedial prefrontal cortex (dmPFC), right rostral temporal thalamus, right precuneus, and left postcentral/middle temporal/anterior subgenual cingulate cortices were different among the groups. Further, significant associations with neurocognitive measures were found in the right dmPFC with SWMBE, and left dAI with SSTSSRT and RVPA′. Shorter OTSMLC5 was related to lower centralities of the right thalamus and to recent 1-year suicidal ideation (all Ps < 0.05 in ≥ 2 centralities out of DC, BC, and EC). Collectively, salience and thalamic networks underlie spatial strategy and planning, response inhibition, and suicidality in StD and MDD. Anti-suicidal therapies targeting executive function and modulation of salience-thalamic network in StD and MDD are required.

A systematic literature review on the relationship between servant leadership and its team and organizational level outcomes

  • Lu et al. 2023

  • Journal of Organizational Change Management

  • 2023

Purpose: This study aimed to develop an in-depth understanding of the outcomes of servant leadership at the team and organizational levels. It reviews the relationship between servant leadership and its team- and organizational-level outcomes, and examines the mediation and moderation effect of the relationship. It further identifies the mechanism by which servant leadership is beneficial to the organization. Design/methodology/approach: A systematic literature review is conducted, focused on 52 articles published between 2012 and 2022. Content analysis and descriptive analysis were used to respond to the research questions. Findings: A new conceptual model was developed to better understand the outcomes, mediators and moderators of servant leadership at team and organization level. Research limitations/implications: Future research should further explore outcomes of servant leadership at team and organizational levels and test how mediators affect the relationship between servant leadership and associated outcomes. Practical implications: This study provides a framework for leaders on how servant leadership contributes to teams and organizations, and how a leader applies servant leadership. Originality/value: This systematic review presents a new model that builds on existing research into servant leadership and its impact on team and organizational levels completed in the past decade. To date, there have been no reviews of servant leadership that focus only on outcomes at the team and organizational levels using a widely recognized database.

The Effect of the Transformational Leadership on Proactive behavior and Change Oriented-Organization Citizen Behavior: The Mediating Role of Intrinsic motivation, Identified motivation, External motivation and Amotivation

  • Chah et al. 2023

  • Korean Academy of Organization and Management

  • 2023

Transformational leadership is about suggesting a vision for change, motivating organization members, and considering each member individually. This study examined the effects of transformational leadership on four types of motivation, namely intrinsic motivation, identified motivation, external motivation, and amotivation, and the effects of these four types of motivation on proactive behavior and change-oriented organizational citizenship behavior. To verify the research model, we collected data from 278 employees working in Korean companies and conducted regression analysis and bootstrapping. The empirical analysis confirms that all hypotheses of this study are supported. The theoretical implication of this study is that transformational leadership promotes employees’ proactive behavior and change-oriented organizational citizenship behavior through four types of motivation. That is, this study reveals the mechanism by which transformational leadership affects these employee behaviors. In addition, this study has practical implications: leaders need to demonstrate transformational leadership in order to elicit positive actions such as proactive behavior and change-oriented organizational citizenship behavior from their employees, and transformational leadership can motivate employees so that they actively participate in organizational change.

The shared and unique neural correlates of personal semantic, general semantic, and episodic memory

  • Tanguay et al. 2023

  • eLife

  • 2023

One of the most common distinctions in long-term memory is that between semantic (i.e., general world knowledge) and episodic (i.e., recollection of contextually specific events from one’s past). However, emerging cognitive neuroscience data suggest a surprisingly large overlap between the neural correlates of semantic and episodic memory. Moreover, personal semantic memories (i.e., knowledge about the self and one’s life) have been studied little and do not easily fit into the standard semantic-episodic dichotomy. Here, we used fMRI to record brain activity while 48 participants verified statements concerning general facts, autobiographical facts, repeated events, and unique events. In multivariate analysis, all four types of memory involved activity within a common network bilaterally (e.g., frontal pole, paracingulate gyrus, medial frontal cortex, middle/superior temporal gyrus, precuneus, posterior cingulate, angular gyrus) and some areas of the medial temporal lobe. Yet the four memory types differentially engaged this network, increasing in activity from general to autobiographical facts, from autobiographical facts to repeated events, and from repeated to unique events. Our data are compatible with a component process model, in which declarative memory types rely on different weightings of the same elementary processes, such as perceptual imagery, spatial features, and self-reflection.

EXPRESS: Are latent working memory items retrieved from long-term memory?

  • Chao et al. 2023

  • Quarterly journal of experimental psychology

  • 2023

Switching one's focus of attention between to-be-remembered information in working memory (WM) is critical for cognition, but the mechanisms by which this is accomplished are unclear. Some models suggest that passively retaining "latent" information outside of focal attention and returning it to the focus involves episodic long-term memory (LTM) retrieval processes even for delays of only a few seconds. We tested this hypothesis by examining performance on both a two-item, double-retrocue WM task (that oriented participants' attention to the item that would be tested first and second on each trial) and subsequent LTM tests for the items from the initial WM task. We compared performance on these tests between older adults (a population with LTM deficits) and young adults with either full (Experiment 1) or divided (Experiment 2) attention during the WM delay periods. Retrocueing, aging, and divided attention all had significant effects on WM performance, but did not interact with or systematically affect subsequent LTM performance for item, location, or associative memory judgments made with either high or low confidence. These dissociations between WM and LTM suggest that LTM retrieval processes are not involved in retaining and reactivating an item outside of focal attention on this two-item, double-retrocue WM paradigm, which has shown neuroimaging, neurostimulation, and neurocomputational modeling evidence for latent WM; rather, the results are consistent with the Dynamic Processing Model of WM (Rose, 2020, Current Directions in Psychological Science).

The effect of ten versus twenty minutes of mindfulness meditation on state mindfulness and affect

  • Palmer et al. 2023

  • Scientific reports

  • 2023

We aimed to elucidate the effects of “dose” of a single-session of mindfulness meditation on state mindfulness and affect as well as moderators of effects. 372 adults recruited remotely via Amazon’s MTurk platform were randomly assigned to either a: 10-min mindfulness meditation, 20-min mindfulness meditation, 10-min control, or 20-min control. Control conditions were recordings of a National Geographic article. Primary outcomes were changes in state mindfulness, anxiety, and negative and positive affect. Moderator variables included neuroticism, trait mindfulness, and prior meditation experience. Collapsing across doses, participants in mindfulness conditions reported greater increases in state mindfulness than in control conditions. There was a greater increase in state mindfulness in the 10-min mindfulness condition versus 10-min control condition. There were no differences between 10- and 20-min mindfulness conditions. Exploratory moderation analyses indicated that meditation (10 or 20) versus control (10 or 20) predicted increased state mindfulness among participants with lower trait mindfulness. Additionally, 20-min versus 10-min meditation predicted greater decreases in state anxiety among individuals with high trait mindfulness. Dose–response relationships were minimal, suggesting that 10 and 20 min of meditation may improve state mindfulness comparably. Findings support the benefits of brief mindfulness meditation and suggest that trait mindfulness moderates certain outcomes.

Have we built machines that think like people?

  • Buschoff et al. 2023

  • ArXiv

  • 2023

A chief goal of artificial intelligence is to build machines that think like people. Yet it has been argued that deep neural network architectures fail to accomplish this. Researchers have asserted these models' limitations in the domains of causal reasoning, intuitive physics, and intuitive psychology. Yet recent advancements, namely the rise of large language models, particularly those designed for visual processing, have rekindled interest in the potential to emulate human-like cognitive abilities. This paper evaluates the current state of vision-based large language models in the domains of intuitive physics, causal reasoning, and intuitive psychology. Through a series of controlled experiments, we investigate the extent to which these modern models grasp complex physical interactions, causal relationships, and intuitive understanding of others' preferences. Our findings reveal that, while these models demonstrate a notable proficiency in processing and interpreting visual data, they still fall short of human capabilities in these areas. The models exhibit a rudimentary understanding of physical laws and causal relationships, but their performance is hindered by a lack of deeper insights, a key aspect of human cognition. Furthermore, in tasks requiring an intuitive theory of mind, the models fail altogether. Our results emphasize the need for integrating more robust mechanisms for understanding causality, physical dynamics, and social cognition into modern-day, vision-based language models, and point out the importance of cognitively-inspired benchmarks.

TLDR

The results emphasize the need for integrating more robust mechanisms for understanding causality, physical dynamics, and social cognition into modern-day, vision-based language models, and point out the importance of cognitively-inspired benchmarks.

Dissociating language and thought in large language models: a cognitive perspective

  • Mahowald et al. 2023

  • ArXiv

  • 2023

Today's large language models (LLMs) routinely generate coherent, grammatical and seemingly meaningful paragraphs of text. This achievement has led to speculation that these networks are -- or will soon become -- "thinking machines", capable of performing tasks that require abstract knowledge and reasoning. Here, we review the capabilities of LLMs by considering their performance on two different aspects of language use: 'formal linguistic competence', which includes knowledge of rules and patterns of a given language, and 'functional linguistic competence', a host of cognitive abilities required for language understanding and use in the real world. Drawing on evidence from cognitive neuroscience, we show that formal competence in humans relies on specialized language processing mechanisms, whereas functional competence recruits multiple extralinguistic capacities that comprise human thought, such as formal reasoning, world knowledge, situation modeling, and social cognition. In line with this distinction, LLMs show impressive (although imperfect) performance on tasks requiring formal linguistic competence, but fail on many tests requiring functional competence. Based on this evidence, we argue that (1) contemporary LLMs should be taken seriously as models of formal linguistic skills; (2) models that master real-life language use would need to incorporate or develop not only a core language module, but also multiple non-language-specific cognitive capacities required for modeling thought. Overall, a distinction between formal and functional linguistic competence helps clarify the discourse surrounding LLMs' potential and provides a path toward building models that understand and use language in human-like ways.

TLDR

It is argued that contemporary LLMs should be taken seriously as models of formal linguistic skills and models that master real-life language use would need to incorporate or develop not only a core language module, but also multiple non-language-specific cognitive capacities required for modeling thought.

Cross-lingual Prompting: Improving Zero-shot Chain-of-Thought Reasoning across Languages

  • Qin et al. 2023

  • ArXiv

  • 2023

Chain-of-thought (CoT) is capable of eliciting models to explicitly generate reasoning paths, thus promoting reasoning accuracy and attracting increasing attention. Specifically, zero-shot CoT achieves remarkable improvements in a wide range of reasoning tasks by simply instructing the LLM with the prompt "Let's think step by step!". Despite the success of zero-shot CoT, the existing zero-shot prompting techniques remain limited to a single language, making it challenging to generalize to other languages and hindering global development. In this work, we introduce cross-lingual prompting (CLP), aiming to improve zero-shot CoT reasoning across languages. Specifically, CLP consists of two main components: (1) cross-lingual alignment prompting and (2) task-specific solver prompting. The cross-lingual alignment prompting is responsible for aligning representations across different languages, whereas the task-specific solver prompting is used to generate the final chain of thoughts and results for the reasoning task. In addition, we further introduce cross-lingual self-consistent prompting (CLSP) to ensemble different reasoning paths across languages. Our experimental evaluations on several benchmarks demonstrate that CLP and CLSP significantly outperform the existing prompting methods and achieve state-of-the-art performance. We hope this work will inspire further breakthroughs in cross-lingual CoT.

TLDR

Cross-lingual prompting (CLP) is introduced, aiming to improve zero-shot CoT reasoning across languages, along with cross-lingual self-consistent prompting (CLSP) to ensemble different reasoning paths across languages.
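Ensembling reasoning paths across languages can be pictured as a vote over the final answers produced in each language. The sketch below assumes simple majority voting, which is how self-consistency is commonly implemented; the abstract does not state the aggregation rule, and the function name and sample answers are hypothetical.

```python
from collections import Counter


def cross_lingual_vote(answers_by_language):
    """Return the most frequent final answer and its vote count.

    answers_by_language maps a language code to the final answer extracted
    from the reasoning path generated in that language (illustrative data only).
    """
    counts = Counter(answers_by_language.values())
    answer, votes = counts.most_common(1)[0]
    return answer, votes


# Made-up final answers from three reasoning paths, one per language.
sample = {"en": "42", "de": "42", "zh": "41"}
best_answer, vote_count = cross_lingual_vote(sample)
```

A real pipeline would first need to extract a comparable final answer from each language's output; that step is outside this sketch.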

Latest News & Updates

Case Study: Iterative Design for Skimming Support

How might we help researchers quickly assess the relevance of scientific literature? Take a closer look at Skimming, Semantic Reader’s latest AI feature, and the collaborative design process behind it.

Behind the Scenes of Semantic Scholar’s New Author Influence Design

We released a new version of the Author Influence interface to help scholars better discover other scholars in their fields. Here's how we identified user insights and made those design choices.

Artificial-intelligence search engines wrangle academic literature

Nature had a chat with Dan Weld, Chief Scientist at Semantic Scholar, to discuss how search engines are helping scientists explore and innovate by making it easier to draw connections from a massive collection of scientific literature.
