Rogue Scholar

Large-language-modelsArtificial-intelligencePrompt-engineeringInformatique et sciences de l'informationAnglais

Prompt engineering: A Way to Smartly Use AI

Publié 14 mai 2024

Author Dhruv Gupta ( ORCID : 0009–0004–7109–5403) Introduction Large Language Models (LLMs) have become the new face of Natural language processing (NLP). With their generative power and ability to comprehend human language, the human reliance on these models is increasing every day. However, the LLMs have been known to hallucinate and thus produce wrong outputs.

Artificial IntelligenceTocInformatique et sciences de l'informationAnglais

How to use Large Language Models to tag your data: A complete tutorial

https://doi.org/10.59350/z1z3k-rrm02

Publié 12 mai 2024

Auteur Xuzeng He

Introduction Data tagging, in simple terms, is the process of assigning labels or tags to your data so that they are easier to retrieve or analyse. For example, when you are dealing with a database consisting of scientific journals, you may want to tag these documents with their relevant topics so that users can later easily find the journal they are interested in using a filter button without too much effort.

Knowledge GraphTocInformatique et sciences de l'informationAnglais

Automated Knowledge Graph Construction with Large Language Models — Part 2

https://doi.org/10.59350/4c2mx-vm853

Publié 12 mai 2024

Auteur Amanda Kau

Introduction Knowledge graphs (KGs) are a structured representation of data in a graphical format, in which entities are represented by nodes and are connected by edges representing relationships between them. They have been employed across numerous domains, like retail, healthcare, and search engines. However, one critical factor limiting the usage of KGs is the difficult and costly

MegalodonLong-textsTransformer-architectureInformatique et sciences de l'informationAnglais

The longer the context, the better? Unlimited Context Length in Megalodon

https://doi.org/10.59350/dx6a6-yy475

Publié 7 mai 2024

Auteur Qingqin Fang

An improvement architecture superior to the Transformer, proposed by Meta Author · Qingqin Fang ( ORCID: 0009–0003–5348–4264) Introduction Recently, researchers from Meta and the University of Southern California have introduced a model called Megalodon. They claim that this model can expand the context window of language models to handle millions of tokens without overwhelming your memory.

Large-language-modelsArtificial-intelligenceTransformersNatural-language-processInformatique et sciences de l'informationAnglais

Brief Introduction to the History of Large Language Models (LLMs)

https://doi.org/10.59350/m4c7t-epg97

Publié 7 mai 2024

Auteur Wenyi Pi

Understanding the Evolutionary Journey of LLMs Author Wenyi Pi ( ORCID : 0009–0002–2884–2771) Introduction When we talk about large language models (LLMs), we are actually referring to a type of advanced software that can communicate in a human-like manner. These models have the amazing ability to understand complex contexts and generate content that is coherent and has a human feel.

Artificial IntelligenceTocAnglais

Brief Introduction to the History of Large Language Models (LLMs)

Publié 7 mai 2024

Auteur Wenyi Pi

Introduction When we talk about large language models (LLMs), we are actually referring to a type of advanced software that can communicate in a human-like manner. These models have the amazing ability to understand complex contexts and generate content that is coherent and has a human feel.

Natural-language-processiTransformersArtificial-intelligenceInformatique et sciences de l'informationAnglais

Transformers Models in NLP

https://doi.org/10.59350/c7nrg-xay43

Publié 7 mai 2024

Auteur Dhruv Gupta

Attention mechanism not getting enough attention Author Dhruv Gupta ( ORCID : 0009–0004–7109–5403) Introduction As discussed in this article, RNNs were incapable of learning long-term dependencies. To solve this issue both LSTMs and GRUs were introduced. However, even though LSTMs and GRUs did a fairly decent job for textual data they did not perform well.

Artificial IntelligenceTocInformatique et sciences de l'informationAnglais

Three Paradigms of RAG

https://doi.org/10.59350/5j7tt-5y328

Publié 6 mai 2024

Auteur Vaibhav Khobragade

Introduction Large Language Models (LLMs) have achieved remarkable success. But, they still face significant limitations, especially in domain-specific or knowledge-intensive tasks such as question answering, producing “hallucinations” where the models generate responses that sound plausible but are actually incorrect when handling queries beyond their training data or requiring current information.

Artificial IntelligenceTocInformatique et sciences de l'informationAnglais

Fine-tuning Large Language Models: A Brief Introduction

https://doi.org/10.59350/1aezq-kk827

Publié 6 mai 2024

Auteur Xuzeng He

Introduction Large Language Models (LLMs), usually trained with extensive text data, can demonstrate remarkable capabilities in handling various tasks with state-of-the-art performance. However, people nowadays typically want something more personalised instead of a general solution. For example, one may want LLMs to assist in code writing while the other may seek models that are specialised in medical knowledge.

Artificial IntelligenceTocInformatique et sciences de l'informationAnglais

Are Large Language Models Our Allies or Enemies in the Fight Against Fake News?

https://doi.org/10.59350/st0jr-ad818

Publié 6 mai 2024

Auteur Amanda Kau

Introduction In recent years, fake news has become an increasing concern for many, and for good reason. Newspapers, which we once trusted to deliver credible news through accountable journalists, are vanishing en masse along with their writers. Updates about events and happenings around the world spread faster through social media than journalists can report, and the Internet’s nature is that anyone can post anything.

NaturallanguageprocessingLstmArtificial-intelligenceRecurrent-neural-networkInformatique et sciences de l'informationAnglais

RNNs vs GRUs vs LSTMs

https://doi.org/10.59350/t6mga-7zd77

Publié 30 avril 2024

Auteur Dhruv Gupta

The Three Oldest Pillars of NLP Author Dhruv Gupta ( ORCID : 0009–0004–7109–5403) Introduction Natural Language Processing (NLP) has almost become synonymous with Large Language Models (LLMs), Generative AI, and fancy chatbots. With the ever-increasing amount of textual data and exponential growth in computational knowledge, these models are improving every day.

Research Graph

Prompt engineering: A Way to Smartly Use AI

How to use Large Language Models to tag your data: A complete tutorial

Automated Knowledge Graph Construction with Large Language Models — Part 2

The longer the context, the better? Unlimited Context Length in Megalodon

Brief Introduction to the History of Large Language Models (LLMs)

Brief Introduction to the History of Large Language Models (LLMs)

Transformers Models in NLP

Three Paradigms of RAG

Fine-tuning Large Language Models: A Brief Introduction

Are Large Language Models Our Allies or Enemies in the Fight Against Fake News?

RNNs vs GRUs vs LSTMs