
Answering Arabic Questions Using the AraGPT2 Transformer

Question-answering (QA) models are natural language processing (NLP) systems that answer questions posed in natural language. They are trained on large text corpora and use that data to understand the meaning of a question and find the best answer. QA systems are widely used in applications such as customer service, knowledge management, and education, where they provide quick and accurate answers by combining techniques from natural language processing, information retrieval, and machine learning. Transformers are a popular choice for QA thanks to their contextual embeddings, pre-training, attention mechanisms, and transfer learning, and they have demonstrated state-of-the-art accuracy and speed on a wide range of QA tasks.

Generative QA models, which produce answer text directly rather than extracting a span, are a powerful approach to QA: they can answer questions whose answers are not explicitly stated in the context, which makes them well suited to open-domain tasks.


AraGPT2

@inproceedings{antoun-etal-2021-aragpt2,
    title = "{A}ra{GPT}2: Pre-Trained Transformer for {A}rabic Language Generation",
    author = "Antoun, Wissam and Baly, Fady and Hajj, Hazem",
    booktitle = "Proceedings of the Sixth Arabic Natural Language Processing Workshop",
    month = apr,
    year = "2021",
    address = "Kyiv, Ukraine (Virtual)",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.wanlp-1.21",
    pages = "196--207",
}

Testing the model using transformers:

from transformers import GPT2TokenizerFast, pipeline
from arabert.preprocess import ArabertPreprocessor

# For the base and medium models:
from transformers import GPT2LMHeadModel
# For the large and mega models (requires `pip install arabert`), use instead:
# from arabert.aragpt2.grover.modeling_gpt2 import GPT2LMHeadModel

MODEL_NAME = 'IBB-University/Question_answer_model'
arabert_prep = ArabertPreprocessor(model_name=MODEL_NAME)

text = ""  # put the question here
text_clean = arabert_prep.preprocess(text)

model = GPT2LMHeadModel.from_pretrained(MODEL_NAME)
tokenizer = GPT2TokenizerFast.from_pretrained(MODEL_NAME)
generation_pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Feel free to try different decoding settings
generation_pipeline(text_clean,
                    pad_token_id=tokenizer.eos_token_id,
                    max_length=512,
                    penalty_alpha=0.6,
                    top_k=4)[0]['generated_text']
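The settings above use contrastive search (`penalty_alpha` with a small `top_k`). As a starting point for trying other decoding strategies, here is a hypothetical set of nucleus-sampling parameters (the name `sampling_kwargs` and all values are illustrative, not from this model card) that can be passed to the same pipeline:

```python
# Hypothetical nucleus-sampling settings (illustrative values, not from the card)
sampling_kwargs = dict(
    do_sample=True,          # sample instead of contrastive search
    top_p=0.95,              # nucleus sampling: smallest token set covering 95% probability mass
    temperature=0.8,         # slightly sharpen the distribution
    repetition_penalty=1.2,  # discourage repeated phrases
    max_length=512,
)

# Usage with the pipeline defined above (uncomment to run):
# generation_pipeline(text_clean, pad_token_id=tokenizer.eos_token_id, **sampling_kwargs)
```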

Evaluating the quality of the generated answers:


Training setup: loss = 0.5, learning rate = 5e-5, epochs = 1

| Metric | Value |
|---|---|
| BERTScore | 0.736 |
| Levenshtein distance | 0.226 |
| BLEU | 0.0099 |
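The normalized Levenshtein distance reported above can be computed with a short pure-Python routine. This is a minimal sketch (the example strings are hypothetical); BERTScore and BLEU require external packages such as `bert-score` and `evaluate`, so they are not shown here:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def normalized_levenshtein(ref: str, hyp: str) -> float:
    """Edit distance divided by the longer string's length (0 = identical)."""
    if not ref and not hyp:
        return 0.0
    return levenshtein(ref, hyp) / max(len(ref), len(hyp))
```

A lower score is better: identical reference and generated answers score 0.0, completely different ones approach 1.0.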
